[BiO BB] Re: [ssml] Parsing taxonomy from blast output
idonalds at blueprint.org
Fri Apr 1 15:11:35 EST 2005
I should also mention that you can retrieve this information using the
SeqHound remote Perl API (or Java/C/C++).
No need to use up disk space or wait for downloads.
The call is SHoundTaxIDFromGi described here:
You can download the API from here:
and follow the enclosed instructions to get started or look at the first few
pages of the SeqHound Manual
Taxid assignments to GIs are updated daily as part of the core module.
Other API calls can also provide you with the names of taxa.
bio_bulletin_board-bounces+idonalds=blueprint.org at bioinformatics.org
[mailto:bio_bulletin_board-bounces+idonalds=blueprint.org at bioinformatics.org]
On Behalf Of Dan Bolser
Sent: April 1, 2005 12:40 PM
To: Goel, Manisha
Cc: ssml-general at bioinformatics.org;
bio_bulletin_board at bioinformatics.org; pdb-l at sdsc.edu
Subject: [BiO BB] Re: [ssml] Parsing taxonomy from blast output
On Fri, 1 Apr 2005, Goel, Manisha wrote:
>I need to parse the blast output to get the taxonomy information.
>If I could get the taxonomy nodes associated with each gi number, this
>would also work.
Yeah, this data is here...
"The gi_taxid_prot.dmp is about 17 MB and contains two columns: the
protein's gi and taxid."
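As a rough sketch of how that two-column file could be used: the Python
below loads gi-to-taxid pairs and pulls gi numbers out of BLAST hit
identifiers. It assumes the columns are whitespace-separated and that hit
identifiers look like "gi|12345|ref|...". The function names and the sample
numbers are made up for illustration, not taken from any NCBI tool.

```python
import re

# Assumption: BLAST hit identifiers carry the GI as "gi|<digits>|...".
GI_PATTERN = re.compile(r"gi\|(\d+)")


def gis_from_blast_ids(hit_ids):
    """Collect the GI numbers found in an iterable of BLAST hit identifiers."""
    gis = set()
    for hit_id in hit_ids:
        match = GI_PATTERN.search(hit_id)
        if match:
            gis.add(int(match.group(1)))
    return gis


def load_gi_to_taxid(lines, wanted=None):
    """Build a {gi: taxid} dict from two-column lines.

    If `wanted` is given, only those GIs are kept, which keeps memory use
    small when the dump file is large.
    """
    mapping = {}
    for line in lines:
        fields = line.split()
        # Skip anything that is not exactly two numeric columns.
        if len(fields) != 2 or not all(f.isdigit() for f in fields):
            continue
        gi, taxid = int(fields[0]), int(fields[1])
        if wanted is None or gi in wanted:
            mapping[gi] = taxid
    return mapping


if __name__ == "__main__":
    # Made-up sample data standing in for gi_taxid_prot.dmp and BLAST hits.
    sample_dump = ["101\t9606\n", "102\t10090\n", "103\t562\n"]
    hits = ["gi|101|ref|NP_000001.1|", "gi|103|gb|AAA00001.1|"]
    wanted = gis_from_blast_ids(hits)
    for gi, taxid in sorted(load_gi_to_taxid(sample_dump, wanted).items()):
        print(gi, taxid)
```

With the real file you would pass an open file handle instead of the sample
list, e.g. load_gi_to_taxid(open("gi_taxid_prot.dmp"), wanted).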
You can then use the 'taxdump' to get the names.dmp (for the names) and
nodes.dmp (for the structure of the taxonomic tree) files (if you need
All the best,
>I have been trying SEALS taxonomy commands but somehow quite a few
>sequences turn up "not_retrieved", although we have tried updating the
>I do not want to use the BLAST web server because I have too many files
>Please suggest any program/script that might be useful.