On Fri, 1 Apr 2005, Goel, Manisha wrote:

>Hi All,
>
>I need to parse the blast output to get the taxonomy information.
>If I could get the taxonomy nodes associated with each gi number .. This
>would also work.

Yeah, this data is here...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/

See...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/gi_taxid.readme

"The gi_taxid_prot.dmp is about 17 MB and contains two columns: the
protein's gi and taxid."

You can then use the 'taxdump' to get the names.dmp (for the names) and
nodes.dmp (for the structure of the taxonomic tree) files (if you need
them). See...

ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump_readme.txt

All the best,
Dan.

>I have been trying SEALS taxonomy commands but somehow quite a few
>sequences turn up "not_retrieved", although we have tried updating the
>database etc.
>I do not want to use the BLAST web server because I have too many files
>to run.
>Please suggest any program/script that might be useful.
>
>Thanks,
>-Manisha
>
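
For what it's worth, the lookup described above (gi -> taxid from
gi_taxid_prot.dmp, then names and tree from the taxdump files) could be
sketched roughly as below. This is only an illustration, not a tested
pipeline: the function names are made up for this example, it assumes the
BLAST hit lines carry identifiers of the form "gi|12345|...", that
gi_taxid_prot.dmp has two whitespace-separated columns as the readme says,
and that names.dmp and nodes.dmp use the "|"-delimited layout described in
taxdump_readme.txt.

```python
import re

# Assumption: hit identifiers in the BLAST output look like "gi|12345|ref|..."
GI_PATTERN = re.compile(r"gi\|(\d+)\|")


def extract_gis(text):
    """Return every gi number found in a chunk of BLAST output, in order."""
    return [int(m) for m in GI_PATTERN.findall(text)]


def load_gi_to_taxid(handle, wanted_gis):
    """Scan gi_taxid_prot.dmp (columns: gi, taxid) and keep only wanted GIs."""
    wanted = set(wanted_gis)
    mapping = {}
    for line in handle:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        gi, taxid = int(parts[0]), int(parts[1])
        if gi in wanted:
            mapping[gi] = taxid
    return mapping


def split_dmp(line):
    """Split one taxdump row; fields are separated by '|' padded with tabs."""
    return [field.strip() for field in line.rstrip("\n").rstrip("|").split("|")]


def load_names(handle):
    """Return {taxid: scientific name} from names.dmp."""
    names = {}
    for line in handle:
        fields = split_dmp(line)
        # fields: tax_id, name_txt, unique name, name class
        if len(fields) >= 4 and fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


def load_nodes(handle):
    """Return {taxid: (parent taxid, rank)} from nodes.dmp."""
    nodes = {}
    for line in handle:
        fields = split_dmp(line)
        if len(fields) >= 3:
            nodes[int(fields[0])] = (int(fields[1]), fields[2])
    return nodes


def lineage(taxid, nodes, names):
    """Walk parent links up to the root and return names from root to taxid."""
    path = []
    seen = set()
    while taxid in nodes and taxid not in seen:
        seen.add(taxid)  # the root is its own parent, so this ends the walk
        path.append(names.get(taxid, str(taxid)))
        taxid = nodes[taxid][0]
    return path[::-1]
```

In use, you would collect the GIs from all your BLAST files first, make one
pass over gi_taxid_prot.dmp with load_gi_to_taxid() (so the whole file never
has to sit in memory), and then call lineage() for each taxid you got back.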