[BiO BB] All-again-all protein sequence comparison
idonalds at blueprint.org
Fri Dec 17 09:37:46 EST 2004
A pre-computed all-against-all BLAST of the proteins in NCBI's nr
database is available at
These results are also available via a remote API (in Perl/Java/C/C++).
You can read http://www.blueprint.org/seqhound/seqhound_documentation.html
for how to get started with this API if it meets your needs.
From: bio_bulletin_board-bounces at bioinformatics.org
[mailto:bio_bulletin_board-bounces at bioinformatics.org]On Behalf Of Iddo
Sent: December 16, 2004 4:47 PM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] All-again-all protein sequence comparison
Use the NCBI toolkit and write a script around bl2seq for the all-vs-all.
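
A rough sketch of what such a wrapper script might look like, in Python.
It assumes each protein has already been split into its own FASTA file
(the directory layout and the output file naming are made up for
illustration), and that the legacy bl2seq binary is on the PATH. The
flags used (-p, -i, -j, -e, -o) are the usual legacy bl2seq options, but
please check them against the help output of your installed version. In
the newer BLAST+ suite, bl2seq is replaced by blastp with -query and
-subject.

```python
import itertools
import subprocess
from pathlib import Path


def bl2seq_command(query: Path, subject: Path, out: Path, evalue: float = 10.0) -> list[str]:
    """Build a legacy bl2seq command comparing two single-protein FASTA files."""
    return [
        "bl2seq",
        "-p", "blastp",       # protein vs. protein
        "-i", str(query),     # first sequence
        "-j", str(subject),   # second sequence
        "-e", str(evalue),    # expectation value cutoff
        "-o", str(out),       # report file
    ]


def all_vs_all(genome_a: list[Path], genome_b: list[Path], out_dir: Path):
    """Yield one bl2seq command per (protein in A, protein in B) pair."""
    for a, b in itertools.product(genome_a, genome_b):
        # Hypothetical naming scheme: <proteinA>__<proteinB>.bl2seq
        yield bl2seq_command(a, b, out_dir / f"{a.stem}__{b.stem}.bl2seq")


def run_all(genome_a_dir: str, genome_b_dir: str, out_dir: str) -> None:
    """Run every pairwise comparison; stops at the first failing bl2seq call."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    a_files = sorted(Path(genome_a_dir).glob("*.fasta"))
    b_files = sorted(Path(genome_b_dir).glob("*.fasta"))
    for cmd in all_vs_all(a_files, b_files, out):
        subprocess.run(cmd, check=True)
```

Calling run_all("genomeA", "genomeB", "bl2seq_out") would then launch one
bl2seq process per pair. The number of runs is the product of the two
genome sizes, so this is only practical for small inputs.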
If the genomes are really large, I would try and cluster each genome
first at 90% Sequence ID, to remove redundancies, using CD-HIT.
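
For the clustering step, something along these lines could build the
CD-HIT call. This is only a sketch: the file names are placeholders, and
it assumes the cd-hit executable is on the PATH. The -c option is the
sequence identity threshold and -n the word length (5 is the value the
CD-HIT guide recommends for thresholds between 0.7 and 1.0).

```python
import subprocess


def cd_hit_command(fasta_in: str, fasta_out: str, identity: float = 0.9) -> list[str]:
    """Build a cd-hit command that clusters proteins at the given identity."""
    if not 0.7 <= identity <= 1.0:
        # Word length 5 is only appropriate for thresholds in this range.
        raise ValueError("identity must be between 0.7 and 1.0 for -n 5")
    return [
        "cd-hit",
        "-i", fasta_in,        # input FASTA (one genome's proteins)
        "-o", fasta_out,       # representative sequences
        "-c", str(identity),   # sequence identity threshold
        "-n", "5",             # word length
    ]


def cluster_genome(fasta_in: str, fasta_out: str, identity: float = 0.9) -> None:
    """Run cd-hit on one genome; raises CalledProcessError if it fails."""
    subprocess.run(cd_hit_command(fasta_in, fasta_out, identity), check=True)
```

Running cluster_genome once per genome leaves a reduced FASTA file for
each, and those files are what would go into the all-vs-all step.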
I wouldn't go with the strategy of having one genome as a database, and
another as a query pool, because that would skew your BLAST statistics
to give you false-positive hits. I would go with the all-vs-all pairwise
comparison.
Dr. Christoph Gille wrote:
>the ncbi toolkit works well.
>I can loop over all proteins in one genome
>and run blast against the other.
>>I have been working on obtaining the BLAST e-score for all-against-all
>>protein sequences of two genomes. Is there a tool or script for this
>>function? Any suggestions will be helpful.
>>BiO_Bulletin_Board maillist - BiO_Bulletin_Board at bioinformatics.org
Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930