[BiO BB] All-against-all protein sequence comparison

Ian Donaldson idonalds at blueprint.org
Fri Dec 17 09:37:46 EST 2004


Dear Anne

A precomputed all-against-all BLAST comparison of the proteins in NCBI's nr
database is available at

ftp://ftp.blueprint.org/pub/SeqHound/Data/NBLAST/

These results are also available through a remote API (Perl, Java, C, and C++).

If this meets your needs, see
http://www.blueprint.org/seqhound/seqhound_documentation.html
for how to get started with the API.

Best regards

Ian

-----Original Message-----
From: bio_bulletin_board-bounces at bioinformatics.org
[mailto:bio_bulletin_board-bounces at bioinformatics.org]On Behalf Of Iddo
Friedberg
Sent: December 16, 2004 4:47 PM
To: The general forum at Bioinformatics.Org
Subject: Re: [BiO BB] All-against-all protein sequence comparison



Use the NCBI toolkit and write a script around bl2seq for the all-vs-all
comparison.

If the genomes are really large, I would first cluster each genome
at 90% sequence identity with CD-HIT to remove redundant sequences.

I wouldn't use one genome as the database and the other as the query
pool, because that would skew your BLAST statistics and give you
false-positive hits. I would go with the all-vs-all pairwise BLAST.
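
A minimal sketch of such a wrapper script follows, in Python. It is only
one way to do it, under stated assumptions: the legacy NCBI toolkit
bl2seq binary is on the PATH, each genome is a protein FASTA file, and
"all-vs-all" is read as every protein of genome A against every protein
of genome B. The function names and file names are made up for
illustration.

```python
import itertools
import os
import subprocess
import tempfile


def read_fasta(path):
    """Return a list of (identifier, sequence) tuples from a FASTA file."""
    records, name, chunks = [], None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records.append((name, "".join(chunks)))
                fields = line[1:].split()
                name = fields[0] if fields else ""
                chunks = []
            elif line and name is not None:
                chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_single_fasta(record, directory):
    """Write one record to its own FASTA file; bl2seq takes one sequence per file."""
    name, sequence = record
    fd, path = tempfile.mkstemp(suffix=".fasta", dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(">%s\n%s\n" % (name, sequence))
    return path


def bl2seq_command(query_path, subject_path, evalue=10.0):
    """Build a bl2seq command line (blastp, tabular output via -D 1)."""
    return ["bl2seq", "-p", "blastp", "-i", query_path, "-j", subject_path,
            "-e", str(evalue), "-D", "1"]


def run_all_vs_all(fasta_a, fasta_b, evalue=10.0):
    """Run bl2seq on every A-vs-B pair; return (id_a, id_b, raw output) tuples."""
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        a = [(rec[0], write_single_fasta(rec, tmp)) for rec in read_fasta(fasta_a)]
        b = [(rec[0], write_single_fasta(rec, tmp)) for rec in read_fasta(fasta_b)]
        for (id_a, path_a), (id_b, path_b) in itertools.product(a, b):
            proc = subprocess.run(bl2seq_command(path_a, path_b, evalue),
                                  capture_output=True, text=True, check=True)
            results.append((id_a, id_b, proc.stdout))
    return results
```

Calling run_all_vs_all("genomeA.fasta", "genomeB.fasta") (hypothetical
file names) launches one bl2seq process per pair, so the run time grows
with the product of the two genome sizes; clustering first reduces that.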

./I


Dr. Christoph Gille wrote:

>The NCBI toolkit works well.
>I can loop over all proteins in one genome
>and run BLAST against the other.
>
>
>
>
>>Hi, All
>>
>>
>>I have been trying to obtain BLAST E-values for an all-against-all
>>comparison of the protein sequences of two genomes. Is there a tool or
>>script for this task? Any suggestions would be helpful.
>>
>>Thanks,
>>
>>
>>Anne
>>_______________________________________________
>>BiO_Bulletin_Board maillist  -  BiO_Bulletin_Board at bioinformatics.org
>>https://bioinformatics.org/mailman/listinfo/bio_bulletin_board


--

Iddo Friedberg, Ph.D.
The Burnham Institute
10901 N. Torrey Pines Rd.
La Jolla, CA 92037
Tel: (858) 646 3100 x3516
Fax: (858) 713 9930
http://ffas.ljcrf.edu/~iddo
