[Biodevelopers] Blast not symmetrical?

Malay mbasu at mail.nih.gov
Thu Jan 18 12:15:44 EST 2007


Michael Nuhn wrote:
> Hello, Everybody!
> 
> While I was trying to track down a "bug" in my program I found out that the
> blast program (Blastn v2.2.11) is not symmetrical, that is:
> 
> If I blast a query sequence Q against a database S (1 sequence), I get a
> result set B(S,Q).
> 
> If I do the blast the other way around, that is, I use S as query sequence
> and blast it against the database Q, I get a result B(Q,S).
> 
> And the problem is: B(S,Q) and B(Q,S) are not equal. Each blast set has some
> blast hits that the other does not have and also some blast hits that have
> one common coordinate but end at another.
> 
> Both blasts were made with the blast defaults, no filter was used. The two
> sequences are large (~2Mb each, the sequences are genomes). According to the
> statistics used in blast (at least the part I understand), it should not
> play a role which sequence is the query and which is the subject.
> 
> Does anyone have an explanation for this? Since I don't really have a clue
> at where to start, hints and wild guesses are also appreciated.
> 
> Thanks in advance,
> Michael.


This is a very well known phenomenon. You can read more about it here:

http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html

This seems counter-intuitive because people confuse the BLAST score
(a bit score) with the raw alignment score. The raw alignment score
will be the same for both alignments; it doesn't matter which sequence
is the query. But the BLAST bit score is then calculated from the raw
alignment score, taking into account the background distribution of
the amino acids in the query sequence. Because the two query sequences
differ in composition, the bit scores will differ (and so will the
E-value, which is calculated from the bit score). I guess that's an
over-simplified explanation.
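
To make the raw-score versus bit-score distinction concrete, here is a
minimal Python sketch of the normalization described in the tutorial
linked above: S' = (lambda*S - ln K) / ln 2 and E = m*n*2^(-S'). The
function names and the two (lambda, K) pairs are illustrative
assumptions, not values taken from Michael's runs; the only point is
that one raw score maps to different bit scores and E-values when the
statistical parameters differ.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance hits: E = m * n * 2 ** (-S')."""
    return query_len * db_len * 2.0 ** (-bits)


# Hypothetical inputs, for illustration only.
raw = 100                 # the same raw alignment score in both directions
m = n = 2_000_000         # roughly the ~2 Mb genomes from the question
params = {
    "Q as query": (1.37, 0.711),  # assumed (lambda, K)
    "S as query": (1.28, 0.460),  # assumed (lambda, K), slightly different
}

for label, (lam, k) in params.items():
    bits = bit_score(raw, lam, k)
    print(f"{label}: bit score = {bits:.1f}, E-value = {e_value(bits, m, n):.3g}")
```

The real lambda and K for a given search are printed in the statistics
block at the end of the BLAST report, so the two runs can be compared
directly there.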


-- 
Malay K Basu
www.malaybasu.net

