[BiO BB] How to calculate the value of K and Lambda for two sequence alignment

Maximilian Haeussler maximilianh at gmail.com
Sun Feb 12 07:50:53 EST 2006


I have no clue :-) but I typed "smith waterman k lamda" into google
and clicked on "I'm feeling lucky":

I got the NCBI page where BLAST is explained:
http://www.people.virginia.edu/~wrp/cshl02/Altschul/Altschul-3.html

---------------
K and lambda are statistical parameters dependent upon the scoring
system and the background amino acid frequencies of the sequences being
compared. While FASTA estimates these parameters from the scores
generated by actual database searches, BLAST estimates them beforehand
for specific scoring schemes by comparing many random sequences
generated using a standard protein amino acid composition [12].
   For example, using BLOSUM-62 amino acid substitution scores [13],
and affine gap costs [14-16] in which a gap of length k is assigned a
score of -(10 + k), we generated 10,000 pairs of length-1000 random
protein sequences, and used the Smith-Waterman algorithm to calculate
10,000 optimal local alignment scores. From these scores, lambda was
estimated at 0.252 and K at 0.035 by the method of maximum-likelihood
[17]. In general, given M samples from an extreme value distribution,
the ratio of the maximum-likelihood estimate of lambda to its actual
value is approximately normally distributed, with mean 1.0 and
standard deviation 0.78/sqrt(M) [17]. Thus the standard error for our
estimate of lambda is about 0.002, or less than 1%.
---------------
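
The arithmetic in the last sentence of that excerpt can be checked with a few lines of Python. This is only a sketch: the function name is made up for illustration, while the 0.78 coefficient, M = 10,000 and lambda = 0.252 are taken from the quoted text.

```python
import math


def lambda_standard_error(lam_hat, num_samples, coeff=0.78):
    """Return (relative, absolute) standard error of a lambda estimate.

    Uses the rule quoted above: the ratio estimate/true value has a
    standard deviation of about coeff / sqrt(M).
    """
    relative = coeff / math.sqrt(num_samples)
    return relative, relative * lam_hat


relative, absolute = lambda_standard_error(0.252, 10_000)
print(f"relative error: {relative:.2%}, absolute error: {absolute:.4f}")
```

This gives a relative error of 0.78% and an absolute error of roughly 0.002, matching the "about 0.002, or less than 1%" in the excerpt.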

From what I understood, you generate many alignments of random
sequences, collect the scores for the current matrix, assume that they
follow the extreme value distribution behind your formula for E, and
then fit lambda (and K) to them.
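
To make the fitting step concrete, here is a minimal Python sketch. It is not the code BLAST or FASTA use. It assumes the scores follow P(S >= x) = 1 - exp(-K m n e^(-lambda x)), i.e. a Gumbel distribution with location mu = ln(K m n)/lambda, and finds the maximum-likelihood lambda by bisection. All function names are hypothetical, and the "scores" are synthetic Gumbel draws using the lambda and K quoted above, standing in for real Smith-Waterman scores of random sequence pairs.

```python
import math
import random


def draw_gumbel_scores(lam, K, m, n, count, rng):
    """Synthetic stand-in for alignment scores (assumption, not real data)."""
    mu = math.log(K * m * n) / lam
    # If E ~ Exp(1), then mu - ln(E)/lam follows the Gumbel distribution.
    return [mu - math.log(rng.expovariate(1.0)) / lam for _ in range(count)]


def fit_lambda_K(scores, m, n, iterations=60):
    """Maximum-likelihood fit of lambda and K to a list of scores."""
    x_min, x_max = min(scores), max(scores)
    if x_min == x_max:
        raise ValueError("scores must not all be identical")
    mean_x = sum(scores) / len(scores)

    def weights(lam):
        # Shift by x_min so every exponent is <= 0 (no overflow).
        return [math.exp(-lam * (x - x_min)) for x in scores]

    def g(lam):
        # Likelihood equation for lambda; strictly decreasing, root = MLE.
        w = weights(lam)
        weighted_mean = sum(wi * x for wi, x in zip(w, scores)) / sum(w)
        return 1.0 / lam - mean_x + weighted_mean

    lo, hi = 1e-9, 1.0
    while g(hi) > 0:          # widen the bracket until the sign changes
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    mu = x_min - math.log(sum(weights(lam)) / len(scores)) / lam
    K = math.exp(lam * mu) / (m * n)
    return lam, K


def e_value(K, lam, m, n, x):
    """Expected number of chance hits with score >= x: E = K m n e^(-lambda x)."""
    return K * m * n * math.exp(-lam * x)


if __name__ == "__main__":
    rng = random.Random(0)
    m = n = 1000
    scores = draw_gumbel_scores(0.252, 0.035, m, n, 10_000, rng)
    lam, K = fit_lambda_K(scores, m, n)
    print(f"lambda ~ {lam:.3f}, K ~ {K:.3f}")
    print(f"E for a score of 60: {e_value(K, lam, m, n, 60):.3g}")
```

With real data you would replace draw_gumbel_scores by the optimal local alignment scores of many random sequence pairs of lengths m and n; the fit and the E calculation stay the same.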

The O'Reilly BLAST book goes into detail, gives an example Perl
program to calculate lambda, and says that the value of K doesn't
really matter (this sample chapter is free):
http://www.oreilly.com/catalog/blast/chapter/ch04.pdf
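
The book's Perl program is not reproduced here. As a separate, hedged Python sketch: for ungapped alignments, lambda is the unique positive solution of sum_ij p_i p_j e^(lambda s_ij) = 1, which can be solved numerically. The function name, the +1/-1 nucleotide scoring and the uniform base frequencies below are illustrative assumptions, not values from the book. With gap costs there is no such closed equation, which is why the excerpt above estimates the parameters from random sequences instead.

```python
import math


def ungapped_lambda(freqs, score, iterations=80):
    """Solve sum_ij p_i p_j exp(lambda * s_ij) = 1 for lambda > 0 by bisection.

    freqs: dict mapping each letter to its background frequency.
    score: function (a, b) -> substitution score.
    """
    letters = list(freqs)
    pairs = [(freqs[a] * freqs[b], score(a, b)) for a in letters for b in letters]

    expected = sum(p * s for p, s in pairs)
    if expected >= 0 or not any(s > 0 for _, s in pairs):
        raise ValueError("need a negative expected score and a positive score")

    def f(lam):
        return sum(p * math.exp(lam * s) for p, s in pairs) - 1.0

    # f is negative just above 0 and eventually positive: bracket the root.
    lo, hi = 0.0, 1.0
    while f(hi) <= 0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    uniform = {base: 0.25 for base in "ACGT"}
    lam = ungapped_lambda(uniform, lambda a, b: 1 if a == b else -1)
    print(f"lambda = {lam:.4f}")
```

For this toy scoring the equation reduces to 0.25 e^lambda + 0.75 e^(-lambda) = 1, whose positive solution is lambda = ln 3.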

Don't know if it helped or if it is completely wrong; it took 10
minutes and I found it interesting... :-)

Max


On 11/02/06, Ryan Golhar <golharam at umdnj.edu> wrote:
> Did anyone ever respond to you on this?  K and lambda.  I forget where K
> comes from.  Lambda is dependent on the scoring matrix you are using.  I
> believe it is given with the matrix.  BLOSUM uses 0.347.
>
> Ryan
>
>
> -----Original Message-----
> From:
> bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org
> [mailto:bio_bulletin_board-bounces+golharam=umdnj.edu at bioinformatics.org]
> On Behalf Of shohag md
> Sent: Friday, February 03, 2006 6:26 AM
> To: bio_bulletin_board at bioinformatics.org
> Subject: [BiO BB] How to calculate the value of K and Lambda for two
> sequence alignment
>
>
>
> Hi Everybody
>
>
>
> Using the Smith-Waterman algorithm I want to align two sequences. After that I
> want to calculate the expectation value. For calculating the expectation
> value we know that
>
>
>
> E = K m n e^{-\lambda x}
>
>
>
> But how can I calculate the values of K and lambda?
>
>
> Is there any formula that can help me to calculate the values of K and lambda,
> and then the expectation value?
>
> Thanking all in advance
>
> Shoyaib


--
Maximilian Haeussler,
CNRS Gif-sur-Yvette, Paris
tel: +33 6 12 82 76 16
icq: 3825815  -- msn: maximilian.haeussler at hpi.uni-potsdam.de
skype: maximilianhaeussler


