l x yi lxyiwc at yahoo.com
Fri Aug 27 12:44:03 EDT 2004

I was reading the paper "Maximum Likelihood
Estimation of the Statistical Distribution of
Smith-Waterman Local Sequence Similarity Scores"
by R. Mott, published in Bulletin of Mathematical
Biology, Vol. 54, No. 1, pp. 59-75, and the paper
"Sequence Comparison Significance and Poisson
Approximation" by M.S. Waterman et al., published in
Statistical Science, Vol. 9, 1994, pp. 367-381. These
articles discuss the ML method for estimating the
parameters of the score distributions obtained by
searching a databank with a random sequence.

The sequences in the databank can be assumed to be
independent of each other, but if we use the same
sequence as the query every time, wouldn't the
scores obtained be dependent? If so, shouldn't the
likelihood be modified to account for this
dependency? Both of these papers use a likelihood
that assumes all the scores obtained are
independent. Am I missing something here? Could
someone please help me understand? Thanks very much.
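
To make the question concrete, here is a minimal sketch of the kind of
independence-based likelihood I have in mind. It assumes, purely for
illustration, that the scores follow a Gumbel (extreme value)
distribution with location mu and scale beta; the function names
gumbel_loglik and fit_gumbel are my own and this is not the exact
procedure of either paper. The log-likelihood is just a sum over
scores, which is precisely where independence is assumed.

```python
import math


def gumbel_loglik(scores, mu, beta):
    """Log-likelihood of the scores under Gumbel(mu, beta).

    Summing the per-score log-densities is valid only if the scores
    are treated as independent, which is the assumption in question.
    """
    total = 0.0
    for s in scores:
        z = (s - mu) / beta
        total += -math.log(beta) - z - math.exp(-z)
    return total


def fit_gumbel(scores):
    """Maximum likelihood estimates (mu, beta) under independence.

    Solves the standard Gumbel likelihood equation for beta by
    bisection, then computes mu in closed form.
    """
    n = len(scores)
    s_min, s_max = min(scores), max(scores)
    if s_max == s_min:
        raise ValueError("scores must not all be identical")
    mean = sum(scores) / n

    def g(beta):
        # Shift by s_min so the exponentials never overflow.
        w = [math.exp(-(s - s_min) / beta) for s in scores]
        weighted_mean = sum(s * wi for s, wi in zip(scores, w)) / sum(w)
        return beta - mean + weighted_mean

    # g is increasing in beta, negative near 0 and positive at the range.
    lo, hi = (s_max - s_min) * 1e-6, s_max - s_min
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    beta = 0.5 * (lo + hi)

    mean_w = sum(math.exp(-(s - s_min) / beta) for s in scores) / n
    mu = s_min - beta * math.log(mean_w)
    return mu, beta
```

If the scores from a single query are in fact dependent, the sum in
gumbel_loglik is no longer the true joint log-likelihood, and that is
the part I am unsure about.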


