[BiO BB] Constructing a multiple aligner, similar to Smith-Waterman

Tue Feb 14 03:20:25 EST 2006

> That is, the trace has multiple highest cells within it, two 13's. Do
> we use the shorter back trace?

There is nothing special about either score. You should choose the first
or last one you see (whichever is easiest). Usually there is very little
difference between them, although I have seen perfect repeats in proteins
which would each align with a single repeat in another sequence :-)
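
As a rough illustration of "first or last one you see", here is a small
Python sketch. The function name max_cells and the toy matrix are made up
for this example; the matrix is not taken from your trace, it just
contains two 13's.

```python
def max_cells(H):
    """Return the highest score in H and every (row, col) holding it,
    in row-by-row scan order."""
    best = max(max(row) for row in H)
    cells = [(i, j)
             for i, row in enumerate(H)
             for j, value in enumerate(row)
             if value == best]
    return best, cells


# Hypothetical score matrix with a tied maximum (two 13's).
H = [
    [0, 0, 0, 0],
    [0, 5, 13, 2],
    [0, 3, 7, 13],
]

best, cells = max_cells(H)
first_cell = cells[0]    # first maximum met while scanning
last_cell = cells[-1]    # last maximum met while scanning
```

Either first_cell or last_cell is a valid place to start the back trace;
the only thing that matters is applying the same rule every time.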

> OK, so let's assume we got a single alignment done. How do we then
> find the other sections to align? Would a process of elimination do
> it? That is, search the remaining matrix (Excluding the portion
> already aligned) for the highest cell?

Yes, that's how Smith-Waterman-Eggert works. Check the original paper for
the rules (if memory serves, zero all cells that contributed to the
previous alignment and then look for the highest remaining score).
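
The idea can be sketched in Python roughly as follows. This is not taken
from the paper, only from the from-memory description above, so treat it
as an assumption-laden sketch: the scoring values (match=2, mismatch=-1,
linear gap=-1) and the function names are my own choices, and for
simplicity it refills the whole matrix each round with the used cells
forced to zero, which is slower than a careful implementation needs to be.

```python
def fill(a, b, blocked, match=2, mismatch=-1, gap=-1):
    """Smith-Waterman score matrix; cells in `blocked` are forced to zero."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if (i, j) in blocked:
                continue  # cell already used by an earlier alignment
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,     # gap in b
                          H[i][j - 1] + gap)     # gap in a
    return H


def traceback(H, a, b, i, j, match=2, mismatch=-1, gap=-1):
    """Walk back from (i, j) until a zero cell; return the path of cells."""
    path = []
    while H[i][j] > 0:
        path.append((i, j))
        s = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return path[::-1]


def local_alignments(a, b, n):
    """Return up to n (score, path) pairs that share no matrix cells."""
    blocked = set()
    found = []
    for _ in range(n):
        H = fill(a, b, blocked)
        # max() keeps the first of several equal scores (row-by-row order).
        score, i, j = max(
            ((H[i][j], i, j) for i in range(len(H)) for j in range(len(H[0]))),
            key=lambda t: t[0],
        )
        if score == 0:
            break  # nothing left to align
        path = traceback(H, a, b, i, j)
        found.append((score, path))
        blocked.update(path)  # zero these cells in the next round
    return found


# Two copies of "ACGT" in the first sequence, one in the second.
results = local_alignments("ACGTTTACGT", "ACGT", 2)
```

With these inputs the first round picks up the first repeat and the second
round, with those cells zeroed, picks up the other one. Each path is a list
of 1-based (i, j) matrix cells, so cell (i, j) pairs a[i-1] with b[j-1].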

One thing concerns me a little ... you mentioned "multiple alignment".
Local (Smith-Waterman) alignments will find the best matches, but you need a
strategy for the remainder of each sequence. Depending on your project
this could be anything from a global alignment to throwing the rest away
(alignments looking for protein domains do this, for example).

Hope this helps,

Peter