Multiple sequence alignment

From Bioinformatics.Org Wiki

Two common approaches to multiple sequence alignment (MSA) are progressive and iterative MSA. As the names imply, progressive MSA starts with one sequence and progressively aligns the others to it, while iterative MSA repeatedly realigns the sequences over multiple iterations of the process.

Progressive MSA

  1. Start with the most similar sequence.
  2. Align the new sequence to each of the previous sequences.
  3. Create a distance matrix/function for each sequence pair.
  4. Create a phylogenetic "guide tree" from the matrices, placing the sequences at the terminal nodes.
  5. Use the guide tree to determine the next sequence to be added to the alignment.
  6. Preserve gaps introduced in earlier steps.
  7. Go back to step 1 and repeat until all sequences have been added.
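
The distance and guide-tree steps above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the distance is a simple k-mer (Jaccard) distance on unaligned sequences rather than a distance derived from pair-wise alignments, the tree is built by average-linkage clustering, and the names `kmer_distance` and `guide_tree_order` are hypothetical. Real MSA programs may use different distances and tree-building methods.

```python
from itertools import combinations


def kmer_set(seq, k=2):
    """Return the set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def kmer_distance(a, b, k=2):
    """Assumed stand-in distance: 1 - Jaccard similarity of the k-mer sets."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    union = ka | kb
    if not union:
        return 0.0
    return 1.0 - len(ka & kb) / len(union)


def guide_tree_order(seqs, k=2):
    """Cluster sequences by average linkage and return the merge order.

    seqs maps a name to a sequence. Each merge is a pair of clusters,
    and each cluster is a tuple of sequence names (the terminal nodes).
    """
    names = list(seqs)
    dist = {
        frozenset((a, b)): kmer_distance(seqs[a], seqs[b], k)
        for a, b in combinations(names, 2)
    }

    def average_distance(c1, c2):
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters; this is the next alignment step.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


example = {"a": "ACGTACGT", "b": "ACGTACGA", "c": "TTTTGGGG"}
for left, right in guide_tree_order(example):
    print(left, "+", right)
```

The merge order gives the sequence in which alignments would be built: the most similar pair first, then progressively more distant sequences or groups.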

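Progressive MSA is one of the fastest approaches. It is considerably faster than the direct adaptation of pair-wise alignment to multiple sequences, in which all sequences are aligned simultaneously: the cost of that approach grows exponentially with the number of sequences, so it becomes very slow for more than a few sequences.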
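
To give a rough feel for the difference in speed, the sketch below counts dynamic-programming table cells only. It assumes that the simultaneous approach fills one cell per combination of positions in all sequences, that the progressive approach performs one pair-wise comparison per sequence pair plus one alignment per merge, and that every alignment stays close to the original sequence length. The function names are hypothetical and constant factors are ignored.

```python
def simultaneous_dp_cells(n_seqs, length):
    """Cells in an n-dimensional table for aligning all sequences at once."""
    return (length + 1) ** n_seqs


def progressive_dp_cells(n_seqs, length):
    """Cells filled by pair-wise comparisons plus one alignment per merge."""
    pairwise_comparisons = n_seqs * (n_seqs - 1) // 2
    merges = n_seqs - 1
    return (pairwise_comparisons + merges) * (length + 1) ** 2


for n in (2, 3, 5, 10):
    print(n, simultaneous_dp_cells(n, 100), progressive_dp_cells(n, 100))
```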

One major disadvantage, however, is the reliance on a good alignment of the first two sequences. Errors there can propagate throughout the rest of the MSA. An alternative approach is iterative MSA (see below).
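
The pair-wise alignment that everything else is built on can be sketched with the Needleman-Wunsch global alignment algorithm. The scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for illustration; practical tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pair-wise alignment; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def substitution(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell is the best of a substitution or a gap.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + substitution(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]


aligned_a, aligned_b, total = needleman_wunsch("GATTACA", "GATACA")
print(aligned_a)
print(aligned_b)
print(total)
```

Because the gaps placed here are kept in all later steps, a poor choice at this stage is carried into every subsequent alignment.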

Iterative MSA

In iterative MSA, an existing alignment is refined repeatedly: the sequences within subgroups are re-aligned pair-wise, and then the subgroups are re-aligned with each other. The subgroups can be chosen via sequence relations on the guide tree, by random selection, and so on.
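
One refinement round can be sketched as follows. This is a simplified illustration, not the method of any particular program: it assumes random selection of two subgroups, a sum-of-pairs score with assumed values (match +1, mismatch -1, residue against gap -2, gap against gap 0), and a group-to-group alignment by dynamic programming over alignment columns. It omits the pair-wise re-alignment inside each subgroup, and all function names are hypothetical. Every sequence is assumed to be non-empty and at least two sequences are required.

```python
import random

GAP = "-"


def pair_score(x, y, match=1, mismatch=-1, gap=-2):
    if x == GAP and y == GAP:
        return 0
    if x == GAP or y == GAP:
        return gap
    return match if x == y else mismatch


def sum_of_pairs(rows):
    """Score an alignment by summing pair scores within every column."""
    return sum(pair_score(col[i], col[j])
               for col in zip(*rows)
               for i in range(len(col)) for j in range(i + 1, len(col)))


def strip_gap_columns(rows):
    """Drop columns that contain only gaps within this subgroup."""
    cols = [c for c in zip(*rows) if any(ch != GAP for ch in c)]
    return ["".join(r) for r in zip(*cols)]


def align_groups(ga, gb):
    """Align two sub-alignments column by column, keeping their inner gaps."""
    ca, cb = list(zip(*ga)), list(zip(*gb))
    gap_a, gap_b = (GAP,) * len(ga), (GAP,) * len(gb)

    def col_score(x, y):
        return sum(pair_score(p, q) for p in x for q in y)

    n, m = len(ca), len(cb)
    s = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        s[i][0] = s[i - 1][0] + col_score(ca[i - 1], gap_b)
    for j in range(1, m + 1):
        s[0][j] = s[0][j - 1] + col_score(gap_a, cb[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s[i][j] = max(s[i - 1][j - 1] + col_score(ca[i - 1], cb[j - 1]),
                          s[i - 1][j] + col_score(ca[i - 1], gap_b),
                          s[i][j - 1] + col_score(gap_a, cb[j - 1]))
    cols, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and s[i][j] == s[i - 1][j - 1] + col_score(ca[i - 1], cb[j - 1]):
            cols.append(ca[i - 1] + cb[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and s[i][j] == s[i - 1][j] + col_score(ca[i - 1], gap_b):
            cols.append(ca[i - 1] + gap_b)
            i -= 1
        else:
            cols.append(gap_a + cb[j - 1])
            j -= 1
    cols.reverse()
    return ["".join(r) for r in zip(*cols)]


def refine(rows, rounds=20, seed=0):
    """Split into two random subgroups, re-align them, keep improvements."""
    rng = random.Random(seed)
    best, best_score = list(rows), sum_of_pairs(rows)
    for _ in range(rounds):
        idx = list(range(len(best)))
        rng.shuffle(idx)
        cut = rng.randint(1, len(idx) - 1)
        part_a, part_b = idx[:cut], idx[cut:]
        merged = align_groups(strip_gap_columns([best[i] for i in part_a]),
                              strip_gap_columns([best[i] for i in part_b]))
        candidate = [None] * len(best)
        for pos, i in enumerate(part_a + part_b):
            candidate[i] = merged[pos]  # restore the original row order
        score = sum_of_pairs(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


rows, score = refine(["ACGT--", "--ACGT", "ACGT--"])
print("\n".join(rows))
print(score)
```

A candidate is accepted only when it improves the score, so the score never decreases from one round to the next.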

At heart, iterative MSA is an optimization method and may use machine learning approaches such as genetic algorithms and hidden Markov models. Its disadvantages are inherited from optimization methods: the process can get trapped in local minima, and it can be much slower than progressive MSA.
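
The local-minimum problem is not specific to alignments. As a toy illustration, the sketch below runs a greedy descent on an invented one-dimensional cost function with two basins; the function `toy_cost` and its values are made up for this example and do not represent an alignment score.

```python
def toy_cost(x):
    """Invented cost: local minimum at x = 2 (cost 3), global at x = 8 (cost 0)."""
    return min((x - 2) ** 2 + 3, (x - 8) ** 2)


def greedy_descent(cost, start, step=1, max_iters=1000):
    """Move to the cheapest neighbour until no neighbour is cheaper."""
    x = start
    for _ in range(max_iters):
        best = min((x - step, x, x + step), key=cost)
        if best == x:
            return x  # no neighbour improves the cost: a (local) minimum
        x = best
    return x


print(greedy_descent(toy_cost, start=0))  # stops at 2, the local minimum
print(greedy_descent(toy_cost, start=6))  # stops at 8, the global minimum
```

Which minimum is reached depends only on the starting point, which is why the quality of the initial alignment and the strategy for choosing subgroups matter in iterative MSA.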

