
Imperfect Repeats

Imperfect repeats

= microsatellites containing mutations (substitutions, insertions, deletions)

most programs detect imperfect repeats under 1) the Hamming distance model (substitutions only); others apply 2) the edit distance model (substitutions, insertions, deletions)

e.g. 1) ACACACATACA (AC repeat with one substitution, C to T) or 2) ATGCATGCTATGCATGCATGC (ATGC repeat with one inserted T)
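
a minimal sketch of how the two models treat the two examples above. The function names (`perfect_repeat`, `hamming`, `edit_distance`) are illustrative assumptions, not taken from any particular repeat finder; each sequence is compared against a perfect repeat of its motif.

```python
def perfect_repeat(motif, length):
    """Perfect tandem repeat of `motif`, truncated to `length` nucleotides."""
    copies = -(-length // len(motif))  # ceiling division
    return (motif * copies)[:length]


def hamming(a, b):
    """Number of mismatching positions; only defined for equal lengths."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions, deletions cost 1 each."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion from a
                           cur[j - 1] + 1,           # insertion into a
                           prev[j - 1] + (x != y)))  # match or substitution
        prev = cur
    return prev[-1]


ex1 = "ACACACATACA"
ex2 = "ATGCATGCTATGCATGCATGC"

print(hamming(ex1, perfect_repeat("AC", len(ex1))))    # 1
print(hamming(ex2, perfect_repeat("ATGC", len(ex2))))  # 13
print(edit_distance(ex2, "ATGC" * 5))                  # 1
```

the single inserted T in example 2 shifts every following copy out of frame, so the Hamming model sees 13 mismatches where the edit distance model sees one event.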

similarity (or number of mutations) can be measured in several ways:

- as mismatch rate: on average every n-th nucleotide is a mismatch
- as average alignment score of all copies to a computed consensus sequence
- as average alignment score of two neighbouring copies over the entire array
- as average alignment score of two random copies within the array
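
the copy-based measures above can be sketched as follows. Assumptions: the array is cut into equal-length copies (substitutions only, no indels), fractional identity stands in for the alignment score, and the mean over all copy pairs stands in for "two random copies". All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def split_copies(seq, period):
    """Cut the array into full copies of length `period`; a trailing partial copy is dropped."""
    return [seq[i:i + period] for i in range(0, len(seq) - period + 1, period)]


def identity(a, b):
    """Fraction of matching positions between two equal-length copies."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def consensus(copies):
    """Majority nucleotide in each column of the stacked copies."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def score_to_consensus(copies):
    cons = consensus(copies)
    return mean(identity(c, cons) for c in copies)


def score_neighbours(copies):
    return mean(identity(a, b) for a, b in zip(copies, copies[1:]))


def score_all_pairs(copies):
    # expected score of two copies drawn at random without replacement
    return mean(identity(a, b) for a, b in combinations(copies, 2))


copies = split_copies("ACACACATACA", 2)  # ['AC', 'AC', 'AC', 'AT', 'AC']
print(consensus(copies))                 # AC
print(score_to_consensus(copies))        # 0.9
print(score_neighbours(copies))          # 0.75
print(score_all_pairs(copies))           # 0.8
```

the same single substitution gives three different scores, so values from programs using different measures are not directly comparable.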

programs (crux)

- search strategy: heuristic?!, modelling, autocorrelation
- filter: statistical criteria, E-value, p-value, biological significance
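
a toy illustration of the autocorrelation idea: for each lag, count how often a nucleotide equals the one `lag` positions further on, and take the smallest lag with a high match fraction as the period. This is a sketch only, not the method of any specific program; the function names and the 0.7 threshold are assumptions chosen for the example.

```python
def autocorrelation(seq, max_lag):
    """Match fraction between seq and itself shifted by each lag in 1..max_lag."""
    scores = {}
    for lag in range(1, min(max_lag, len(seq) - 1) + 1):
        pairs = len(seq) - lag
        matches = sum(seq[i] == seq[i + lag] for i in range(pairs))
        scores[lag] = matches / pairs
    return scores


def smallest_period(seq, max_lag, threshold=0.7):
    """Smallest lag whose match fraction reaches the (hypothetical) threshold, else None."""
    for lag, score in sorted(autocorrelation(seq, max_lag).items()):
        if score >= threshold:
            return lag
    return None


scores = autocorrelation("ACACACATACA", 4)
print(scores[2], scores[4])               # 7/9 and 6/7 as floats
print(smallest_period("ACACACATACA", 4))  # 2
```

multiples of the true period also score high (lag 4 even beats lag 2 here), which is why the sketch takes the smallest qualifying lag instead of the maximum. An indel weakens the signal more than a substitution: for ATGCATGCTATGCATGCATGC the lag-4 match fraction is only 12/17.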