About
The W-curve takes a genome sequence and, using an auto-regressive
family of iterated functions, transforms it into a three-dimensional plot in
which each vertex corresponds to a single base. This geometric visualization
of a sequence can be used to locate patterns that would be extremely difficult
or impossible to detect using traditional character-string-based methods.
W-curves of the human and fetal hemoglobin genes reveal a much longer
duplication than that shown by BLASTN and GenBank analysis.
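As a rough illustration of the transformation described above, the following minimal sketch maps each base to one 3D vertex. It assumes a chaos-game-style rule: the four bases sit at the corners of a square, each step moves the current (x, y) point part of the way toward the corner of the current base, and z is the position in the sequence. The corner assignment in CORNERS, the name w_curve, and the ratio parameter are hypothetical choices made for this example and may not match the iterated functions used by the software.

```python
# Hypothetical corner assignment for the four bases (an assumption for
# this sketch; the actual software may place the bases differently).
CORNERS = {
    "A": (1.0, 1.0),
    "C": (-1.0, 1.0),
    "G": (-1.0, -1.0),
    "T": (1.0, -1.0),
}


def w_curve(sequence, ratio=0.5):
    """Return one (x, y, z) vertex per base of `sequence`.

    Each (x, y) depends on the previous point (the auto-regressive part):
    the point moves a fraction `ratio` of the way toward the corner of the
    current base. z is the 0-based position of the base in the sequence.
    Characters other than A, C, G, T raise KeyError in this sketch.
    """
    x, y = 0.0, 0.0  # start at the center of the square
    vertices = []
    for z, base in enumerate(sequence.upper()):
        cx, cy = CORNERS[base]
        x += ratio * (cx - x)
        y += ratio * (cy - y)
        vertices.append((x, y, float(z)))
    return vertices


if __name__ == "__main__":
    for vertex in w_curve("ACGTTGCA"):
        print(vertex)
```

Because every point depends on the points before it, a repeated subsequence tends to trace a similar shape wherever it occurs, which is what makes long repeats visible when the vertices are plotted in three dimensions.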
It would appear that the W-curve is a useful algorithm for visually:
- Categorizing long strings of repetitive DNA
- Locating topological consensus among aligned, homologous, long repetitive
sequences
- Analyzing the molecular evolution of gene families.
The W-curve presents a compact visual picture for rapidly curating and
analyzing long repetitive genomic sequences. W-curves may also be used to
build consensus on, and strengthen, proposed conjectures and models relating
to recombination events involving long repetitive genomic sequences.
Purpose/Goals:
- To provide researchers with a software suite that allows them to
convert genome sequences into W-curves and visualize long genomic repeats
and other long patterns embedded in long genomic sequences. Researchers are
no longer restricted to a BLASTN search, which yields only relatively short
stretches of similar sequence.
- To provide a comprehensive database of W-curves that will be available when
an investigator wants to search for similar patterns, align them, and build
phylogenetic trees from them.
The software is designed to visualize patterns that have been accurately
aligned, with insertions, deletions, and gaps, for comparison with similar
sequences in different organisms. Once the investigator visually spots a long
embedded pattern, tools are available for clustering the long genomic
sequences.