Highlighting sequence patterns
[Alignment-Contextmenu>View] This dialog is used to underline all occurrences of sequence
patterns.
The table contains three check-boxes to toggle visualization on and
off for the sequence alignment window, the horizontal scrollbar, and
the 3D-views.
A fourth check-box indicates whether the search string is applied to
the amino acid sequence or to the nucleotide sequence.
The nucleotide option takes effect only if the amino acid sequence
was translated from a nucleotide sequence.
The search string follows UNIX regular expression syntax.
The notation is explained briefly with some useful
examples.
Several search queries can be entered in one line; they must be separated by white space.
- "FSP" is a simple search string identifying all
occurrences of "FSP" in the amino acid sequence. Each position is
highlighted in the specified color.
- "F.P" leaves the second residue undefined since dots match any
character.
- "IFSP|TFSP" is equivalent to "[IT]FSP" or "(IF|TF)SP" or "IFSP TFSP" and
highlights all occurrences of IFSP and TFSP. Vertical bars separate
alternatives.
- "(.....)\1" selects all penta repeats, i.e. five residues followed
immediately by the same five residues. A backslash-number is a
reference to a parenthesized group.
- "[FYW][FYW]" selects adjacent aromatic amino acids. Square
brackets enclose groups of letters at one single sequence
position.
- "CGT|CGC|CGA|CGG|AGA|AGG" highlights all codons for
Arg. Searching the underlying nucleotide sequence requires the
check-box "NT" to be checked.
- "[ke]{3,99}" highlights all occurrences of 3 to 99 consecutive
positions containing either Lys or Glu.
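The examples above can be tried outside the program. The following is
a minimal sketch using Python's re module, which is an assumption made
for illustration and not the regular expression engine of this dialog;
the helper name find_matches and the test sequences are invented.

```python
import re

def find_matches(pattern, sequence):
    """Return (start, end) spans of all non-overlapping matches, ignoring case."""
    return [m.span() for m in re.finditer(pattern, sequence, flags=re.IGNORECASE)]

# Invented sequences, used only to exercise the patterns.
protein = "MIFSPTFSPWYAKEKEL"
repeat_protein = "AGTLKSGTLKSA"
nucleotides = "ATGCGTAGATAA"

examples = [
    ("FSP", protein),                             # literal string
    ("F.P", protein),                             # dot matches any residue
    ("IFSP|TFSP", protein),                       # alternatives
    ("[IT]FSP", protein),                         # same result as the line above
    (r"(.....)\1", repeat_protein),               # five residues repeated in tandem
    ("[FYW][FYW]", protein),                      # adjacent aromatic residues
    ("CGT|CGC|CGA|CGG|AGA|AGG", nucleotides),     # Arg codons, any reading frame
    ("[ke]{3,99}", protein),                      # runs of Lys/Glu
]

for pattern, sequence in examples:
    print(pattern, find_matches(pattern, sequence))
```

Note that a plain regular expression search on the nucleotide sequence
does not respect the reading frame, so a codon pattern may also match
across codon boundaries.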
The check-boxes in the first table columns determine where the matches are shown.
- 1D Highlight matches in alignment panes
- 3D Highlight matches in 3D backbone views
- SB Highlight matches in the scroll-bar
- NT Highlight patterns in the nucleotide sequence. The aa-sequence of the protein must be translated from an nt-sequence.
The text field takes a list of blank-separated regular expressions.
Matching is case insensitive.
E.g. '[AG]..[KR]' matches all occurrences of
Ala or Gly at one position n
and Lys or Arg at position n+3.
The two dots match any residue.
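As a rough illustration of how a blank-separated, case-insensitive
list of expressions could be evaluated, here is a sketch in Python.
The function names are hypothetical, Python's re module stands in for
the program's own matcher, and counting only non-overlapping matches
is an assumption.

```python
import re

def compile_queries(field_text):
    """Split the text field on white space and compile each expression."""
    return [re.compile(expr, re.IGNORECASE) for expr in field_text.split()]

def highlighted_positions(field_text, sequence):
    """Return the sorted 0-based positions covered by any match of any expression."""
    positions = set()
    for regex in compile_queries(field_text):
        for match in regex.finditer(sequence):
            positions.update(range(match.start(), match.end()))
    return sorted(positions)

# Lower-case input still matches the upper-case pattern.
print(highlighted_positions("[AG]..[KR]", "mavlkqgtpr"))
```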
See ion for a more detailed description.
One table row may contain several expressions separated by white space.
Each expression will create a DIALOG:ResidueSelection
for each protein.
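The relation between table rows, expressions and proteins can be
pictured with the following sketch. The class Selection and the
function selections_for_row are hypothetical stand-ins and not the
program's actual data structures; whether a selection is also created
for a protein without any match is an assumption.

```python
import re
from dataclasses import dataclass

@dataclass
class Selection:
    """Hypothetical stand-in for one residue selection."""
    protein: str
    expression: str
    positions: list

def selections_for_row(row_text, proteins):
    """Create one Selection per (expression, protein) pair of a table row."""
    result = []
    for expr in row_text.split():
        regex = re.compile(expr, re.IGNORECASE)
        for name, sequence in proteins.items():
            covered = {i for m in regex.finditer(sequence)
                       for i in range(m.start(), m.end())}
            result.append(Selection(name, expr, sorted(covered)))
    return result

# Invented protein names and sequences.
proteins = {"p1": "MIFSPK", "p2": "ATFSPA"}
for selection in selections_for_row("IFSP TFSP", proteins):
    print(selection)
```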