Alignment Annotator annotates and renders sequence alignments.
Coping with huge numbers of annotations
Retrieval of information from annotation services and 3D-structure files
Optimized layout of underlined residue selections for compactness
Sequence alignment interlinked with 3D structure visualization
All major web browsers (except IE below 10)
All operating systems (Windows, Apple, UNIX, iOS, Android)
3D-Visualization is currently based on Java. Modern Chrome and MS-Edge have stopped supporting Java.
MS-Word, Libre-Office, Open-Office. This allows for further editing for example to create figures for publications.
Desktop applications: Jalview, Strap
Alignment Annotator does not depend on plugins or Java.
The resulting HTML document is interactive and can be included in web pages.
The start page has a text area where the sequences can be entered in different ways.
Aligned sequences. All standard sequence and alignment formats are supported. Gaps must be written as dashes.
Sequences without gaps. All standard multiple sequence file formats are supported. In this case the system will align the sequences automatically.
References (URL or database id) for alignment documents, sequence files or PDB structure files.
Amino acid sequences and nucleotide sequences are supported.
Nucleotide sequences are assumed if all sequences are composed of the letters A, C, T, G and N,
unless the alignment type is explicitly set with the script command set_alignment_type_P in the first script text.
Residue annotations from bioinformatics services and 3D visualization are only available for amino acid sequences.
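The detection rule above can be sketched in a few lines of Python. This is only an illustration of the stated rule, not the server's actual code. The function name is hypothetical, and ignoring gap characters and letter case is an assumption.

```python
def looks_like_nucleotides(sequences):
    """Return True if every sequence consists only of A, C, T, G and N.

    Assumption: gap characters ("-") and letter case are ignored.
    """
    allowed = set("ACTGN")
    return all(set(seq.upper().replace("-", "")) <= allowed for seq in sequences)
```

With this sketch, a set of sequences such as "ACGT" and "NNAC-" would be treated as nucleotides, whereas a single sequence containing other letters (for example "MKV") would switch the whole input to amino acids.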
The sequence alignment display and its user interface
After submitting the sequences, the alignment is displayed.
It is interactive and allows:
Changing order of sequences by dragging the mouse
Hiding of sequences and annotation by dragging into the trash
Balloon messages and context panels on sequences and annotations.
The menus and tool-bars are initially hidden and become visible by clicking .
There are two levels:
The top panel outside the black frame is a classical web form.
It allows changing those features of the alignment that require computation at server side.
Changes take effect only after pressing the Upload button.
The area surrounded by a black frame is an embedded HTML frame inside the main HTML page.
The user interface responds immediately without accessing the server.
documentation of the alignment view.
Translation into peptide sequences:
If nucleotide sequences are provided, the amino acid sequences can be predicted
with the script command translate_cds.
In this case the amino acid sequence alignment is shown.
Example translating the first to the 99th nucleotide:
translate_cds 1..99 , *
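To illustrate what translating a nucleotide range into a peptide amounts to, here is a minimal Python sketch. It assumes the standard genetic code, 1-based inclusive positions, and a reading frame that starts at the first position of the range. The function name and the use of "X" for unknown codons are assumptions for illustration and do not describe how translate_cds is implemented.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_range(nucleotides, first, last):
    """Translate positions first..last (1-based, inclusive) into amino acids.

    Codons that are incomplete are dropped; codons with unknown letters
    (for example N) are written as "X". Stop codons are written as "*".
    """
    cds = nucleotides[first - 1:last].upper()
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, usable, 3))
```

For a 99-nucleotide range this yields 33 residues, which is the peptide that would then be shown in the amino acid alignment.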
Example translating the first to the last nucleotide:
The submitted sequences are aligned on the server, unless at least one of the sequences contains the gap character (dash, "-").
In most cases, computation is performed with ClustalW.
More accurate methods like T-Coffee can be selected
with the script command use_aligner in the first script.
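The decision whether the server computes an alignment can be expressed as a small check. The following Python sketch only restates the rule given above. The function name is hypothetical.

```python
def needs_server_alignment(sequences):
    """Return True if no submitted sequence contains the gap character "-"."""
    return not any("-" in seq for seq in sequences)
```

A single dash in any one sequence is therefore enough for the input to be treated as already aligned.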
Mixed sequence / 3D structure alignment can improve alignment quality
of remote homologs and is conducted if structure data is loaded at the
time of alignment computation for at least two sequences. This is the case if
the input data contains PDB IDs
or if PDB files are loaded by script commands in the first script text
or if the second script contains an align-command while mapping of homologous structures is enabled.
3D structure alignment is time consuming. By default, the program TM-align is used which can align two structures.
If more than two structures are provided, all n times (n-1) pair alignments are computed by the server program.
The advantage of performing each pair alignment is that each intermediate result is stored in the cache and interrupted computations can be resumed.
Conversely, structure alignment methods which natively align more than two structures often yield better results and can be selected with the
script command use_aligner3D. However, in that case only the final result is stored in the cache and interrupted computations are lost.
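Why pairwise alignment is resumable can be seen in a short Python sketch. It assumes a dictionary as the cache and a caller-supplied function align_pair that stands in for a pairwise aligner such as TM-align. Both names are hypothetical and the sketch is not the server's implementation.

```python
from itertools import permutations


def align_all_pairs(structures, align_pair, cache):
    """Align all n * (n - 1) ordered pairs, skipping pairs already cached.

    Each result is stored as soon as it is available, so a run that is
    interrupted can be repeated and continues with the missing pairs only.
    """
    for a, b in permutations(structures, 2):
        if (a, b) not in cache:
            cache[(a, b)] = align_pair(a, b)
    return cache
```

Calling the function again with the same cache after an interruption recomputes nothing that has already finished. A method that aligns all structures in a single step has no such intermediate results to store.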
For time-consuming computations, please use the locally running Strap program (Java) rather than the Alignment Annotator server.
The result of the 3D-alignment is then used to also align those sequences without 3D-structure, using ClustalW and T-Coffee.
If the computation time exceeds a certain number of seconds, it may be stopped by requests submitted later by the same or by other users.
This will be indicated in the browser page.
The user can resume the computation by reloading the page or
change it to Long-Computation-Mode, which will run much longer before it can be interrupted by another job.
The disadvantage is that if the queue is not empty, it can take a long time before the job is processed.
Residue annotations are shown either as a colored background or as an underline of residues.
All attached information can be inspected by opening the context panel (Right-click).
On touch screens, the cursor can be roughly positioned with the finger and moved exactly on the annotated residue with two arrow buttons.
When the cursor lies on an annotated residue, another button becomes active which opens the context panel.
Explicit definition: Annotations can be defined
explicitly using GFF-format with tabulator or vertical bar as field separator typed into the text box
in Change > Annotations > Own.
The nine fields are:
Sequence | Source (leave empty) | Name | Start | End | Score | Nucleotide Strand | (leave empty) | Attributes.
The data can be typed directly into the text-field, using the tabulator key for automatic sequence name completion.
It can also be prepared in a spread-sheet program like MS-Excel. To get the lines from Excel to the text-field, use Ctrl-C / Ctrl-V for copy and paste.
For example the following line will select residue 24 of the specified sequence:
Colors are written as Red-Green-Blue Hexadecimal triplets and can be set in the field Attributes:
If the amino acid sequence is translated from a coding nucleotide sequence as described above then the positions can refer to the coding nucleotide sequence by typing "+" into field 7.
The attribute Hide=true combined with the default style (UNDERLINE) deactivates the residue selection. It can be activated (and de-activated) with a toggle-button after the sequence line.
There are the following enhancements over the standard GFF format:
If Start equals End, i.e. only one position is selected, the field End can be omitted.
If the value of End is a plus sign followed by a number, the end position is the start position plus that number.
Start can contain a complex expression allowing for non-consecutive sequence positions. In this case End must remain empty.
For example 10-20,30-40 selects residues 10 to 20 plus 30 to 40.
As in Rasmol/Jmol, 20:-30: refers to PDB residue numbers 20 to 30. The chain ID after the colon can be omitted.
Lines starting with let are variable assignments. Variables can be used for frequently used text fragments to reduce the amount of typed text.
let $t=Hello world
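How the Start and End enhancements listed above resolve to residue positions can be sketched in Python. This is an illustrative reading of the rules, not the parser used by Alignment Annotator. The function name is hypothetical, and PDB residue numbers (the colon form) and let variables are deliberately not handled.

```python
def resolve_positions(start, end=""):
    """Return the list of sequence positions selected by Start and End.

    Covers: End omitted (single position), End given as "+N",
    and a complex Start such as "10-20,30-40" with End left empty.
    """
    start, end = start.strip(), end.strip()
    if ":" in start:
        raise ValueError("PDB residue numbers are not handled in this sketch")
    if "," in start or "-" in start:
        if end:
            raise ValueError("End must remain empty for a complex Start")
        positions = []
        for part in start.split(","):
            low, _, high = part.partition("-")
            low = int(low)
            high = int(high) if high else low
            positions.extend(range(low, high + 1))
        return positions
    first = int(start)
    if not end:
        last = first                    # End omitted: one position
    elif end.startswith("+"):
        last = first + int(end[1:])     # End is an offset from Start
    else:
        last = int(end)                 # plain GFF End
    return list(range(first, last + 1))
```

Under these assumptions, a Start of 24 with an empty End selects only residue 24, and a Start of 10 with an End of +3 selects residues 10 to 13.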
Annotations can also be defined with script commands, which are, however, more verbose than the GFF notation.
Retrieval from services:
The UniProt sequence features and Catalytic Site Atlas residues are directly stored on the server and will be quickly available.
The BioDAS services, however, are loaded from remote servers, which causes some delay.
The default BioDAS services are cbs_total, netphos and netoglyc.
Any BioDAS residue annotation is available with script command DAS_features name, sequences.
Limitations: Currently, only UniProt centered BioDAS-annotations are supported.
Advanced program features not accessible by the graphical user interface require
All project data can be downloaded to the hard disk such that the project can be continued at any time on any Alignment Annotator server.
In addition, a Strap script file can be downloaded which allows continuation of the project using the desktop program Strap.
A Zip file can be downloaded. It has all required files for embedding the alignment in web sites.
Using Alignment Annotator as a view option for sequence alignments for other web services