Alignment Annotator

Author: christoph.gille   charite.de   Institut für Biochemie, Charité, Berlin

Summary

Alignment Annotator annotates and renders sequence alignments. The resulting HTML document is interactive and can either stand alone or be included in other web pages.

Entering sequences

The start page has a text area where the sequences can be entered in several different ways. Examples can be viewed by activating the check-box Sample input data. Amino acid sequences and nucleotide sequences are supported. Nucleotide sequences are assumed if all sequences are composed of the letters A, C, T, G and N, unless the alignment type is explicitly set with the command set_alignment_type_N or set_alignment_type_P in the first script text. Annotations from services and 3D visualization are only available for amino acid sequences.
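The detection rule described above can be sketched in a few lines of Python. This is only an illustration, not the server's code: the function name guess_alignment_type, the explicit_type parameter (standing in for set_alignment_type_N / set_alignment_type_P) and the decision to ignore gap characters are assumptions made for this example.

```python
def guess_alignment_type(sequences, explicit_type=None):
    """Return 'N' for nucleotide or 'P' for protein input.

    explicit_type is a hypothetical stand-in for the script commands
    set_alignment_type_N / set_alignment_type_P and overrides detection.
    """
    if explicit_type in ("N", "P"):
        return explicit_type
    nucleotide_letters = set("ACGTN")
    gap_characters = set("-. ")  # assumption: gaps do not count as residues
    for sequence in sequences:
        residues = set(sequence.upper()) - gap_characters
        if not residues <= nucleotide_letters:
            # At least one letter outside A, C, G, T, N: treat as protein.
            return "P"
    return "N"


if __name__ == "__main__":
    print(guess_alignment_type(["ACGTTN", "acg-tt"]))
    print(guess_alignment_type(["ACGT", "MKVLA"]))
    print(guess_alignment_type(["ACGT"], explicit_type="P"))
```

Note that a short peptide consisting only of the letters A, C, G, T and N would be taken for a nucleotide sequence under this rule, which is why an explicit setting exists.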

Translation into peptide sequences:

If coding nucleotide sequences are provided and amino acid sequences are predicted with the script command translate_cds in the first script text, an amino acid sequence alignment is shown. Example translating the first to the 99th nucleotide:
 translate_cds  1..99 , *
Example translating the first to the last nucleotide:
 translate_cds  1.. , *
Example with intron spanning position 32 to 103:
 translate_cds  complement(20..31,104..222) , sequenceName
Instead of an explicit CDS expression, a specific protein name from a given EMBL or GenBank formatted file can be entered. The translate_cds command should be entered into the first script text so that alignment computation acts on the amino acid sequences.
If the translate_cds command was in the 2nd script text, the alignment would be computed for the nucleotide sequences and then the nucleotide sequences would be translated.
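What a CDS expression such as 1..99 or a list of exon ranges amounts to can be sketched in Python as follows. This is a simplified illustration under stated assumptions: positions are taken as 1-based and inclusive, the standard genetic code is used, an open end (as in 1..) is represented by None, and strand handling such as complement(...) is not covered. The function name translate_segments is hypothetical and not part of Alignment Annotator.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_segments(nucleotides, segments):
    """Concatenate the given 1-based inclusive ranges and translate them.

    segments is a list of (start, end) tuples; end may be None to mean
    'up to the last nucleotide'. Unknown codons are rendered as 'X'.
    """
    coding = "".join(nucleotides[start - 1:end] for start, end in segments)
    coding = coding.upper()
    usable_length = len(coding) - len(coding) % 3  # drop an incomplete codon
    return "".join(
        CODON_TABLE.get(coding[i:i + 3], "X")
        for i in range(0, usable_length, 3)
    )


if __name__ == "__main__":
    # Whole sequence, comparable to the range 1..
    print(translate_segments("ATGGCCTAA", [(1, None)]))
    # Two exons separated by an intron at positions 4 to 6.
    print(translate_segments("ATGNNNGCCTAA", [(1, 3), (7, 12)]))
```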

Alignment computation

In most cases, server-side alignment computation is performed with ClustalW. More accurate methods like T-Coffee can be selected with the script command use_aligner in the first script. Mixed sequence / 3D structure alignment can improve the alignment quality of remote homologs and is conducted if structure data is loaded at the time of alignment computation for at least two sequences. Note that 3D structure alignment is time consuming. By default, the program TM-align is used, which can align two structures. If more than two structures are provided, all n times (n-1) pair alignments are computed by the server program. The advantage of performing each pair alignment separately is that each intermediate result is stored in the cache, so interrupted computations can be resumed. Conversely, structure alignment methods which natively align more than two structures often yield better results and can be selected with the script command use_aligner3D. However, only the final result is stored in the cache and interrupted computations are lost. For time-consuming computations, please use the locally running Strap program (Java) rather than the Alignment Annotator server. The result of the 3D alignment is finally used to also align those sequences without 3D structure using ClustalW and T-Coffee.
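The benefit of caching each pair alignment can be illustrated with a small Python sketch. It is not the server implementation: align_all_pairs, the align_pair callback and the dictionary used as cache are hypothetical names chosen for this example, and the real pairwise aligner (TM-align by default) is replaced by a placeholder function.

```python
from itertools import permutations


def align_all_pairs(structure_ids, align_pair, cache):
    """Compute all n * (n - 1) ordered pair alignments, reusing cached ones.

    align_pair(first, second) is a placeholder for the real pairwise
    structure aligner. cache maps (first, second) to a stored result, so
    a run that was interrupted earlier only computes the missing pairs.
    """
    for first, second in permutations(structure_ids, 2):
        key = (first, second)
        if key not in cache:
            cache[key] = align_pair(first, second)
    return cache


if __name__ == "__main__":
    def placeholder_aligner(first, second):
        print("aligning", first, "with", second)
        return first + "/" + second

    # One result survived from an earlier, interrupted run.
    cache = {("A", "B"): "A/B"}
    align_all_pairs(["A", "B", "C"], placeholder_aligner, cache)
    print(len(cache), "pair alignments available")
```

A method that aligns all structures in a single step has no such intermediate results, which is why an interruption there means starting over.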

The alignment view

After submitting the data to the server, the rendered and interactive alignment will be shown in an embedded frame. It may happen that the job prematurely terminates due to timeout. In this case, the computation can be resumed by reloading the browser page. The alignment view will look like:

The alignment view has a graphical user interface which is independent of the server. It is described in detail in the documentation of the alignment view.

Annotations

Following the GFF syntax, annotations are entered into the text field in Change > Annotations > Own. The tabulator key triggers word-completion for sequence names.

Explicit definition: Annotations can be defined explicitly using GFF-format with tabulator or vertical bar as field separator.
The nine fields are: Sequence | Source (leave empty) | Name | Start | End | Score | Nucleotide Strand | (leave empty) | Attributes.
For example the following line will select residue 24 of the specified sequence:
seqNameOrNumber|.|Modified residue|24|24|.|.|.|
Colors are written as Red-Green-Blue Hexadecimal triplets and can be set in the field Attributes:
seqNameOrNumber|.|Modified residue|24|24|.|.|.|Color=#00ff00
Otherwise a table of feature names and colors is used which can be edited with the command feature_colors.
 
feature_colors Modified_residue=#00AA00 Phosphoserine=#00ff00
If there is no matching entry, the default color is used. The following specifies the color, balloon message, style and 3D style:
seqNameOrNumber|.|Modified residue|24|24|.|.|.|Color=#00ff00; Balloon="hello world"; 3D_view=spheres; Style=BACKGROUND
If the amino acid sequence is translated from a coding nucleotide sequence as described above, the positions can refer to the nucleotide sequence by setting the field Nucleotide Strand to "+". A minus sign, which denotes the reverse complementary strand in GFF, is not supported. The attribute Hide=true combined with the default style (UNDERLINE) deactivates the residue selection, which can then be activated with a check-box.
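How such an annotation line breaks down into its nine fields, and how the color could be chosen, is sketched below in Python. This is an illustration only: the function names, the dictionary keys and the default color value are assumptions, as is the mapping of spaces in feature names to underscores when looking them up in the feature_colors table.

```python
FIELD_NAMES = [
    "sequence", "source", "name", "start", "end",
    "score", "strand", "unused", "attributes",
]


def parse_annotation(line):
    """Split one annotation line (tab or vertical bar separated) into fields."""
    separator = "\t" if "\t" in line else "|"
    fields = [field.strip() for field in line.split(separator, 8)]
    fields += [""] * (9 - len(fields))  # tolerate missing trailing fields
    record = dict(zip(FIELD_NAMES, fields))
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    attributes = {}
    for part in record["attributes"].split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            attributes[key.strip()] = value.strip().strip('"')
    record["attributes"] = attributes
    return record


def resolve_color(record, feature_colors, default_color="#888888"):
    """Color attribute first, then the feature table, then a default.

    default_color is a placeholder; the actual default is not stated here.
    """
    if "Color" in record["attributes"]:
        return record["attributes"]["Color"]
    table_key = record["name"].replace(" ", "_")  # assumed naming rule
    return feature_colors.get(table_key, default_color)


if __name__ == "__main__":
    line = ('seqNameOrNumber|.|Modified residue|24|24|.|.|.|'
            'Color=#00ff00; Balloon="hello world"; 3D_view=spheres; Style=BACKGROUND')
    record = parse_annotation(line)
    print(record["name"], record["start"], record["end"])
    print(record["attributes"])
    print(resolve_color(record, {"Modified_residue": "#00AA00"}))
```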
There are the following enhancements over the standard GFF format: Lines starting with let are variable assignments. Variables can be used for frequently used text fragments to reduce the amount of typed text. Example:
let $t=Hello world
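The effect of such an assignment can be pictured as plain text substitution, sketched here in Python. How variables are referenced in later lines is not spelled out above, so the sketch assumes that the variable name (including the dollar sign) is simply replaced by its value wherever it occurs. The function name expand_variables is hypothetical.

```python
def expand_variables(lines):
    """Apply 'let $name=value' assignments to the lines that follow them."""
    variables = {}
    expanded = []
    for line in lines:
        if line.startswith("let "):
            name, _, value = line[4:].partition("=")
            variables[name.strip()] = value
            continue  # assignment lines are not annotations themselves
        # Replace longer names first so that $t does not clobber $title.
        for name in sorted(variables, key=len, reverse=True):
            line = line.replace(name, variables[name])
        expanded.append(line)
    return expanded


if __name__ == "__main__":
    script = [
        "let $t=Hello world",
        'seqNameOrNumber|.|Modified residue|24|24|.|.|.|Balloon="$t"',
    ]
    for line in expand_variables(script):
        print(line)
```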
Annotations can also be defined with script commands, which is, however, more verbose than the GFF notation.

Retrieval from services: The UniProt sequence features and Catalytic Site Atlas residues are stored directly on the server and are quickly available. The BioDAS services, however, are loaded from remote servers, which causes some delay. The default BioDAS services are cbs_total, netphos and netoglyc. All are available with the script command DAS_features name, sequences. The selection of the color is described above.
Limitations: Currently, only UniProt centered BioDAS-annotations are supported.

Scripts

Advanced program features not accessible by the graphical user interface require script commands.

Download/Save

A Zip file can be downloaded. It has all required files for embedding the alignment in web sites. It also contains the URL for opening Alignment Annotator and a Strap file for opening the alignment in Strap.

Embedding in web services

Read here about the Programming interface.
Alignment Annotator can be employed for visualization in Bioinformatics web services.
Acknowledgements