Author:
christophgillecharitede
Institut für Biochemie, Charité, Berlin
Summary
Alignment Annotator annotates and renders sequence alignments.
Program features:
Annotations
Coping with huge numbers of annotations
Retrieval of information from annotation services and 3D-structure files
Optimized layout of underlined residue selections for compactness
3D
Sequence alignment interlinked with 3D structure visualization
Compatibility
All major web browsers (except IE)
All operating systems (Windows, Apple, UNIX, iOS, Android)
Tablet support
Export
MS-Word, Libre-Office, Open-Office. This allows for further editing, for example to create figures for publications.
Clustal, Fasta
Alignment Annotator does not require Java.
The resulting HTML document is interactive and can either stand alone or be included in other web pages.
Entering sequences
The start page has a text area where sequences can be entered in several different ways (activate the check-box Sample input data to see examples):
As an alignment. All standard sequence and alignment formats are supported. Gaps are denoted by dashes.
Sequences without gaps. All standard multiple sequence file formats are supported. In this case the system will align the sequences automatically.
As a reference (URL or database ID) to alignment documents, sequence files or PDB structure files.
Amino acid sequences and nucleotide sequences are supported.
Nucleotide sequences are assumed if all sequences are composed only of the letters A, C, T, G and N,
unless the alignment type is explicitly set with the commands set_alignment_type_N or set_alignment_type_P in the first script text.
Annotations from services and 3D visualization are only available for amino acid sequences.
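The detection rule described above can be sketched in a few lines of Python. This is only an illustration, not the server's implementation: the function name guess_alignment_type and the explicit_type parameter are hypothetical, and ignoring letter case and gap characters is an assumption.

```python
NUCLEOTIDE_LETTERS = set("ACTGN")

def guess_alignment_type(sequences, explicit_type=None):
    """Return 'N' (nucleotide) or 'P' (protein) for a list of sequences.

    explicit_type stands in for set_alignment_type_N / set_alignment_type_P
    and, if given, overrides the automatic guess.
    """
    if explicit_type in ("N", "P"):
        return explicit_type
    for seq in sequences:
        # Assumption: case is ignored and gap dashes do not count as letters.
        residues = seq.upper().replace("-", "")
        if not set(residues) <= NUCLEOTIDE_LETTERS:
            return "P"  # at least one letter outside A, C, T, G, N
    return "N"
```

A peptide that happens to consist only of A, C, G, T and N would be misclassified by such a rule, which is the situation where setting the alignment type explicitly helps.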
Translation into peptide sequences:
If coding nucleotide sequences are provided and translated into
amino acid sequences with the script command translate_cds in the
first script text, an amino acid sequence alignment is shown.
Example translating the first to the 99th nucleotide:
translate_cds 1..99 , *
Example translating the first to the last nucleotide:
Instead of an explicit CDS expression, a specific protein name from a given EMBL or GenBank formatted file can be entered.
The translate_cds command should be entered into the first script text such that alignment computation acts on amino acid sequences.
If the translate_cds command were in the second script text, the alignment would be computed for
the nucleotide sequences, and only then would the nucleotide sequences be
translated.
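To illustrate what translating a range such as 1..99 means (99 nucleotides, i.e. 33 codons), here is a minimal Python sketch using the standard genetic code. It is not the translate_cds implementation: the function name translate_range is hypothetical, and writing stop codons as "*" and unknown codons as "X" is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate_range(nucleotides, first, last):
    """Translate nucleotides first..last (1-based, inclusive) into a peptide."""
    cds = nucleotides.upper()[first - 1:last]
    usable = len(cds) - len(cds) % 3  # drop an incomplete trailing codon
    # Assumption: codons containing N or other letters become 'X'.
    return "".join(CODON_TABLE.get(cds[i:i + 3], "X")
                   for i in range(0, usable, 3))
```

For example, translate_range("ATGGCCTAA", 1, 9) yields "MA*".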
Alignment computation
In most cases, server-side alignment computation is performed with ClustalW.
More accurate methods such as T-Coffee can be selected
with the script command use_aligner in the first script text.
Mixed sequence / 3D structure alignment can improve alignment quality
of remote homologs and is conducted if structure data is loaded at the
time of alignment computation for at least two sequences. This is the case if
the input data contains PDB IDs
or if PDB files are loaded by script commands in the first script text
or if the second script contains an align-command while mapping of homologous structures is enabled.
3D structure alignment is time-consuming. By default, the program TM-align is used, which aligns two structures at a time.
If more than two structures are provided, all n × (n-1) pairwise alignments are computed by the server program.
The advantage of performing each pairwise alignment separately is that each intermediate result is stored in the cache, so interrupted computations can be resumed.
Conversely, structure alignment methods that natively align more than two structures often yield better results and can be selected with the
script command use_aligner3D. However, only the final result is stored in the cache, and interrupted computations are lost.
For time-consuming computations, please use the locally running Strap program (Java) rather than the Alignment Annotator server.
Finally, the result of the 3D alignment is used to also align the sequences without a 3D structure, using ClustalW and T-Coffee.
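The resumable pairwise strategy described above can be sketched as follows. This is a simplified illustration rather than the server's code: align_all_pairs, align_pair and the dictionary used as a cache are hypothetical stand-ins, and a real cache would be persistent.

```python
from itertools import permutations

def align_all_pairs(structure_ids, align_pair, cache):
    """Compute all n * (n - 1) ordered pair alignments, reusing cached results.

    align_pair(a, b) stands in for a call to a pairwise aligner such as
    TM-align. Every finished pair is stored in the cache immediately, so a
    rerun after an interruption only computes the pairs that are missing.
    """
    for a, b in permutations(structure_ids, 2):
        if (a, b) not in cache:
            cache[(a, b)] = align_pair(a, b)
    return cache
```

With three structures this gives 3 × 2 = 6 pair alignments. If one of them is already cached from an interrupted run, only the remaining five are computed.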
The alignment view
After submitting the data to the server, the rendered and interactive alignment will be shown in an embedded frame.
The job may terminate prematurely due to a timeout.
In this case, the computation can be resumed by reloading the browser page.
The alignment view has a graphical user interface which is independent of the server. It allows:
Changing the order of sequences by dragging with the mouse
Hiding sequences and annotations by dragging them into the trash
Following the GFF syntax, annotations are entered into the text field
in Change > Annotations > Own. The Tab key triggers
word completion for sequence names.
Explicit definition: Annotations can be defined
explicitly in GFF format with a tab or a vertical bar as field separator.
The nine fields are:
Sequence | Source (leave empty) | Name | Start | End | Score | Nucleotide Strand | (leave empty) | Attributes
For example, the following line selects residue 24 of the specified sequence:
seqNameOrNumber|.|Modified residue|24|24|.|.|.|
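To make the nine-field layout concrete, the following Python sketch splits such a line into named fields. It is an illustration only: the function name and the dictionary keys are hypothetical, and the sketch assumes that the separator character does not occur inside a field.

```python
import re

# Hypothetical key names for the nine fields listed above.
FIELD_NAMES = ("sequence", "source", "name", "start", "end",
               "score", "strand", "unused", "attributes")

def parse_annotation_line(line):
    """Split a tab- or bar-separated annotation line into nine named fields."""
    fields = re.split(r"[\t|]", line.rstrip("\n"))
    # Pad missing trailing fields with empty strings.
    fields += [""] * (len(FIELD_NAMES) - len(fields))
    return dict(zip(FIELD_NAMES, fields))
```

Applied to the example line, the result has name "Modified residue", start "24", end "24" and an empty attributes field.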
Colors are written as Red-Green-Blue Hexadecimal triplets and can be set in the field Attributes:
If the amino acid sequence is translated from a coding nucleotide sequence as described above, the positions can refer to the nucleotide sequence by setting
the field Nucleotide Strand to "+". A minus sign, which denotes the reverse complementary strand in GFF, is not supported.
The attribute Hide=true combined with the default style (UNDERLINE) deactivates the residue selection, which can be activated with a check-box.
There are the following enhancements over the standard GFF format:
If Start equals End, i.e. only one position is selected, the field End can be omitted.
Start can contain a complex expression allowing for non-consecutive sequence positions. In this case End must remain empty.
For example 10-20,30-40 selects residues 10 to 20 plus 30 to 40.
As in Rasmol/Jmol, 20:-30: refers to PDB residue numbers 20 to 30. The chain ID after the colon can be omitted.
Lines starting with let are variable assignments. Variables can be used for frequently used text fragments to reduce the amount of typed text.
Example:
let $t=Hello world
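The relaxed Start/End rules listed above can be illustrated with a small Python sketch that expands the two fields into a list of sequence positions. The function name expand_positions is hypothetical, and the sketch deliberately does not handle PDB residue numbers written with a colon (such as 20:-30:) or variables defined with let.

```python
def expand_positions(start, end=""):
    """Expand the Start and End fields into a list of residue positions.

    Handles a plain range (Start and End), a single position (End omitted)
    and a complex Start expression such as "10-20,30-40" (End empty).
    """
    start, end = start.strip(), end.strip()
    if end:
        return list(range(int(start), int(end) + 1))
    positions = []
    for part in start.split(","):
        first, _, last = part.strip().partition("-")
        last = last or first  # single position: End was omitted
        positions.extend(range(int(first), int(last) + 1))
    return positions
```

For instance, expand_positions("24") gives [24], and expand_positions("10-12,30-31") gives [10, 11, 12, 30, 31].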
Annotations can also be defined with script commands, which is, however, more verbose than the GFF notation.
Retrieval from services:
The UniProt sequence features and Catalytic Site Atlas residues are directly stored on the server and will be quickly available.
The BioDAS services, however, are loaded from remote servers, which causes some delay.
The default BioDAS services are cbs_total, netphos and netoglyc.
All are available with the script command DAS_features name, sequences.
The selection of the color is described above.
Limitations: Currently, only UniProt-centered BioDAS annotations are supported.
Scripts
Advanced program features not accessible through the graphical user interface require script commands.
Download/Save
A Zip file can be downloaded. It contains all files required for embedding the alignment in web sites.
It also contains the URL for opening Alignment Annotator and a Strap file for opening the alignment in Strap.