
Forming Web-links for protein files and alignments

This page describes how Web-links for the protein and alignment viewer STRAP are formed using the web variables load or align. Clicking such a Web-link in a web browser opens amino acid sequences, protein structures or multiple sequence files in STRAP. The web variables load=, align= and alignAndRearange= contain one or several protein references. A protein reference may be a URL or a database ID followed by a colon and an entry ID. Unless they are loaded with the variable "load=", proteins are aligned after loading. For this purpose the 3D-superposition program TM-align and the sequence alignment program ClustalW are combined. With "alignAndRearange=" the proteins are reordered according to sequence similarity.

Additional web variables provide further options:

Additional information for the protein entries

The database reference or URL of the protein can optionally be followed by fields separated by "|" (vertical bar). Note: The percent encoding of "|" is %7C. Strictly speaking, "|" should be written as %7C in URLs, but web browsers appear to tolerate a "|" that is not encoded.

Uniprot Examples

Complex Uniprot Expressions

GenomeNet (Kegg) Examples

Entrez Examples

EMBL or Genbank nucleotide example

EMBL and Genbank files have a nucleotide sequence block. Coding sequences (CDS) are defined by an enumeration of nucleotide positions of the form
FT   CDS             join(25240..25717,29079..29174,31348..31417,39382..39809,
or in case of reverse complement
FT   CDS             complement(5226515..5227132)
This expression is used to compute the amino acid sequence. The following exemplifies how this expression can be changed or how the n-th CDS can be selected.
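How a CDS expression leads to an amino acid sequence can be sketched as follows. This is a minimal illustration, not STRAP's own code: the function names, the toy nucleotide string and the restriction to the two forms shown above (join(...) and an outer complement(...)) are assumptions made for the example. Positions are taken to be 1-based and inclusive, and the standard genetic code is used.

```python
import re

# Standard genetic code, codons enumerated in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO))
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def parse_location(location):
    """Split a CDS location into (list of (start, end), is_complement)."""
    location = location.strip()
    is_complement = location.startswith("complement(")
    # Every "start..end" pair is one segment, in the order written.
    segments = [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", location)]
    return segments, is_complement


def extract_cds(nucleotides, location):
    """Concatenate the segments; reverse-complement if requested."""
    segments, is_complement = parse_location(location)
    cds = "".join(nucleotides[start - 1:end] for start, end in segments)
    if is_complement:
        cds = cds.translate(COMPLEMENT)[::-1]
    return cds


def translate(cds):
    """Translate complete codons; unknown codons become 'X'."""
    codons = (cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


toy_sequence = "AAATGCCCTTTGGGTAAC"  # hypothetical nucleotide block
print(translate(extract_cds(toy_sequence, "join(3..5,9..11)")))
```

Changing the numbers inside the expression changes which nucleotides are joined, which is the kind of modification the Web-link fields allow.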

Ensembl (under reconstruction)

PDB Examples

Proteins with nucleic acid:

Setting the biological unit:

The matrices to be applied can be specified in the 5th field as a bit mask given as a hexadecimal number. For example, 8 means the 4th matrix, because the binary representation of 0x8 is 00000001000. Minus 1 denotes the asymmetric unit.
-1     all matrices     1(wrong, not existing)     2     3     4     8     10     20     40     10000(wrong, not existing)    
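The bit-mask convention can be illustrated with a short sketch. The function name is hypothetical, and the code assumes that the lowest bit stands for the 1st matrix, which is consistent with 8 selecting the 4th matrix; it is not taken from STRAP itself.

```python
def matrices_from_mask(field):
    """Decode the hexadecimal bit mask of the 5th field.

    Returns the 1-based numbers of the selected matrices,
    or None for -1 (asymmetric unit, no matrix selected).
    """
    field = field.strip()
    if field == "-1":
        return None
    mask = int(field, 16)  # the field is a hexadecimal number
    return [bit + 1 for bit in range(mask.bit_length()) if (mask >> bit) & 1]


for example in ("-1", "2", "3", "8", "10", "40"):
    print(example, matrices_from_mask(example))
```

Note that 10 is read as hexadecimal 0x10, so it selects the 5th matrix, not the 2nd and 4th.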

Hetero-Compounds, DNA, RNA:

PDB files often contain non-peptide structures such as flavin or NADH, as well as DNA/RNA structures. These are treated in the following way: hetero compounds that share a chain identifier with a peptide are added to the respective peptide object. This is indicated by a vertical green (nucleotide) or red (hetero compound) bar at the protein labels in the alignment row headers. If the hetero compound has a chain of its own, things are more complicated:

SCOP Examples

PFAM Examples

Prodom Examples


Example with direct Web address

Instead of referring to a protein by database, colon and accession ID, the plain Web address of the protein file can be used. Special characters of the URL, such as the two slashes in "http://", must be percent encoded.
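Percent encoding can be done by hand or with a standard library routine. The sketch below uses Python's urllib.parse.quote; the file address is a made-up placeholder, and encoding every reserved character (including the colon) is an assumption that goes slightly beyond the two slashes mentioned above.

```python
from urllib.parse import quote


def encode_reference(address):
    """Percent-encode all reserved characters of a Web address."""
    # safe="" makes quote() encode "/" as well, which it would otherwise keep.
    return quote(address, safe="")


# Hypothetical address of a protein file.
print(encode_reference("http://www.example.org/proteins/1abc.pdb"))
# The vertical bar used as field separator is encoded the same way.
print(encode_reference("|"))
```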

Technical details

The client computer needs Java version 1.5 or higher. The links in this document point to a jnlp file. The jnlp file must be opened by the browser with the program bin/javaws, which is part of the Java system. Occasionally, browsers fail to locate this program. In these cases the user needs to find the location of javaws on the hard disk. See Browser settings.

External applications

Failure to download:

Occasionally, protein entries are removed from databases and are no longer available. What happens if STRAP tries to download a non-existing file or if the server is not responding? The result depends on the server response. Here are some examples of non-existing entries:

Time consuming alignments:

Frame size and location

The size and location of the application frame are specified with the option geometry=width x height + offsetX + offsetY (written without spaces).
  1. geometry=400x300+100+100 This means width=400, height=300, positioned at pixel 100,100.
  2. geometry=400x300+200+100
  3. geometry=500x300+100+100
  4. geometry=500x300-100+100 Negative offsetX refers to the right screen margin.
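The structure of the geometry value can be made explicit with a small parser. This is only a sketch of the notation used in the examples above; the function name and the returned dictionary keys are illustrative and not part of STRAP.

```python
import re

# width x height, followed by two signed offsets, e.g. 500x300-100+100
GEOMETRY = re.compile(r"^(\d+)x(\d+)([+-]\d+)([+-]\d+)$")


def parse_geometry(value):
    """Split a geometry value into width, height and signed offsets."""
    match = GEOMETRY.match(value)
    if match is None:
        raise ValueError("not a geometry value: " + repr(value))
    width, height, offset_x, offset_y = (int(group) for group in match.groups())
    return {"width": width, "height": height,
            "offset_x": offset_x, "offset_y": offset_y}


print(parse_geometry("400x300+100+100"))
print(parse_geometry("500x300-100+100"))  # negative offset_x: measured from the right margin
```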