Introduction
[Menu-bar>Tutorials] Did you ever wish to compare protein sequences or to identify 3D
structures of proteins that are similar to your protein?
Are you interested in remote phylogenetic relationships to discuss the function of a protein?
Do you want to take advantage of the experiences of the Bioinformatics community to
explore your proteins?
Making sequence alignments of related protein sequences is a common task in biology.
An alignment reveals regions which are highly conserved and therefore likely to be functionally important.
Automatic approaches alone are often not fully satisfactory, and manual
refinement is necessary in most cases, particularly when the sequences
are very dissimilar.
STRAP is a
comfortable and comprehensive tool to edit multiple protein sequence
alignments.
A wide range of functions related to protein sequences and
protein structures are accessible with an intuitive graphical
interface.
STRAP is tightly integrated into your desktop environment and supports
cut and paste, word completion and spell checking.
Drag-and-drop is available for proteins, nucleotide structures and hetero structures.
Context menus for proteins, annotations and files are opened by right-click.
The program may appear complicated at first.
With the help of the integrated tutorials you will learn how to
apply the currently available methods to compare protein sequences
and structures.
The key features are:
- Visualization and manipulation of sequence alignments (up to 1000 sequences)
- Automatic computation of multiple sequence alignments by Clustalw (Fukami-Kobayashi & Saito, 2002) and many other algorithms.
It can combine amino acid sequence alignments and protein structure alignments.
- Loading protein files from public databases
- BLAST searches
- Structure prediction
- 3D-visualization using either PyMol, Rasmol, VMD, JMol or ProteinWorkshop
- 3D-superposition of C-alpha atoms
- Dot-plots
- High quality PDF output by LaTeX/TeXshade
- Project safety by included backup system
- Translation of nucleotide sequences to amino acid sequences
- Residue selections
- Highlighting by regular expression
- Residues adjacent to ligands
- Text annotations of residues including:
- Notes
- Hyperlinks
- 3D-rendering commands
- Optimized performance
- Reduced memory consumption
- Extremely fast loading of proteins
- Cache for computed results such as alignments and Blast-results
What it cannot do:
- Docking
- Gene structure, promoter analysis
- Structure modelling
- Molecular dynamics
- Gene expression
- RNA-3D-structure
Online documentation
Tutorials
Help
Java source code is available in advanced user mode.
Frequently occurring buttons
Setting of program parameters such as the gap penalty
Customization such as addresses for protein files
Detach from main window and display in a separate window

Show/hide
Setting color
Closing a view usually without loss of data
Discard
Report bugs to:
christoph.gille(at)charite.de
Alignment Projects in STRAP
Each alignment project is stored in one folder.
This file directory is specified by the
user at the beginning of the session and cannot be changed during
the session. The complete directory path appears in the title bar
of the STRAP application frame.
Loading protein files: Protein files are loaded into STRAP
by dragging one or more files from the desktop or any other location
into the STRAP-application.
Web pages may contain links to protein databases.
These links can be dragged from the Web browser into STRAP
(see http://www.bioinformatics.org/strap/dragProteinLink.html for details).
Several STRAP-sessions can be opened at the same time and proteins can be dragged from one to the other.
Context menus
Context menus in STRAP:
The term context menu is commonly used for menus which pop up when
right-clicking (right mouse button) an item in a graphical user interface,
offering a list of options which vary depending on the item selected.
On Macintosh computers there is sometimes only one mouse button; the right mouse button can then be simulated by ALT+CTRL+left-click.
Context menus are available for:
- Protein labels which are found in the row header of the alignment panel and in lists of proteins.
- Residue selections which are highlightings of one or several amino acids within a protein.
- Residue annotations which are residue selections with associated text. Sequence features are a type of residue annotations.
- Files
- Alignment panel.
- Rectangular region in the alignment panel which is outlined by marching ants.
- Hetero compound structures which are protein ligands like FAD, NADH, DNA or RNA.
Selecting single items:
List items are selected by left-clicking the respective node.
Single residue selections in the alignment panel can be selected by clicking with the CTRL key.
Selecting more than one item:
Selecting more than one list item requires the SHIFT and CTRL keys.
The CTRL key is located at the lower left of the keyboard and is sometimes labeled STRG;
the SHIFT key is sometimes labeled "UMSCHALT".
By dragging a rectangular region in the alignment, all contained
residue selections are selected. With the SHIFT or CTRL key held, the
union or the intersection, respectively, is formed with the set of selections inside the
rectangular region.
The tool-button

of the tree-panel allows selecting items according to
text matches.
Frequently used menu items can be dragged out of the menu and placed on the desktop.
The tree view is located at the left of the application.
It is usually hidden and can be opened by dragging the vertical divider bar.
The tree contains all loaded proteins and their child objects.
Menu bar of STRAP
Note: The at-sign @ indicates that a menu item is only visible when certain check boxes in the User profile are activated.
Context menu of annotations
Context menu of selections
Context menu of files
User profile
To keep users from being confused by the large number of menu items and options,
only those GUI elements are shown that the user is interested in.
The sequence alignment panel
The central view of the STRAP application is the alignment panel. It
shows the names and sequences of loaded proteins. The user can
add and remove white space, so-called alignment gaps, to
align the sequences.
Usually, only one sequence alignment panel is shown, but additional
alignment views can be opened with
New alignment panel [Menu-bar>View].
In any case only one alignment is present in one STRAP instance.
Nevertheless, STRAP can be run several times in parallel and data
exchange between STRAP sessions is conducted by Drag-n-Drop.
Shading:
Three residue shadings are available: "charge", "hydropathy" and
"chemical" (tool-bar below the alignment).
Alternatively, secondary structure can be highlighted: Helices are
painted red and sheets yellow.
Further GUI controls are in the context-menu.
Editing: Manipulation of the multiple sequence alignment is
performed with the keyboard.
STRAP has many sophisticated keyboard commands.
Please see
Keyboard.
Cursor: The cursor position is highlighted in all views with the following symbol:

.
Usually only the alignment row containing the cursor is changed when
for example a gap is introduced by pressing the space bar.
It is often necessary to add white space to
a number of sequences and not only to one sequence.
The number of adjacent rows edited simultaneously can be set by
hitting "#" and subsequently typing a number.
Stacking proteins: Usually, each row contains exactly
one protein. The proteins can be
dragged up or down in the row
header to change their order. A number of proteins can be
stacked into one single row to be manipulated simultaneously
while only the sequence of one protein is shown:
Stacking proteins into a single line ....
Residue selections are highlighted in the multiple sequence
alignment. When the mouse is over a highlighted residue a Tooltip
with additional information appears. Right-click opens the
context menu and CTRL+left-click selects the selection. It can also be selected by dragging a rectangular region.
Scrolling:
The horizontal
scroll-bar outlines the entire alignment and shows selections and
plots. It can be enlarged with the mouse.
If there are many proteins the vertical scroll-bar is visible.
The horizontal scroll-bar can also be used to scroll vertically.
Try mouse wheel turning with and without SHIFT and CTRL.
Keyboard
Alignment gaps are introduced and removed with the keyboard.
Changes to the amino acid sequence, however, are made with an external
editor ([Menu-bar>Protein>File]).
Cursor Movements
| Arrow keys | Cursor navigation |
| CTRL arrow keys | move to beginning or end |
| ALT + arrow keys | move to next gap |
| HOME | First row |
| END | Last row |
| SHIFT HOME | First column |
| SHIFT END | Last column |
Movement of the view-port
| PAGE-UP / PAGE-DOWN | Scroll up and down |
| SHIFT PAGE-UP | Scroll left |
Deleting gaps
| DELETE | Delete gap right of or under cursor |
| BACKSPACE | Delete gap left of cursor |
| CTRL DELETE | Delete next gap right of cursor |
| CTRL BACKSPACE | Delete next gap left of cursor |
| CTRL SHIFT DELETE | Delete entire white space under cursor |
| CTRL SHIFT BACKSPACE | Delete entire white space left of cursor |
Inserting gaps
| SPACE | Insert gap under cursor |
| INSERT | Insert gaps until next residue in above row is reached. Effectively it copies the gaps from the line above. |
| CTRL INSERT | Remove gaps until previous residue in above row is reached |
| SHIFT INSERT | Insert gaps until next residue in below row is reached |
Perform an operation n times
| 4 2 SPACE | Insert 42 gaps |
| 4 2 DELETE | Delete 42 gaps |
Moving sequence blocks: A group of consecutive residues that are not interrupted by gaps is moved left or right.
| > | Move continuous group of residues right |
| < | Move continuous group of residues left |
Window
| CTRL N | New alignment panel |
| CTRL W | Close alignment panel |
Miscellaneous
| CTRL K | Close protein |
| CTRL * | Display letters in better quality |
| U | Upper Case |
| L | Lower Case |
Font Size
| CTRL "+" | Zoom in |
| CTRL "-" | Zoom out |
Goto position
| 4 2 i | Move cursor to residue index 42 |
| 4 r | Move cursor to 4th row |
| 42 c | Move cursor to column 42 |
| 42 n | Move cursor to PDB number 42 |
The mouse actions follow general conventions:
- Sequence alignment
  - Left-click sets the cursor.
  - Middle-click sets the focus without changing the cursor.
  - Right-click opens a context menu either for the alignment pane or for a residue selection.
  - Dragging creates a mouse selection.
  - Dragging over more than one sequence creates a rectangular region.
    Residue selections and annotations within the rectangle are selected.
    SHIFT forms the union and CTRL the intersection, similar to the program GIMP or the MS-Windows desktop.
  - ALT + dragging scrolls two-dimensionally. Under Unix/Linux also hold SHIFT.
- Row-header
  - Dragging up/down changes the order of proteins.
  - CTRL + left-click selects or un-selects the protein.
  - Right-click opens a context menu for the protein.
- Scroll-bar or alignment pane
  - Wheel scrolls.
  - SHIFT + wheel scrolls vertically.
  - CTRL + wheel zooms.
Sequence groups
| G x | Define all currently selected proteins as group "x". |
| g x | Load proteins of sequence group "x" if not already loaded. Select these proteins. |
| h x | Unselect and hide all proteins of sequence group "x". |
Sequence Groups:
Sequences can be grouped. Groups are designated by a digit or letter.
The names of all proteins of a group are listed in a file in ./annotations/sequenceGroups/.
Alignment gaps
When gaps occur at the same position in all loaded sequences they are usually not displayed.
Consider the following case:
ASGATA  YTG
ATGATG  YTA
ASGGTAG FSG
       ^
This is a common gap which is not displayed. The view shows:
ASGATA YTG
ATGATG YTA
ASGGTAGFSG
If the sequence "ASGGTAGAFSG"
were added to the multiple sequence alignment, the gap would no longer be a common gap and would become visible.
The multiple sequence alignment panel would then show the following:
ASGATA  YTG
ATGATG  YTA
ASGGTAG FSG
ASGGTAGAFSG
This is the reason for the following phenomenon:
When gaps are added to one sequence it might seem that gaps are erased
in all other sequences instead.
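The display rule for common gaps can be illustrated with a small Python sketch. It is not STRAP's own code: the function name hide_common_gaps and the use of a blank as gap character are assumptions made for this example only.

```python
def hide_common_gaps(rows, gap=" "):
    """Return the rows with every column removed that is a gap in all rows."""
    width = max(len(row) for row in rows)
    # Pad short rows so that every row has the same number of columns.
    padded = [row.ljust(width, gap) for row in rows]
    # Keep a column if at least one sequence has a residue there.
    keep = [c for c in range(width) if any(row[c] != gap for row in padded)]
    return ["".join(row[c] for c in keep) for row in padded]


alignment = ["ASGATA  YTG", "ATGATG  YTA", "ASGGTAG FSG"]
for row in hide_common_gaps(alignment):
    print(row)

# With a fourth sequence the column is no longer a common gap and stays visible.
for row in hide_common_gaps(alignment + ["ASGGTAGAFSG"]):
    print(row)
```

The second call returns the rows unchanged, which is why adding a gap to one sequence can look as if gaps had been erased in all the others.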
Open protein files
[Menu-bar>File] The file selector shows the protein files of the current project
directory. They are ordered in different lists according to their file endings.
Protein files can be selected and loaded into STRAP by double
clicking or with the button

. Multiple selections
are possible with the keys
CTRL (same as STRG) and
SHIFT.
Drag-and-drop: The file selector supports drag-and-drop:
Files from the native file browser or desktop can be dragged into the
file selector or into the STRAP alignment panel (see
http://www.bioinformatics.org/strap/dragProteinLink.html).
Alternative directories: In the choice menu at the top
a number of other folders can be chosen. For example a
local mirror of the PDB-file collection may be given.
Flags: The flags g, s, 3 and n which precede the file names indicate
the existence of gaps, residue annotations, 3D coordinate transformations and nucleotide reading
frames, respectively.
Multiple sequence files with the ending .msf or .clustalw contain the aligned sequences of several proteins: Clustalw: NexusFormat.
Customization: The list of file endings and the list of alternative directories can be altered.
File formats:
Many file formats are recognized: Swissprot, Protein Data Bank, Fasta, Genbank, EMBL.
See
http://www.molecularevolution.org/mbl/resources/fileformats/.
The file text is analyzed and the file type is identified without considering the name of the file.
If no appropriate protein file type is detected then the sequence is formed from the
letters in the file.
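How a file type can be recognized from the text alone is sketched below in Python. This is only an illustration based on the well-known leading records of the formats (">" for Fasta, "LOCUS" for Genbank, an "ID" line for EMBL and Swissprot, "HEADER" or "ATOM" records for PDB). The function names are hypothetical and STRAP's actual detection rules are not described here.

```python
def guess_format(text):
    """Guess a sequence file format from its content, ignoring the file name."""
    lines = [line for line in text.splitlines() if line.strip()]
    first = lines[0] if lines else ""
    if first.startswith(">"):
        return "fasta"
    if first.startswith("LOCUS"):
        return "genbank"
    if first.startswith("ID   "):
        return "embl_or_swissprot"
    if any(line.startswith(("HEADER", "ATOM  ", "HETATM")) for line in lines):
        return "pdb"
    return "plain"


def plain_sequence(text):
    """Fallback for unknown files: form the sequence from the letters only."""
    return "".join(ch for ch in text if ch.isalpha())


print(guess_format(">my_protein\nMALASVLQ"))
print(guess_format("mala 12 svlq"), plain_sequence("mala 12 svlq"))
```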
Drag-and-drop
Actions which are performed when objects are dropped on certain targets.
Dropped objects (columns) and drop targets (rows):
| Drop target | Proteins | Residue selections | Hetero groups | Images |
| Proteins | | Copy selection to protein | Add heteros to protein | Set icon image |
| Residue selections | | | | Set background image |
| 3D-backbone | Show 3D-structure | | Show 3D-structure | |
| Protein selection list | Set selected proteins | | | |
| Alignment panel | Load or un-hide protein | Copy selection to protein | Add heteros to protein | |
| Tree-node "Hidden" | Hide proteins | | | |
| Desktop or file browser | Copy protein file | | Write PDB-file | |
Objects are dragged with the mouse and dropped on a target (drag-and-drop). The effect depends on the objects that are dragged and the
destination where they are dropped. On Windows and Linux the presumed operation is shown at the bottom of the STRAP frame.
The page
Dragging Web-links contains examples of protein and alignment Web links that can be dragged.
Translate Genbank nucleotide files
[Menu-bar>Protein>Nucleotide sequence]
1 AAACATGGCG CTGGCTAGCG TGTTGCAGCG ...
51 ACGGGTTTTT TGGGCTCGGA GGTGGTGCAG ...
101 GGGAGTCCTG GTGATGGGCT GAGCCTAGCC ...
...
The Genbank and EMBL file formats are widely used file
formats for annotated nucleotide sequences (format specifications:
http://www.ncbi.nlm.nih.gov/Sitemap/samplerecord.html and
ftp://ftp.ncbi.nih.gov/genbank/gbrel.txt and
http://www.ebi.ac.uk/embl/Documentation/User_manual/usrman.html).
When a Genbank file is loaded into STRAP the entire nucleotide
sequence will be extracted and appear in the multiple sequence
alignment pane.
These nucleotides can be translated into
amino acids using the CDS-sequence-features within the protein file.
The Genbank file contains fields with the keyword "CDS" (coding sequence).
The text following "CDS" tells the translation direction and the translated nucleotides.
The sequence may be continuous or interrupted by introns.
The following example of a CDS field is from the file NCBI_NT:5757659.
CDS 5..799
/gene="Psmb5"
/codon_start=1
/product="proteasome subunit X"
/protein_id="AAD50536.1"
/db_xref="GI:5757659"
/translation="MALA ...VSVP"
If the Genbank file contains several genes the user can choose one of them.
Sometimes Genbank files already contain the translation in a field
/translation. The sequence computed by STRAP should be identical to
this sequence.
When the project is saved to disk an additional file will be created with
the ending ".dna" coding the translation direction (forward/reverse
complement) and the translated and un-translated positions.
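The translation step described above can be sketched in Python as follows. This is a minimal illustration and not STRAP's implementation: the function names are hypothetical, only the standard genetic code is used, and of the Genbank location syntax only plain ranges, join(...) and an outer complement(...) are handled.

```python
import re

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid ("*" marks a stop).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a nucleotide sequence."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def extract_cds(dna, location):
    """Cut out the coding nucleotides for a location such as "5..799"."""
    dna = dna.upper()
    # Positions in the CDS field are 1-based and inclusive.
    ranges = re.findall(r"(\d+)\.\.(\d+)", location)
    coding = "".join(dna[int(first) - 1:int(last)] for first, last in ranges)
    if location.strip().startswith("complement"):
        coding = reverse_complement(coding)
    return coding


def translate_cds(dna, location, codon_start=1):
    """Translate the CDS up to the first stop codon."""
    coding = extract_cds(dna, location)[codon_start - 1:]
    protein = []
    for pos in range(0, len(coding) - 2, 3):
        amino = CODON_TABLE.get(coding[pos:pos + 3], "X")  # "X": ambiguous codon
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# First 20 nucleotides of the example file shown at the top of this section.
print(translate_cds("AAACATGGCGCTGGCTAGCG", "5..19"))  # MALAS
```

The result starts with MALA, in agreement with the /translation field of the CDS example.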
3D-backbone view of proteins
The simple 3D-viewer visualizes the C-alpha trace of one or several proteins as a polygon.
Helices may be drawn in red and sheets in yellow.
Mouse Actions:
- CTRL+drag or mouse-wheel: rotate in screen pane
- CTRL+mouse wheel or CTRL-plus: zoom. (This is consistent with office programs).
- clicking a C-alpha atom: moves the alignment cursor to the clicked residue
- CTRL + clicking: toggles selection
- SHIFT + clicking: selects range
On clicking

the model is viewed in the original coordinates as recorded in the
protein file.
Residue Selections:
- Amino acids can be selected and un-selected by clicking an amino acid in the 3D-structure with SHIFT or CTRL.
To delete the selection the selection-menu in the tool-panel is used.
The tool-panel is shown when the check-box
is activated.
- Amino acid selections generated in other parts of the program are usually also highlighted in 3D.
Color-chooser:
The color chooser is shown if

is activated.
Four color-schemes are available.
- Each chain in a different color (striped box)
- Secondary structure: helices red, sheets yellow (yellow-red box)
- Entire protein in one color (monochrome boxes)
- Protein entirely hidden (black box)
With the CTRL key the color of all proteins can be changed at once.
Adding proteins to the 3D-view:
Proteins are added by Drag_and_drop or with the context menu.
Context menu (right-click)
- By default, the cursor column is highlighted
even when the cursor is in a different protein.
This can be changed.
- Several proteins can be loaded into one view.
Advanced 3D-viewer Pymol
The Pymol-button at the top of the 3D-backbone view loads
the proteins into the
protein viewer Pymol. Pymol is a widely used
3D-viewer. It is stable, fast and has many functions which are
accessible via a graphical user interface and a command line
interface. If STRAP fails to start Pymol, the debug window might be helpful.
To obtain this debug window, press the "Pymol"
button while holding the CTRL key.
To remove a protein from an external protein viewer press
[Menu-bar>Protein>3D] .
Identifying similar proteins
The fold of a protein is much more conserved in evolution than the
amino acid sequence.
Therefore it is possible to identify remote homologs for a given protein by looking for structures
with the same fold,
regarding only the C-alpha atom positions.
For this purpose four Web-servers are contacted:
- GangstaPlus (PUBMED:17118190)
- CE_CL (PUBMED:11125099, PUBMED:9796821)
- Dali (PUBMED:8578593)
- NCBI-VAST (PUBMED:8804824)
For more details see
Structural neighbors ...
Protein viewer Pymol
PyMOL (http://pymol.sourceforge.net/) is an excellent and widely used protein
structure viewer. It was developed by Warren DeLano, and
many tools and scripts have been contributed by the scientific community.
Residue selections: Dragging the mouse over a part of
the amino acid sequence in the STRAP alignment creates a
residue selection in Pymol.
The STRAP cursor position refers to another Pymol selection.
Users can refer to these selections to show [S] or hide [H]
representations of residues like sphere, cartoon or stick.
For example to show all atoms of the amino acid at the current cursor position as spheres
open the corresponding [S] menu of the selection "cursor" and select "spheres".
Picking: Picking an atom in Pymol moves the
alignment cursor to the respective residue in the alignment.
This can be switched off: [Menu-bar>Options>Proteins]
Removing a protein from Pymol: If a protein
is removed within Pymol, STRAP does not notice this.
Therefore it is recommended to use the STRAP user interface and not the Pymol user interface to remove proteins.
There are two possibilities:
- The STRAP menu item [Menu-bar>Protein>3D]
- The STRAP object tree which is visible at the left of the STRAP application frame in advanced user mode.
Pymol commands:
Pymol commands (see
http://pymol.sourceforge.net/newman/ref/S1000comref.html) are typed directly into the Pymol-window.
Most important commands:
- reset
Zooms the window and clipping planes to cover all objects.
- ray
Creates a ray-traced image of the current frame.
- png my_file.png
Writes an image file.
- bg white
White background.
- delete myProtein
Deletes an object. Wildcards are understood.
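Instead of typing the commands one by one, they can be collected in a PyMOL script file (ending .pml). The following Python sketch writes such a file using only commands from the list above. The function name and the file names are hypothetical.

```python
def write_render_script(path, image_file, objects_to_delete=()):
    """Write a PyMOL command script that renders the scene to an image."""
    commands = [f"delete {name}" for name in objects_to_delete]
    # reset the view, ray-trace the current frame, then save the image
    commands += ["reset", "ray", f"png {image_file}"]
    with open(path, "w") as handle:
        handle.write("\n".join(commands) + "\n")
    return commands
```

For example, write_render_script("render.pml", "my_file.png", ["myProtein"]) creates a file with four lines. Typing @render.pml in the Pymol-window should then run it.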
Related links: http://www.rubor.de/bioinf/
http://www.rubor.de/anlagen/PyMOL_Tutorial.pdf
Residue Selections
Residue selections are objects attached to proteins which select one or more amino acid positions from the amino acid sequence.
They usually have color, style attributes and a balloon message. They may be shown in
the alignment (1D), in the built-in 3D wire-frame (3D)
and in the scroll-bar view-port (VP). Selections are created in different ways:
- Mouse selection: Dragging the mouse in the alignment pane selects a continuous chain of residues which may be pasted e.g. into a BLAST web form.
- The dialog Highlighting sequence patterns ....
All occurrences of the patterns will be highlighted in the selected color.
- 3D-Visualizations: Clicking a 3D-backbone atom creates selections.
- Some dialogs create selection objects:
- Dotplot ....
- Identify residues with certain structural features ....
- User extension: Users familiar with
Java-Programming may implement their own algorithms. See Write plugins by yourself ....
- Residue annotations:
Annotations are residue selections that can contain text attributes. They are saved to hard disk.
Simple selections can easily be converted into residue annotations
using
[Residue-selection-contextmenu]
- Sequence features:
Sequence features like glycosylation are residue annotations obtained from remote servers.
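Highlighting sequence patterns amounts to a regular expression search on the amino acid sequence. The Python sketch below shows the idea with the N-glycosylation motif (Asn, any residue except Pro, then Ser or Thr) as the pattern. The function name and the test sequence are made up for this example, and STRAP's own pattern dialog may behave differently.

```python
import re


def find_pattern(sequence, pattern):
    """Return (first, last) 1-based residue indices of every occurrence."""
    # Wrapping the pattern in a lookahead also reports overlapping matches.
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start(1) + 1, m.end(1)) for m in regex.finditer(sequence)]


print(find_pattern("MNGSANPTNNTS", r"N[^P][ST]"))
# [(2, 4), (9, 11), (10, 12)]
```

The index ranges returned here correspond to what a residue selection stores: a set of positions within one protein.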
Annotated residue selections
[Menu-bar>Tutorials] Residue annotations allow the assignment of information to specific
amino acids or nucleotides of proteins.
They are special types of residue selections.
A residue annotation has a list of entries each having a name, some
user typed text and a toggle button.
The user can change these entries and add new entries.
The entries can contain several types of information:
- Rendering commands for 3D-viewers,
- LaTeX-code for the PDF-output,
- Hyperlinks, PUBMED-references, database references,
- Notes, for example literature citations, experimental conditions.
A residue annotation can be created in different ways:
- A residue selection can be created by dragging the mouse over the sequence. Right-click opens the context menu of the residue
selection. With the menu item
[Residue-selection-contextmenu] a
residue annotation can be created that has the same amino acid
positions.
- The alignment cursor is positioned on a residue selection. The
tool-button
below the alignment
pane is used to create a residue annotation.
- The cursor is placed on the residue under consideration and the tool-button
is pressed.
Residue annotations which are highlighted in the alignment can be selected in two different ways: By CTRL left-click or by dragging a rectangular box.
Selected residue annotations are outlined by marching ants.
Sequence Features:
Features like phosphorylation sites and active sites are special
annotations which are retrieved from a computational service and are
therefore not saved to hard disk. Their color cannot be changed.
Background Images
Image icons are a visual aid to recognize proteins and residue
annotations. They are associated with proteins or annotations by drag-and-drop. Supported formats are gif, png, bmp and jpg.
Images in Web pages within the Web browser can only be dragged if they
do not serve as a hyper-link.
Obtaining icons:
Species Icons:
Some protein files have a record of the species name.
These species names can be used to set a suitable icon for the selected proteins:

.
Structure Icons:
To download the 3d-structure thumb-nails from the PDB and use them as icons for the selected proteins,
press this button:

.
To delete the icons of all selected proteins press this button:

.
Mapping of species and icons: A table that maps species to images can be inspected:

To see a list of currently loaded images press

.
If you want files from a particular species to be generally depicted with a
certain icon, you can copy the respective line to the following file:
~/.StrapAlign/speciesIcons.txt .
To deactivate an icon type "NONE" instead of an image file name.
Modifying annotations
To edit a residue annotation, the
item

of the
context menu must be activated. Alternatively, the alignment
cursor can be placed on a selected residue and the tool-button

clicked. The button is
located below the alignment when "annotations" is activated in the

.
The annotation view contains a table. Each row is one annotation
entry.
These entries can be modified and are saved when the alignment is saved.
The fields Name, Group and Positions are unique
and mandatory, whereas the fields
Note,
Remark,

and

may occur zero to several times.
- The most important field is "Positions"
because it defines the indices of the selected residues.
Here are some examples of valid entries:
  - The expression "1,3,4,6,101-103,110-112"
    selects the residue indices 1,3,4,6,101,102,103,110,111,112.
  - The expression "+2 1,3,4"
    selects the residues 3, 5 and 6 because the preceding +2 adds an offset of 2 to all positions. The "+"-sign is the first character in this expression.
    For the user interface the first residue has the index "1" whereas internally STRAP starts counting at zero.
    Negative offsets are achieved with a "-" rather than a "+".
  - The expression "1:G,3:G,4:G"
    selects the residues with the PDB numbers 1, 3 and 4 of the chain G. Instead of the chain identifier G you might enter an asterisk as a wild card.
When the toggle button is pressed (default state) the indices refer to amino acid positions.
Otherwise the positions indicate nucleotides in case the protein is translated from a nucleotide sequence.
- Name Each selection has a name. Un-checking the check box deactivates the residue annotation.
- Group Several selections may be bundled in one group e.g. "active site"
: TeXshade is a LaTeX-extension for
exporting an alignment as PDF or PostScript. This field is for TeXshade rendering commands. Active when the check-box is
selected.
,
and
:
Rasmol, VMD, Pymol and Jmol are protein structure viewers. 3D-rendering commands can be typed.
On pressing the icon-buttons the commands are sent to the viewers.
- Note, Remark: Free text can be typed here.
Clicking on the hyperlink icon opens the specified URLs in the Web-browser.
There are two ways to specify an URL:
- By typing an URL directly
- By using a database prefix and a key as for example PDB:1ryp, PUBMED:0815. The list of databases can be changed if the CTRL key is held while the mouse is clicked.
Note: free text
Remark:free text
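The index-based forms of the "Positions" field described above (comma separated indices, ranges and an optional leading offset) can be expanded as in the following Python sketch. It illustrates the syntax only and is not STRAP's parser. The function name is hypothetical, and the PDB-number form with chain identifiers such as "1:G" is not handled.

```python
def parse_positions(expression):
    """Expand a positions expression into a list of 1-based residue indices."""
    expression = expression.strip()
    offset = 0
    # An optional offset such as "+2" or "-1" is the first token.
    if expression and expression[0] in "+-":
        head, _, expression = expression.partition(" ")
        offset = int(head)
    indices = []
    for token in expression.split(","):
        token = token.strip()
        if not token:
            continue
        first, dash, last = token.partition("-")
        if dash:  # a range such as "101-103", both ends included
            indices.extend(range(int(first) + offset, int(last) + offset + 1))
        else:
            indices.append(int(first) + offset)
    return indices


print(parse_positions("1,3,4,6,101-103,110-112"))
print(parse_positions("+2 1,3,4"))  # [3, 5, 6]
```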
Export proteins
[Menu-bar>File]
Dragging proteins with the mouse
Proteins can be dragged from the alignment to the Desktop or the file browser.
This requires dragging a protein label such as in the row header of
the alignment with the mouse.
Proteins can be dropped in other Desktop applications that support Drag_and_drop.
They may also be dropped into another STRAP instance.
Usually, the protein file, i.e. the file that the protein was loaded from, is transferred.
In some cases, however, the user might want a different or modified protein file instead of this original file:
A panel with several export options is available. The button appears at the bottom right when the user attempts to drag out a protein.
Using the export dialog (file-menu)
Single protein files can be exported in various output formats.
The output format is selected using a combo box.
Export PDF
[Menu-bar>File] Multiple sequence alignments can be exported to PDF and PostScript with the
LaTeX extension TeXshade written by Eric Beitz (see LaTeX).
For simple alignment output,
Export alignment ... is easier to use.
Applying TeXshade requires some basic knowledge of typesetting.
Novices may have a look at the tiny introduction
* .
Installation
A LaTeX system is required.
LaTeX provides the two shell commands "latex" and "pdflatex".
Windows users may use the LaTeX in Cygwin or Miktex.
Macintosh users may install Texshop.
Exporting the sequence alignment to PDF
The main card of the TeXshade-Dialog has three buttons

,

and

which need to be pressed one after the other.
On pressing

two files are generated: a
LaTeX-file and a multiple sequence file.
As soon as both files exist the button

becomes active.
When it is pressed the program
pdflatex starts to work on these two files.
The result is a PDF-document which will be displayed
when the button

is pressed.
Including the Alignment in Text-processor documents
PDF graphics usually cannot be included in MS-Word or Openoffice documents.
But PostScript figures can. The check-box to toggle PDF and PostScript
becomes accessible by clicking

with the
CTRL key.
STRAP will then use the command latex instead of pdflatex.
The output is a .ps file which needs to be converted to .eps as
described in "How do I convert PostScript to EPS?" on the page
http://www.postscript.org/FAQs/language/ .
Annotations
Residue annotations can contain one or more fields labeled with

.
Those entries usually contain the variables "PROTEIN" and "RESIDUES"
which are replaced by the protein number and the positions, respectively.
A residue annotation can be created using the tool-button

below the sequence alignment
pane.
A detailed description is found on the TeXshade home page.
Plotting: Numeric values computed for each residue by a class implementing
ValueOfResidue can be plotted along the sequence alignment.
Classes implementing
ValueOfResidue have a
getValues-method which returns a numeric value for each residue.
The user chooses a class in the "plotting"-card

.
The plot has four alternative locations: top, bottom, ttop, bbottom.
Fixing LaTeX-errors
As with any programming language, pdflatex stops when the text
contains syntax errors. In this case cryptic messages are written to
stderr and no PDF is produced.
LaTeX reports the line number where the error occurred in the form
l.42, meaning that an error was found at line 42.
Typically the error is an incorrect TeXshade command assigned by the
user to a residue annotation. The most frequent errors are unbalanced
parentheses or characters that have a special syntactical meaning like
underscore or "%".
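To locate the offending lines quickly, the LaTeX messages can be scanned for error markers. In a LaTeX log an error message starts with "!" and the line number follows on a later line beginning with "l." (lower-case L). The Python sketch below collects these pairs. The function name and the sample log text are invented for this illustration.

```python
import re


def latex_error_lines(log_text):
    """Return (line_number, message) pairs for the errors in a LaTeX log."""
    errors = []
    message = None
    for line in log_text.splitlines():
        if line.startswith("!"):
            message = line[1:].strip()  # remember the error text
        else:
            match = re.match(r"l\.(\d+)", line)
            if match and message is not None:
                errors.append((int(match.group(1)), message))
                message = None
    return errors


# Hypothetical excerpt of a log written after a failed pdflatex run.
sample_log = """! Missing $ inserted.
<inserted text>
                $
l.42 my_site
"""
print(latex_error_lines(sample_log))  # [(42, 'Missing $ inserted.')]
```

The message "Missing $ inserted" is what LaTeX typically prints when an underscore appears outside math mode.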
Customization: In the customize dialog you can specify the command for pdflatex and for the pdf viewer.
Memory limitation:
With large alignments the pdflatex-run may terminate with an
... memory exceeded ...-error.
No PDF-output would be generated.
If this happens the LaTeX heap size must be increased.
Miktex users should set pdf_mem_size to a higher value in the file miktex.ini.
For other LaTeX systems the memory settings in texmf.cnf must
be increased (e.g. multiplied by 10) and the program fmtutil
run as root:
# fmtutil --byfmt=pdflatex
Typical locations of
texmf.cnf are /etc/texmf/texmf.cnf, /usr/share/texmf/web2c/texmf.cnf and
/usr/local/teTeX/share/texmf/web2c/texmf.cnf.
Publish alignment as a Web-link
[Menu-bar>File]
With this dialog a Web-Link can be formed which loads the proteins from the public databases into STRAP and displays an alignment.
When the link is clicked on a computer with Java, the alignment will be displayed in STRAP.
Depending on the purpose of the link and on how much information should be transferred,
two different types are available:
- A compact URL
- A Web form
In both types, proteins that are stored in protein databases are
included by database reference. This has the advantage that the most recent
version of the protein file is always loaded and the current state of
sequence features and cross links is available.
Single URL
Since the URL contains the information in a compact form in a single line, not all information is stored.
The advantage is that the URL can be included not only in web-pages, but also in e-mails and office documents.
The URL is generated in two steps:
- The parameter string is written into the first text-field and can be modified by the user.
The following table summarizes the "|"-separated fields.
Vertical-bar-separated fields of protein entries
| No | Description | Example |
| 1 | URL of protein file or database colon ID | EMBL:M57965 |
| 2 | Protein name. Optionally with exclamation mark and residue subset. | b_myosin_heavy_chain |
| 3 | Icon | http://www.ebi.ac.uk/thornton.../duc_temp.gif |
| 4 | Residue selections. Supported 3D-styles: sticks, spheres and dots. | #00FFFF,sticks,16-20,#FF00FF,spheres,40-50 |
| 5 | Coding sequence of a nucleotide sequence | reverse,40-100 |
The 5th field is required only for nucleotide sequence files. It contains either the index of the CDS such as "#1" for the first or "#2" for the second CDS or the
CDS expression directly.
- From the text in the first text-field the web-link is generated using URL encoding and written into the second text-field.
The generated URL acts as a hyperlink and can be tested by clicking. A new STRAP session will be opened in web-mode and an alignment will be loaded with the specified information.
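The two steps can be imitated outside the dialog. The Python sketch below joins the five fields of one protein entry with "|" and applies URL encoding. The function name is hypothetical, leaving unused fields empty is an assumption, and neither the address that precedes the encoded string in the final link nor the way several entries are combined is shown here.

```python
from urllib.parse import quote, unquote


def protein_entry(source, name="", icon="", selections="", cds=""):
    """Join the five fields of one protein entry with vertical bars.

    Unused fields are left empty (an assumption of this sketch).
    """
    return "|".join([source, name, icon, selections, cds])


entry = protein_entry("EMBL:M57965", "b_myosin_heavy_chain",
                      cds="reverse,40-100")
encoded = quote(entry, safe="")   # percent-encode ":", "|" and ","

print(entry)
print(encoded)
print(unquote(encoded) == entry)  # decoding restores the original string
```

Characters such as "|", ":" and "#" are not safe in every context in which a link may be pasted, which is why the parameter string is encoded before it is handed out.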
Web form
Since the web form has no size limitation, the entire information for the alignment can be included.
The drawback is that it can only be included in web-pages, but not in office documents or e-mails.
A minimal html-page including the web link for the selected proteins is generated by pressing a button in the dialog.
For testing, it is loaded into the web browser by pressing a further button.
From this html code the text between the opening and closing <body> tags can be used in any html-page.
An overview of the STRAP scripting commands is given in
http://3d-alignment.eu/strap_script.html.
The following commands are available:
accession_id, add_annotation, add_xref, align, align_bg, below_row, biomolecule, box, cds, close, close_jmol, close_pymol, close_wire, cursor, deiconify, delete, gaps, hide, icon, iconify, jmol, load, new_nucleotide_selection, new_residue_selection, plugin, pymol, remove_xref, rotate_translate, scroll_to, select, seqvista, jalview, spice, set_annotation, superimpose, superimpose_bg, to_row, to_structure_viewer, tree, unhide, unselect, wire
Start STRAP from command line
STRAP starts on clicking strap.jar or strap2.jnlp.
It may also be started from the command line using the command "java" in the
bin/ directory of the Java installation:
java -jar strap.jar
In case of the error message "java: command not found" the complete path should be provided like:
/local/jdk1.5.0/bin/java -jar strap.jar
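When STRAP is launched from a script, the same fallback can be made explicit. The following Python sketch only builds the argument list; the function name is hypothetical and the jar name is the one used in the examples of this section.

```python
import shutil


def strap_command(jar="strap.jar", java=None, files=()):
    """Build the argument list for launching STRAP.

    If no explicit path to java is given, the PATH is searched.
    """
    java = java or shutil.which("java")
    if java is None:
        raise FileNotFoundError("java not found on PATH; pass its full path")
    return [java, "-jar", jar, *files]


print(strap_command(java="/local/jdk1.5.0/bin/java"))
```

The resulting list can be passed to subprocess.run to start the program.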
Options:
- -stderr Lets the standard error stream flow to the
terminal shell in the conventional way rather than redirecting it to
a special STRAP window available from the debug menu of
STRAP.
- -stdout Same for the standard output stream.
- -noSeqres If
residues of a pdb-file are recorded in SEQRES-lines but not in ATOM-lines they
appear in lower case. With this switch the SEQRES-lines are
completely ignored and only the ATOM-lines are read.
Examples with protein files:
java -jar strap.jar a1_Homo.seq a1_Saccharomyces.seq
starts STRAP and loads two protein files.
Example with sequence grouping:
java -jar strap.jar \{ a*.swiss \} \{ b*.swiss \}
loads all files starting with an "a" into one line and all files
starting with "b" into the other line.
(See
Stacking proteins into a single line ...).
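By the time the program sees the arguments, the shell has removed the backslashes and expanded the wildcards, so the argument list contains the bare tokens "{" and "}" around each group of file names. The Python sketch below shows one way such a list could be split into alignment rows; it is an interpretation of the example above, not STRAP's actual parser.

```python
def group_rows(args):
    """Split command-line arguments into alignment rows.

    Arguments between "{" and "}" share one row;
    any other argument gets a row of its own.
    """
    rows, current = [], None
    for arg in args:
        if arg == "{":
            current = []
        elif arg == "}":
            if current is not None:
                rows.append(current)
            current = None
        elif current is not None:
            current.append(arg)
        else:
            rows.append([arg])
    return rows


print(group_rows(["{", "a1.swiss", "a2.swiss", "}", "{", "b1.swiss", "}"]))
```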
Example with grep:
java -jar strap.jar $(grep -l -i '^OS.*human' *.swiss)
loads all Swissprot files with the organism human. The option
-i makes the search case insensitive.
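On systems without grep the same selection can be made with a few lines of Python. This sketch mirrors the grep call above: it returns the files in which at least one line starts with "OS" and contains "human", ignoring case. The function name and defaults are illustrative.

```python
import glob
import re

# Same pattern as the grep call: a line starting with "OS" that mentions "human".
ORGANISM = re.compile(r"^OS.*human", re.IGNORECASE | re.MULTILINE)


def files_with_organism(pattern="*.swiss", regex=ORGANISM):
    """Return the files matching the glob whose text matches the regex."""
    matches = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            if regex.search(handle.read()):
                matches.append(path)
    return matches


print(files_with_organism())
```

The returned file names can then be appended to the java command line.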
Example with pdb chain identifier:
java -jar strap.jar pdb1ryp.ent:A
loads only chain A of the proteasome.
Example with pdb chain identifier using underscore:
java -jar strap.jar pdb1ryp_A.ent
loads only chain A of the proteasome.
Example with residue subset:
java -jar strap.jar 'pdb1ryp.ent!20-30,50-66'
loads only the given residue ranges of the proteasome.
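The chain and residue-subset suffixes shown above follow a simple pattern that can be taken apart as follows. This Python sketch is an assumption about the syntax based only on these examples: it handles the ":" and "!" forms, not the underscore form, and it would misread Windows paths with a drive letter such as "C:\...".

```python
def parse_protein_spec(spec):
    """Split 'file[:chain][!ranges]' into (file, chain, ranges)."""
    ranges = []
    if "!" in spec:
        spec, subset = spec.split("!", 1)
        for part in subset.split(","):
            first, _, last = part.partition("-")
            # A single number is treated as a one-residue range.
            ranges.append((int(first), int(last or first)))
    file_name, _, chain = spec.partition(":")
    return file_name, chain or None, ranges


print(parse_protein_spec("pdb1ryp.ent!20-30,50-66"))
print(parse_protein_spec("pdb1ryp.ent:A"))
```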
Example with the at-sign:
java -jar strap.jar @list
loads all files listed in the file
list.
The text file "proteins.list" contains the previously loaded files.
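The effect of the at-sign can be pictured as a simple expansion of the argument list. The Python sketch below assumes that the list file holds one file name per line and that blank lines are ignored; the exact format accepted by STRAP is not specified here.

```python
def expand_at_files(args):
    """Replace each '@name' argument by the entries listed in the file 'name'."""
    expanded = []
    for arg in args:
        if arg.startswith("@"):
            with open(arg[1:], encoding="utf-8") as handle:
                expanded.extend(line.strip() for line in handle if line.strip())
        else:
            expanded.append(arg)
    return expanded
```

Calling expand_at_files(["@list"]) would return the names stored in the file list, in the order in which they appear.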
Example with PDB-id:
java -jar strap.jar -pdb=1sbc
loads the entry PDB:1sbc from the PDB server.
Exceptions and the streams stdout and stderr
The standard streams stdout and stderr are
pre-connected output channels of a computer program.
In STRAP, stdout and stderr are collected in files residing in
"./strapTmp" unless STRAP was started with
the command line parameters "-stdout" or "-stderr".
They can be viewed in the menu "errors" of the option menu.
Alternatively, the program ControlPanel in
/local/java/jdk1.6.0_13/jre/bin/ can be used to switch on the
so-called Java console.
In case of runtime errors like division by zero
the Java machine throws so-called
exceptions, which are reported in
stderr.
STRAP on Windows
Windows32 is the world's most widely used operating system.
Windows differs from conventional operating systems with respect to
- Command line syntax
- File separator character "\". On Unix-like systems backslash is an escape character and not a file separator character.
- Path separator character ";"
- File system:
  - File and folder names: case insensitive, some names not possible because reserved for MS-DOS
  - Language-dependent directory tree
  - Missing hard links
  - Limitation of file path length to 260 characters. See Path (computing).
  - Unusual characters in file and directory names like space and umlauts
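Several of these pitfalls can be detected before a file is passed to an external program. The Python sketch below flags over-long paths, white space and non-ASCII characters. The function name and warning texts are illustrative, and the length test assumes that the 260-character limit includes a terminating character, so that 260 visible characters are already too many.

```python
MAX_PATH = 260  # limit quoted in the list above


def path_warnings(path):
    """Return warnings for a path that may cause trouble on Windows."""
    warnings = []
    if len(path) >= MAX_PATH:
        warnings.append("path too long")
    if any(ch.isspace() for ch in path):
        warnings.append("contains white space")
    if not path.isascii():
        warnings.append("contains non-ASCII characters")
    return warnings


print(path_warnings("C:\\Users\\Jörg Müller\\a.pdb"))
print(path_warnings("C:\\data\\a.pdb"))
```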
Always-On-Top attribute of a window: This is one of the
essential features the Windows desktop is still lacking. Fortunately
there are free and shareware programs that add the Stay-on-Top
attribute to windows on the desktop: AlwaysOnTopMaker (recommended, because it is very small), Acer-GridVista (free) or WindowSpace.
Some important Windows tips for STRAP users
- If STRAP cannot download files because of Web-proxy problems it might help to webstart STRAP with IE.
- The Explorer option "suppress known file extensions" should be turned off for the entire system.
With the default Explorer setting some problems remain unnoticed:
Windows sometimes appends the suffix ".txt" to file names which already have a file extension,
resulting in endings like ".txt.txt" or ".pdb.txt".
Files with the ending ".jar" are sometimes arbitrarily renamed and get the ending ".zip".
- The editor Notepad does not cope with the line endings of some protein files. Use Notepad++ or Wordpad instead.
- Certain file names are reserved by the system such as "aux.pdb", "con.swissprot", "prn.fasta", "icon" and "icons".
Their use leads to loss of data.
- Some Bioinformatics programs run only on English but
not on German Windows installations because the file directory tree differs.
Sometimes it helps to set the English language settings for the decimal separator.
- Some Bioinformatics programs expect slash as a file separator
rather than the back-slash which is the native separator on MS-Windows.
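Two of these tips lend themselves to a programmatic check. The Python sketch below tests a file name against the Windows device names (CON, PRN, AUX, NUL, COM1 to COM9, LPT1 to LPT9) plus the two extra names quoted in the tip above, and rewrites a Windows path with forward slashes. The function names are illustrative.

```python
from pathlib import PureWindowsPath

# Device names reserved by Windows, regardless of letter case or extension.
RESERVED = {"CON", "PRN", "AUX", "NUL"}
RESERVED |= {f"COM{i}" for i in range(1, 10)} | {f"LPT{i}" for i in range(1, 10)}
# Additional names taken from the tip in the text above.
RESERVED |= {"ICON", "ICONS"}


def is_reserved(file_name):
    """True if the part of the name before the first dot is reserved."""
    stem = PureWindowsPath(file_name).name.split(".", 1)[0]
    return stem.upper() in RESERVED


def with_forward_slashes(path):
    """Rewrite a Windows path with '/' as the separator."""
    return PureWindowsPath(path).as_posix()


print(is_reserved("aux.pdb"), is_reserved("auxin.pdb"))
print(with_forward_slashes("C:\\data\\a1_Homo.seq"))
```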
Cygwin:
STRAP uses the GNU compiler in Cygwin to
install Bioinformatics programs if no other is specified.
The directory for program settings:
Dot files (files whose names start with a dot) are commonly used to store settings of application programs.
Since there had been problems on some Windows versions with file names starting with a dot, the dot is omitted.
Consequently the settings directory is "%HOMEPATH%\StrapAlign\" rather than "~/.StrapAlign/".
But if %HOMEPATH% contains white space then "C:\StrapAlign\" is used instead.
An alternative directory can be provided in the file %HOMEPATH%\.location_of_StrapAlign.ini.
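The rules for choosing the settings directory can be summarized in a few lines. In this Python sketch the function and parameter names are hypothetical; "override" stands for a directory obtained from .location_of_StrapAlign.ini, whose file format is not described here.

```python
def settings_dir(homepath, windows=True, override=None):
    """Pick the settings directory following the rules described above."""
    if override:                               # alternative directory given by the user
        return override
    if not windows:
        return "~/.StrapAlign/"
    if any(ch.isspace() for ch in homepath):   # white space in %HOMEPATH%
        return "C:\\StrapAlign\\"
    return homepath.rstrip("\\") + "\\StrapAlign\\"


print(settings_dir("\\Users\\anna"))
print(settings_dir("\\Documents and Settings\\anna"))
```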
Macintosh
The computer mouse of classical Macs had only one mouse button which is ergonomically better.
The right mouse button can be simulated in STRAP by holding ALT and CTRL while clicking.
Classical software installation of additional
Bioinformatics tools requires compilers for C++ and
Fortran. Unfortunately the compilers are not present by default and must
be installed from the DVD.