PACKAGE:charite.christo. PACKAGE:charite.christo.strap.

Introduction

Visualizing residue positions in 3D structures is an important step towards interpreting mutations or polymorphisms in terms of protein function, interaction and thermodynamic stability. It helps to identify genotype-phenotype correlations. For an increasing number of the proteins that are analyzed and screened for mutations, three-dimensional structures can be found in the PDB database. Selecting and highlighting large numbers of residue positions in a protein structure by hand is time-consuming, tedious and error-prone. STRAP facilitates the mapping of mutations onto three-dimensional protein structures. This tutorial explains how all known point mutations of the human cardiac beta WIKI:myosin are visualized on the 3D structure of a homologous protein. Mutations in this gene usually cause WIKI:Hypertrophic_cardiomyopathy. Only very few mutations produce a completely different heart disease, WIKI:Dilated_cardiomyopathy (see also publication PUBMED:16322575). With the techniques described here it is possible to compare the 3D locations of both groups of mutations.

Overview of the program modules used in this tutorial

Several STRAP program modules are involved:
  1. If the designations of the mutations refer to sites in genomic DNA, STRAP translates the sequence file from DNA to amino acids: DIALOG:DialogGenbank. The reading orientation and the intron/exon boundaries can be altered manually: DIALOG:EditDnaParentPane
  2. If the structure of the protein of interest is not yet available, a related protein can frequently be found in the structure databases using DIALOG:DialogBlast. In this case the alignment of both proteins becomes the crucial part of the analysis: DIALOG:DialogAlign
  3. The mutations found in public mutation databases or identified in your own lab are imported as a text list of mutation designations: DIALOG:DialogResidueAnnotationList
  4. The structure is viewed in Astex or another 3D-viewer: DIALOG:Dialog3DViewer

Tutorial Start

Open the BUTTON:UserProfile#newButton()! and activate the check-box "annotations". To download human beta cardiac myosin as the Genbank nucleotide file M57965, click M57965 in the following link: NCBI_NT:M57965. Alternatively, you may download M57965 from EMBL: EMBL:M57965. If this tutorial did not provide links for M57965, the following dialog could be used to fetch protein files: BUTTON:DialogFetchSRS!.
Exercise:
  1. View the protein file and find the CDS-text string describing the positions of exons. Use BUTTON:ProteinPopup#docuSelectedEditProtein()!

Translate the Genbank or Embl file

The file M57965 contains a nucleotide sequence which is translated into amino acids using the CDS-expression join(5721..5921,6225..6368,6657..6813,6945.. ) contained in this file. If there is only one gene record in the file then this gene is taken and converted into amino acids automatically upon loading. If there are two or more genes, the dialog in the WIKI:Context_menu BUTTON:DialogGenbank! is used. After translation into amino acids, the gene structure can be inspected (and changed) with the tool-button "DNA sequence".
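
The way a CDS expression defines the coding sequence can be sketched in a few lines of Python. This is only an illustration, not STRAP's own implementation: the function names are hypothetical, only simple forward-strand intervals of the form a..b are handled, the codon table is a small excerpt, and the translated sequence is a toy example rather than M57965. Incomplete intervals, such as the truncated last one quoted above, are ignored.

```python
import re

def parse_join(expression):
    """Return the complete (start, end) exon intervals of a join(...) expression.

    Positions are 1-based and inclusive, as in GenBank/EMBL feature tables.
    Only simple 'a..b' intervals are recognized; anything else
    (for example a truncated interval) is ignored.
    """
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\.\.(\d+)", expression)]

def splice(dna, exons):
    """Concatenate the exon intervals of a genomic sequence into the coding sequence."""
    return "".join(dna[start - 1:end] for start, end in exons)

# Small excerpt of the standard genetic code, enough for the toy example below.
CODONS = {"ATG": "M", "GCC": "A", "AAA": "K", "TGG": "W", "TAA": "*"}

def translate(cds):
    """Translate a coding sequence codon by codon; codons not in the table become 'X'."""
    return "".join(CODONS.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3))

# The three complete intervals quoted in the text above.
exons = parse_join("join(5721..5921,6225..6368,6657..6813)")
exon_lengths = [end - start + 1 for start, end in exons]

# Toy genomic sequence (not M57965): two exons separated by a 10-nucleotide intron.
toy_dna = "ATGGCC" + "gtaagtttag" + "AAATGGTAA"
toy_protein = translate(splice(toy_dna, parse_join("join(1..6,17..25)")))
```

Changing an exon boundary by a single nucleotide shifts the reading frame of everything downstream, which is why the gene structure should be checked after translation.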

Import the mutations or SNPs

Sometimes mutation data is recorded in the protein file itself. It can then be imported and visualized on the protein with a few clicks. Example:
P51587:FT   VARIANT    3118   3118       M -> T (in BC).
P51587:FT                                /FTId=VAR_005110.
    
With the tool-button BUTTON:StrapView#button(BUT_FEATURE)! the sequence feature dialog can be opened. In this dialog all sequence features recorded in the protein file can be imported.
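
To make the structure of such a feature line explicit, here is a minimal Python sketch that extracts the position and the amino acid exchange from a VARIANT line like the one shown above. The regular expression and the function name are assumptions made for this illustration; they cover only the simple "X -> Y" form of the example and are not the parser used by STRAP.

```python
import re

# Matches lines such as: FT   VARIANT    3118   3118       M -> T (in BC).
VARIANT = re.compile(
    r"FT\s+VARIANT\s+(\d+)\s+(\d+)\s+(\w+)\s*->\s*(\w+)\s*(?:\((.*)\))?"
)

def parse_variant(line):
    """Return a dict describing a VARIANT feature line, or None if the line is not one."""
    match = VARIANT.search(line)
    if match is None:
        return None
    start, end, original, variant, note = match.groups()
    return {
        "start": int(start),
        "end": int(end),
        "original": original,
        "variant": variant,
        "note": note,
    }

lines = [
    "P51587:FT   VARIANT    3118   3118       M -> T (in BC).",
    "P51587:FT                                /FTId=VAR_005110.",
]
parsed = [parse_variant(line) for line in lines]
```

The first line yields position 3118 with the exchange M to T; the continuation line carries only the feature identifier and is skipped.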

For mutations not recorded in the sequence file, a list of mutations must be found on the Web, for example with GOOGLE{ mutations cardiac myosin }. Open the dialog BUTTON:DialogResidueAnnotationList! and select the protein M57965 in the dialog. Paste the following list of mutations into the multi-line text field of the dialog (remember the short-cuts CTRL-C and CTRL-V for copy and paste, respectively). Then press LABEL:ChButton#BUTTON_GO to generate a residue annotation in M57965 for each mutation:
  R054X V059I T124I R143Q Y162C N187K R190T Q222K N232S
  F244L K246Q R249Q G256E I263T A326P K383N L390V
  R403L R403Q R403W V406M R453C R453H R453L E483K Q499K 
  F513C G584R D587V L601V N602S V606M K615N 
  R663C R663H R694C N696S R712L G716R R719Q R719W R723C R723G P731L 
  I736M G741A G741R G741W D778G S782N A797T E846K E846Q R869C 
  R870C R870H M877K L908V E924K E930K E930del E935K E949K L961R
  F764L S532P
    
These mutations were obtained from http://genetics.med.harvard.edu/~seidman/cg3/ and http://www.angis.org.au/pbin/Databases/Heart/fhcquery.cgi. Further collections of WIKI:Mutations and WIKI:Single_nucleotide_polymorphisms (SNPs) can be found with Web search engines using search terms like "mutation database".

The generated residue annotations have a WIKI:Context_menu (right-click) and a WIKI:Balloon_help. When LABEL:ResidueSelectionPopup#ACTION_edit in the context menu is clicked, the residue annotation is shown as a table. The table has one row for the annotation name, one row for the annotation group and one row for the sequence position. Additional rows can be created to store further information.
Figure: Annotated selection of amino acids is shown as a table.
JCOMPONENT:ResidueAnnotationView#docuView()
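
The mutation designations pasted into the dialog above follow a simple pattern: the original amino acid as a one-letter code, the sequence position, and then either the new one-letter code or "del". The following Python sketch shows how such a list can be split into its parts and grouped by position. The pattern and the function names are assumptions derived from the list in this tutorial, not STRAP's own parser; the symbol after the position is kept as written and not interpreted.

```python
import re
from collections import defaultdict

# One letter, a position, then one letter or "del" (e.g. R403Q, R054X, E930del).
DESIGNATION = re.compile(r"^([A-Z])(\d+)([A-Z]|del)$")

def parse_designations(text):
    """Split whitespace-separated designations into (original, position, change) tuples."""
    mutations = []
    for token in text.split():
        match = DESIGNATION.match(token)
        if match is None:
            raise ValueError(f"Unrecognized mutation designation: {token!r}")
        mutations.append((match.group(1), int(match.group(2)), match.group(3)))
    return mutations

def group_by_position(mutations):
    """Collect all changes reported for each sequence position."""
    groups = defaultdict(list)
    for _original, position, change in mutations:
        groups[position].append(change)
    return dict(groups)

sample = parse_designations("R054X R403L R403Q R403W E930del")
by_position = group_by_position(sample)
```

Grouping by position shows at a glance which residues, such as R403 in the list above, are affected by several different exchanges.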

Exercise:
  1. Choose one mutation and change the color.
  2. Edit one mutation and add comments.
  3. Simultaneously select two mutations and change their color. Two mutations can be selected in the alignment with the CTRL-key.
  4. Select several mutations by dragging a rectangular region.

Referring to nucleotide sequence positions

The dialog BUTTON:DialogResidueAnnotationList! can also be used when positions refer to nucleotides instead of amino acids. In this case "nucleotides" must be selected in the choice menu COMBO:DialogResidueAnnotationList#NT_INDEX. Please add the following two mutations:
    C12164T G7799A
    
You might use a different color so that they can be distinguished. Since these mutations are defined by nucleotide positions, they are best seen in the nucleotide view. Select M57965 and click: ITEM:EditDnaParentPane. The entire nucleotide sequence in the file is very large (26689 bp), and navigation with the scrollbar is relatively difficult because the coding exons are short compared to the non-coding introns. The arrow keys allow you to jump from one exon/intron start to the next.
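
How a nucleotide position relates to an amino acid position can be sketched as follows. The Python function below is a hypothetical illustration, not STRAP code. It assumes a forward-strand gene whose coding sequence starts at the first listed exon, and it uses only the three complete exon intervals quoted earlier in this tutorial, so positions outside these intervals are reported as not covered.

```python
def genomic_to_codon(position, exons):
    """Map a 1-based genomic position to (amino acid index, position within codon).

    Both returned values are 1-based. Returns None if the position lies
    outside all listed exons (for example in an intron).
    """
    coding_offset = 0  # number of coding nucleotides in the exons already passed
    for start, end in exons:
        if start <= position <= end:
            index = coding_offset + (position - start)  # 0-based index in the coding sequence
            return index // 3 + 1, index % 3 + 1
        coding_offset += end - start + 1
    return None

# The three complete exon intervals quoted from the CDS expression of M57965.
exons = [(5721, 5921), (6225, 6368), (6657, 6813)]

first_coding_nt = genomic_to_codon(5721, exons)   # first nucleotide of the first codon
last_nt_of_exon1 = genomic_to_codon(5921, exons)  # last nucleotide of the first exon
first_nt_of_exon2 = genomic_to_codon(6225, exons) # codons continue across the intron
intronic = genomic_to_codon(6000, exons)          # between exon 1 and exon 2
```

Because the first exon holds 201 nucleotides, that is 67 complete codons, the second exon starts with the first nucleotide of codon 68.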

View DNA sequence in Alignment Pane

Nucleotides can be shown in the alignment pane simultaneously with amino acids. Select the last item "nucleotide" in JCOMPONENT:StrapView#choiceShading()!. In M57965 each amino acid corresponds to three nucleotides (hence the term triplet). The nucleotides are shown as colored bars: guanine=gray, adenine=green, thymine=red and cytosine=blue.
Exercise:
  1. Identify the sequence position of the mutations C12164T and G7799A in the alignment pane.

Loading the 3D-structure file

First watch STRING:ChConstants#MOVIE_Sequence_Features_in_3D. The goal is to view the mutations in a 3D-structure. Because a three-dimensional structure for the cardiac protein is not yet available, we use the structure of myosin of skeletal muscle instead. The PDB entry 1ALM is a theoretical model of the relative position of the backbones of myosin and five actin monomers. The myosin heavy chain is chain A in 1ALM and the actins are chains V, W, X, Y and Z. The X-ray structure 2MYS, in contrast, does not contain actin but does contain the amino acid side chains. Therefore both have been combined into one file for this tutorial, which can be loaded by clicking JCOMPONENT:Tutorials#bExampleFiles("MUT")!.
Exercise:
  1. Find homologs with known 3D-structure for M57965 in the PDB database. Use BUTTON:DialogBlast!. This is the usual way to find candidate structures for 3D-visualization of mutations.

Aligning myosin from skeletal muscle and heart muscle

In order to correctly copy the mutations from M57965 to 1alm_2mys, both sequences must be aligned. Select both proteins in the row header of the alignment with CTRL + left-click. Then press the tool-button "Align 2 proteins".
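
Why the alignment matters can be illustrated with a short Python sketch that maps residue indices of one sequence onto the other through a gapped pairwise alignment. The function name and the toy sequences are hypothetical and serve only as an illustration; this is not the code STRAP uses. Indices are counted along each sequence starting at 1, which may differ from the residue numbers stored in a structure file.

```python
def map_positions(aligned_source, aligned_target):
    """Map 1-based residue indices of the source onto the target sequence.

    Both arguments are rows of the same pairwise alignment, with '-' for gaps.
    Source residues aligned to a gap have no counterpart and are left out.
    """
    if len(aligned_source) != len(aligned_target):
        raise ValueError("Alignment rows must have the same length")
    mapping = {}
    source_index = target_index = 0
    for source_char, target_char in zip(aligned_source, aligned_target):
        if source_char != "-":
            source_index += 1
        if target_char != "-":
            target_index += 1
        if source_char != "-" and target_char != "-":
            mapping[source_index] = target_index
    return mapping

# Toy alignment (not the myosin sequences).
mapping = map_positions("MKT-AYR", "M-TGAYK")
```

In the toy example, source residue 3 corresponds to target residue 2, and source residue 2 has no counterpart because it faces a gap. A misplaced gap therefore shifts every mutation downstream of it to the wrong residue in the structure.
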
Exercise:
  1. Discuss the degree of similarity of both proteins. Are there regions that are more conserved than others?
  2. What is the chain identifier (capital letter) of the myosin subunit in pdb1alm?

3D-visualization of the backbone

In order to visualize the protein structure in a very simple way, mark all alignment rows with a PDB structure file and open the 3D-wire representation JCOMPONENT:StrapView#button(BUT_WIRE)!.
Exercise: Find out
  1. How the model is rotated and zoomed.
  2. How alignment positions can be related to 3D-positions by clicking into the 3D-model or by walking with the cursor.

3D-visualization in Astex

Astex is an excellent 3D-viewer integrated in STRAP. Load the protein into Astex by clicking the button LABEL:ProteinViewerPopup#BUT_ASTEX above the protein backbone view or using the dialog DIALOG:Dialog3DViewer.

You can drag residue annotations such as mutations directly into Astex, even if they are placed on a different protein. See below for how to transfer mutations.

Transfer the mutations to 3D

Currently the mutations are shown in M57965, but we want to see them in the protein structure file "1alm_2mys_A.ent". First we need to select the residue annotations of M57965. They can be selected with CTRL + left-click. Selected annotations are indicated by WIKI:Marching_ants. For large numbers, clicking each annotation would be tedious; several annotations can instead be selected at once by dragging a rectangular region. Now drag one of the selected residue annotations to the 3D-view.

Using Pymol instead of Astex

If you are used to WIKI:Pymol rather than Astex, you may want to use Pymol. It can be opened in DIALOG:Dialog3DViewer. Pymol will initially not display the actins since pdb1alm is a model containing only C-alpha atoms but no amino acid side chains. To make the C-alpha atoms visible, select "nonbonded" in Pymol. This is done in the Pymol popup menu "[S]" in the right side panel of Pymol.

Since Pymol is not a Java application, it does not act as a drop target, and you cannot drag the mutations directly onto the Pymol panel. Instead, open the object tree panel by dragging the vertical divider at the left margin of STRAP to the right. You will find a Pymol tree node under the respective protein node. This tree node acts as a drop target.
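
As an alternative to drag and drop, residues can also be highlighted by typing commands into Pymol. The Python sketch below only assembles such a command script as text; it is an illustration and not part of STRAP. The function name, the selection name and the example residue numbers are assumptions. The residue numbers must be those of the structure file, that is, positions already transferred from M57965 through the alignment.

```python
def pymol_script(chain, residue_numbers, selection_name="mutations", color="red"):
    """Build Pymol commands that highlight the given residues of one chain.

    'show nonbonded' makes isolated atoms such as C-alpha-only models visible;
    the remaining commands select the residues, show them as spheres and color them.
    """
    resi = "+".join(str(number) for number in sorted(set(residue_numbers)))
    return "\n".join([
        "show nonbonded",
        f"select {selection_name}, chain {chain} and resi {resi}",
        f"show spheres, {selection_name}",
        f"color {color}, {selection_name}",
    ])

# Illustrative residue numbers only; use the numbering of the loaded structure.
script = pymol_script("A", [453, 403, 403])
```

The resulting lines can be pasted into the Pymol command line or saved as a script file and run from Pymol.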

Related resources

Windows users may have a look at TopoSNP, http://gila.bioengr.uic.edu/snp/toposnp/ , PUBMED:14681472. There, the 3D locations of many mutations are shown using the Chime browser plugin. Sarcomere proteins, however, are not contained in TopoSNP.