Introduction
This tutorial shows how two alignments can be combined into one alignment.
As an example, some WIKI:Proteasome subunits are loaded:
BUTTON:Tutorials#bExampleFiles("AA")!
These subunits belong to two different classes: the names of the bacterial subunits start with "hs_" and those of the eukaryotic alpha 2 subunits with "a2_".
You can align all protein sequences using the tool-button BUTTON:StrapView#button(BUT_ALIGN)! or the dialog BUTTON:DialogAlign!.
The sequence alignment algorithms work well for sequences of the same group but fail to align bacterial with eukaryotic subunits because these sequences are very dissimilar.
Aligning all bacterial sequences
Align all bacterial sequences (names starting with "hs_") using
a sequence-based method such as COMBO:MultipleAlignerClustalW.
If the protein names can be recognized by a string pattern (in our
example the names start with "hs_"), the proteins can be selected with the tool-button
ICON:IC_SEARCH.
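Outside the program, the pattern-based selection can be pictured as a simple prefix filter. The following Python sketch is only an illustration and not part of STRAP: the function name select_by_prefix and the names "hs_exampleA" and "a2_exampleB" are made up for this example.

```python
def select_by_prefix(names, prefix):
    """Return the protein names that start with the given prefix."""
    return [name for name in names if name.startswith(prefix)]


# Hypothetical name list; only the two structure files occur in this tutorial.
names = [
    "hs_EscherichiaColi.pdb",
    "hs_exampleA",
    "a2_SaccharomycesCerevisiae.dssp",
    "a2_exampleB",
]

bacterial = select_by_prefix(names, "hs_")
eukaryotic = select_by_prefix(names, "a2_")
```

Here bacterial holds the two "hs_" names and eukaryotic the two "a2_" names, in their original order.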
Stack all bacterial sequences in one line
Activate "Advanced" in BUTTON:UserProfile#newButton()!.
Use the dialog DIALOG:DialogManyInOneRow to place
all aligned bacterial sequences into one row such that hs_EscherichiaColi.pdb stays on top.
This is important because hs_EscherichiaColi.pdb is the one with C-alpha coordinates.
As a result, all bacterial proteins except hs_EscherichiaColi.pdb disappear from view.
The number 4 in the row header indicates the number of proteins in that row.
All gaps inserted into hs_EscherichiaColi.pdb will simultaneously be inserted into all 4 proteins of that row.
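The effect of stacking can be pictured as follows: a gap placed in the visible top sequence is placed at the same column in every hidden sequence, so the alignment inside the block is preserved. This is a minimal Python sketch of that idea, not the program's implementation; the function insert_gap and the toy sequences are assumptions made for illustration.

```python
def insert_gap(block, column, gap="-"):
    """Insert one gap at the same column in every row of an aligned block."""
    return [row[:column] + gap + row[column:] for row in block]


# Hypothetical toy block; the first row plays the role of the visible top sequence.
block = ["MKT-LLV", "MRTALLV", "MKS-LIV"]
shifted = insert_gap(block, 3)
```

All rows of shifted are one character longer than before and still have equal length, so the columns of the block stay aligned with each other.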
Aligning all eukaryotic sequences
Proceed in the same way with the eukaryotic sequences.
Align both blocks three-dimensionally
In the alignment pane only two sequences are currently shown:
hs_EscherichiaColi.pdb and a2_SaccharomycesCerevisiae.dssp.
Though their sequences are very different, their 3D structures look similar, as can be seen by pressing BUTTON:"Superimpose" in
the wire-frame view: JCOMPONENT:StrapView#button(BUT_WIRE)!.
Please align both rows. If you use the alignment dialog, you
should select a 3D-based alignment method such as
COMBO:Superimpose_CEPROXY. The tool-button
BUTTON:StrapView#button(BUT_ALIGN)! is more convenient because it
automatically selects a 3D method if 3D structures are available.
Place each sequence in a row of its own
Separating the stacked proteins is also done in the dialog DIALOG:DialogManyInOneRow.
Select hs_EscherichiaColi.pdb and press the button, then select a2_SaccharomycesCerevisiae.dssp and press the button again.
As a result, you see the entire multiple alignment with all proteins.
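The whole procedure amounts to merging two aligned blocks by aligning one representative of each block and applying the new gaps of each representative to all members of its block. The Python sketch below illustrates this under simplifying assumptions: the functions spread_gaps and merge_blocks and the toy sequences are hypothetical, the realigned top rows are given by hand instead of being computed by a 3D method, and nothing here reflects how STRAP is implemented internally.

```python
GAP = "-"


def spread_gaps(block, new_top):
    """Apply the gaps newly inserted into the top row to every row of the block.

    block   -- aligned rows of equal length; block[0] is the top row
    new_top -- the top row after realignment, i.e. block[0] with extra gaps
    """
    top = block[0]
    rows = [[] for _ in block]
    col = 0  # next unread column of the original block
    for char in new_top:
        if col < len(top) and top[col] == char:
            # Existing column: copy it from every row of the block.
            for out, row in zip(rows, block):
                out.append(row[col])
            col += 1
        elif char == GAP:
            # New gap in the top row: insert a gap into every row.
            for out in rows:
                out.append(GAP)
        else:
            raise ValueError("new_top is not block[0] with gaps inserted")
    if col != len(top):
        raise ValueError("new_top does not cover all columns of the block")
    return ["".join(out) for out in rows]


def merge_blocks(block_a, block_b, new_top_a, new_top_b):
    """Combine two blocks using the pairwise alignment of their top rows."""
    if len(new_top_a) != len(new_top_b):
        raise ValueError("the aligned top rows must have equal length")
    return spread_gaps(block_a, new_top_a) + spread_gaps(block_b, new_top_b)


# Hypothetical toy data: two blocks that are already aligned internally.
block_a = ["MKTLLV", "MRT-LV"]
block_b = ["MKLV", "MRLV"]
# Assumed result of aligning the two top rows with each other.
merged = merge_blocks(block_a, block_b, "MKTLLV", "MK--LV")
```

In this toy case the first block is unchanged and both rows of the second block receive the two gaps that the pairwise alignment placed in its top row, so all four rows end up with the same length.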