PACKAGE:charite.christo.

Introduction

The introductory tutorial demonstrated that alignments are more accurate when 3D structures are used in the computation. This tutorial shows how to obtain 3D structures for a given amino acid sequence. The program WIKI:BLAST is used to identify similar sequences in the PDB database. The protein sequence SWISS:SUBT_BACSU serves as an example.

Retrieving the protein file

First, the protein file "SUBT_BACSU" must be fetched from the Sequence Retrieval System (SRS). Please open the dialog BUTTON:DialogFetchSRS! and type "SUBT_BACSU" into the text area of this dialog. There are several highlight buttons, each referring to a certain string pattern. "SUBT_BACSU" matches the pattern letters/digits, underscore, letters/digits. When the button BUTTON:"SRS" is pressed, the sequence file is fetched from the server. The downloaded file is listed in a new result tab (see figure below).
Figure: Downloaded files appear in the left column.
JCOMPONENT:FilesFetchedFromServer#docuSnapshot()
Activate the file and press BUTTON:FilesFetchedFromServer#BUT_LAB_Load to load the protein into the alignment pane.
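
The string pattern that "SUBT_BACSU" matches can be approximated with a regular expression. The following Python sketch is only an illustration: the expression and the function name are assumptions and are not taken from STRAP, whose highlight buttons may use a different pattern.

```python
import re

# Assumed approximation of "letters/digits, underscore, letters/digits".
# The pattern actually used by the dialog may differ.
ENTRY_NAME = re.compile(r"[A-Za-z0-9]+_[A-Za-z0-9]+")

def looks_like_entry_name(text):
    """Return True if the whole string has the form XXXX_YYYY."""
    return ENTRY_NAME.fullmatch(text) is not None

if __name__ == "__main__":
    for candidate in ("SUBT_BACSU", "SUBTBACSU", "_BACSU"):
        print(candidate, looks_like_entry_name(candidate))
```

Using fullmatch rather than search ensures that strings with extra characters before or after the identifier are rejected.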

Blasting

Open the dialog BUTTON:DialogBlast!. Select the protein "SUBT_BACSU", the BLAST program COMBO:Blaster_ebi_ac_uk and the database COMBO:"pdb ". Start the BLAST search by pressing LABEL:ChButton#BUTTON_GO. After about a minute, a BLAST result like the following appears:
...

> PDB:1ST3_  mol:protein length:269  SUBTILISIN BL

 Score = 850 (304.3), Expect =  1.4E-85
Identities = 59/274, Positives = 78/274

Q 108   QSVPYGISQIKAPALHSQGYTGSNVKVAVIDSGIDSSHPDLNVRGGASFVPSETNPYQDG 167
        QSVP+GIS+++APA H++G TGS VKVAV+D+GI S+HPDLN+RGGASFVP E +  QDG
H 2     QSVPWGISRVQAPAAHNRGLTGSGVKVAVLDTGI-STHPDLNIRGGASFVPGEPST-QDG 59

....

> PDB:1TEC_E  mol:protein length:279  THERMITASE

 Score = 553 (199.7), Expect =  4.1E-54
Identities = 43/280, Positives = 62/280

Q 106   YAQSVPYGISQIKAPALHSQGYTGSNVKVAVIDSGIDSSHPDL--NVRGGASFVPSETNP 163
        Y  S  YG  +I+AP        GS  K+A++D+G+ S+HPDL   V GG  FV +++ P
H 7     YFSSRQYGPQKIQAPQAWDIA-EGSGAKIAIVDTGVQSNHPDLAGKVVGGWDFVDNDSTP 65

...
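
To read such a result outside of STRAP, the hit headers and their Expect values can be picked out line by line. The Python sketch below assumes the layout shown above (a header line starting with ">" followed later by a line containing "Expect ="); the function name and the regular expressions are illustrative assumptions, and other BLAST servers may format their output differently.

```python
import re

# Shortened copy of the two hits shown above.
blast_text = """\
> PDB:1ST3_  mol:protein length:269  SUBTILISIN BL

 Score = 850 (304.3), Expect =  1.4E-85

> PDB:1TEC_E  mol:protein length:279  THERMITASE

 Score = 553 (199.7), Expect =  4.1E-54
"""

HEADER = re.compile(r">\s*(PDB:\S+)")
EXPECT = re.compile(r"Expect\s*=\s*([0-9.]+(?:[eE][+-]?[0-9]+)?)")

def list_hits(text):
    """Return (identifier, e_value) pairs in the order they appear."""
    hits = []
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            continue
        expect = EXPECT.search(line)
        if expect and current is not None:
            hits.append((current, float(expect.group(1))))
            current = None  # only the first Expect value per hit is used
    return hits

if __name__ == "__main__":
    for identifier, e_value in list_hits(blast_text):
        print(identifier, e_value)
```

A smaller Expect value indicates a more significant hit, so in this example the subtilisin entry ranks above thermitase.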
				

Convenient downloading of the PDB files

The expression "PDB:1TEC_E" means chain E of PDB entry pdb1tec. Watch the mouse cursor when you move the mouse over this expression. If the link is clicked to the left of the colon, the PDB entry is shown in the web browser; if it is clicked to the right of the colon, the protein is loaded into STRAP.
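
The structure of such an expression can be made explicit with a few lines of Python. This is a minimal sketch, not STRAP code: the function name is hypothetical, and treating a trailing underscore without a letter (as in "PDB:1ST3_") as "no chain given" is an assumption.

```python
def split_pdb_reference(reference):
    """Split "PDB:1TEC_E" into ("1TEC", "E").

    The chain is returned as "" when nothing follows the underscore
    (assumed meaning: no specific chain is given).
    """
    prefix, _, rest = reference.partition(":")
    if prefix != "PDB" or not rest:
        raise ValueError("not a PDB reference: " + reference)
    entry, _, chain = rest.partition("_")
    return entry, chain

if __name__ == "__main__":
    print(split_pdb_reference("PDB:1TEC_E"))
    print(split_pdb_reference("PDB:1ST3_"))
```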

Downloading with the download dialog

If you want to download a large number of PDB entries that are listed in a text such as a BLAST result, use the dialog BUTTON:DialogFetchPdb!.
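
What "PDB entries listed in a text" amounts to can be illustrated with a short Python sketch that collects every distinct entry mentioned in a piece of text. How the download dialog itself scans the text is not described here; the regular expression (a digit followed by three letters or digits after "PDB:") and the function name are assumptions made for this example.

```python
import re

# Assumed form of an entry ID: one digit followed by three letters/digits.
PDB_REF = re.compile(r"PDB:([0-9][A-Za-z0-9]{3})")

def unique_entries(text):
    """Return the distinct entry IDs in order of first appearance."""
    found = (entry.upper() for entry in PDB_REF.findall(text))
    return list(dict.fromkeys(found))  # dict keys keep insertion order

if __name__ == "__main__":
    sample = "> PDB:1ST3_ ... > PDB:1TEC_E ... > PDB:1ST3_"
    print(unique_entries(sample))
```

Duplicates are dropped so that each entry would be requested only once, even if several chains or repeated hits refer to it.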