[Biophp-dev] PDB: I'll take a break...

Serge Gregorio biophp-dev@bioinformatics.org
Sat, 17 May 2003 03:50:06 +0800


Well, here's the code that's driving me batty...

Regards,

Serge

==============================================

class Protein     // PDB version 
{
var $class;
var $dep_date;
var $id_code;

// OBSLTE group
var $date_rep;         // short for 'date replaced'.
var $new_id_code;      // array of id codes.
// TITLES group
var $titles;           // array of titles.
// CAVEAT group
var $caveats;          // array of caveats.
// COMPND group
var $compounds;       // array of compound entries, each entry is 
                      // a string of this form: 
                      //  MOL_ID: 1; token1: value1; token2: value2; 
// SOURCE group
var $sources;        // array of info about biological sources of
                     // the molecules.
// KEYWDS group
var $keywords;        // array of keywords (strings).
// EXPDTA group
var $expdta;          // array of experimental techniques used to obtain the structure.
// AUTHOR group
var $authors;         // array of authors (strings)
// REVDAT group
var $revdat;          // array of REVISION DATA (2D assoc array).     
var $sprsde;          // array of SUPERSEDED ENTRIES 
var $journal;
var $remark1;
var $remark2;
var $remark3;
var $remark4;
// DBREF group 
var $dbrefs;          // array of database references 
// SEQADV group
var $seqadv;          // array of seqadv records 
// SEQRES group
var $seqres;          // array of SEQUENCE RESIDUE records
// MODRES group
var $modres;	      // array of MODIFICATION OF RESIDUE entries  
// HET group
var $hets;           // array of HETEROGENS (non-standard residues)
// HETNAM group 
var $hetnams;        // array of HETEROGEN NAMES 
// HETSYN group
var $hetsyns;        // array of SYNONYMS for HETEROGENS 
// FORMUL group
var $het_formulas;   // array of (CHEMICAL) FORMULAS FOR
                     // HETEROGENS
// HELIX group
var $helix;         // array of HELICES (associative array).
// SHEET group  
var $sheets;        // array of SHEETS (secondary structures) stored 
                    // as assoc array.
// TURN group
var $turns;         // array of TURNS (2ndary structures) 
// SSBOND group
var $ssbonds;        // array of disulfide bonds in protein and polypeptide structures.
// LINK group
var $links;          // array of links (between residues).
// HYDBND group
var $hydbnds;        // array of hydrogen bonds.
// SLTBRG group
var $sltbrgs;        // array of salt bridges b/w residues.
// CISPEP group
var $cispeps;        // array of cis peptides (those with 
                     // omega angles of 0°±30°). 
                     // Deviations larger than 30° are listed in
                     // REMARK 500.
// SITE group
var $sites;          // array of significant sites in the 
                     // macromolecule.				
// CRYST1 group
var $cryst1;         // array of CRYST1 unit cell parameters.
// ORIGX group
var $origx;          // array of coordinates.
// SCALE group
var $scale;          // array of scales;
// MTRIX group
var $matrix;         // array of matrices.
// TVECT group
var $tvect;         // array of translation vectors. 
// MODEL group
var $model;         // array of atomic models (skip for now).
// ATOM group
var $atoms;         // array of ATOMs 
// SIGATM group
var $sigatms;       // array of STANDARD DEVIATIONS OF 
                    // ATOMIC PARAMETERS

// Some 10 other PDB data fields/sections I haven't assigned
// a PROTEIN attribute/property name yet...
}
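
For illustration, here is a minimal sketch of how the first three properties could be filled from a HEADER line. It assumes the fixed-column layout of the PDB format (classification in columns 11-50, deposition date in 51-59, ID code in 63-66). ProteinHeader and parse_header_line() are hypothetical names used only for this example, and the sample line is made up.

```php
<?php
// Hypothetical stand-in for the first three Protein properties.
class ProteinHeader
{
    public $class;
    public $dep_date;
    public $id_code;
}

// Assumes $line is a HEADER record in fixed-column PDB layout.
function parse_header_line($line)
{
    $h = new ProteinHeader();
    // substr() offsets are zero-based, so column 11 is offset 10.
    $h->class    = trim(substr($line, 10, 40));
    $h->dep_date = trim(substr($line, 50, 9));
    $h->id_code  = trim(substr($line, 62, 4));
    return $h;
}

// Made-up HEADER line, padded so each field sits in its column range.
$line = str_pad("HEADER", 10) . str_pad("HYDROLASE", 40)
      . str_pad("17-MAY-03", 12) . "1ABC";
$h = parse_header_line($line);
```

Trimming each field is a choice made for this sketch, since the fixed-width fields are normally padded with spaces.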

function parse_protein_pdb($flines)
   { 
   $outer = array();
	
   $in_title_flag = FALSE;
   $title_string = "";
   $aTitles = array();	
	
   $in_caveat_flag = FALSE;
   $cav_string = "";
   $aCaveats = array();
	
   $aHelix = array();	
   $aSheets = array();
   $aTurns = array();
   $aSSBonds = array();

   // other code that initializes variables used by the
   // following IF statements.
		
   foreach ($flines as $no => $linestr)
      { // opens outermost FOREACH loop

      $label = trim(substr($linestr, 0, 6));   // first 6 columns hold the record label
      $data = trim(substr($linestr, 9));
		
     // Check for UNCLOSED items by looking at flag variables, etc.  
     // UNCLOSED means they have not yet been stored to the proper
     // variables (e.g. class property/attribute or some temporary
     // placeholder).  To CLOSE means to store data into them.

     if ($in_remark1_flag)
        {
	// code to "close" REMARK 1 section.		
        }
     if ($in_title_flag)
        {
        // code to "close" TITLE section.
        }
     // a whole lot of "CLOSING ROUTINES"...
         
     // ID data field - possibly the simplest of them all!
		
     if ($label == "HEADER")
        {
        $class = substr($linestr,10,40);
        $dep_date = substr($linestr,50,9);
        $id_code = substr($linestr,62,4);
        }

     // OBSLTE - OBSOLETE data field
		
     if ($label == "OBSLTE")
        {
        // code to extract data from OBSLTE data field/section.
        }  

     // Other code to handle 30 plus data fields/sections in PDB.
     // Almost all of them start with: IF ($label == "LABEL")
     }

   $oProtein = new Protein();
   $oProtein->class = $class;
   $oProtein->dep_date = $dep_date;

   // other code to store extracted data in PROTEIN class.

   return $oProtein;
   }
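
The flag-and-close bookkeeping described in the comments above could perhaps be reduced to a single pending record that is closed whenever the label changes. What follows is a minimal sketch of that idea, not the parser above. group_records() is a hypothetical helper, the input lines are made up, and the sketch assumes that record text starts at column 11 and that consecutive lines with the same label are continuations of one entry (reasonable for TITLE or CAVEAT, not for record types such as ATOM where every line is its own entry).

```php
<?php
// Hypothetical helper: joins consecutive lines that share a record label.
function group_records(array $lines)
{
    $records = array();     // label => list of closed (completed) entries
    $pending_label = null;  // label of the entry currently being built
    $pending_text = "";

    foreach ($lines as $line) {
        $label = trim(substr($line, 0, 6));
        $text = trim(substr($line, 10));   // assumption: text starts at column 11

        if ($label !== $pending_label) {
            // Label changed: close the pending entry before opening a new one.
            if ($pending_label !== null) {
                $records[$pending_label][] = $pending_text;
            }
            $pending_label = $label;
            $pending_text = $text;
        } else {
            // Same label: treat the line as a continuation.
            $pending_text .= " " . $text;
        }
    }
    // Close whatever is still open at end of input.
    if ($pending_label !== null) {
        $records[$pending_label][] = $pending_text;
    }
    return $records;
}

// Made-up input lines, for illustration only.
$lines = array(
    "TITLE     EXAMPLE STRUCTURE OF A",
    "TITLE    2 HYPOTHETICAL PROTEIN",
    "KEYWDS    EXAMPLE",
);
$records = group_records($lines);
```

With one pending entry there is only one place where closing happens, instead of one flag check per record type at the top of the loop. Record types that need per-line handling would still need their own branch.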

