[Biophp-dev] Release criteria

Serge Gregorio biophp-dev@bioinformatics.org
Fri, 09 May 2003 02:07:43 +0800


Hello all!

>> By the way, I'm working on a variety of stuff, from the Enzyme
>> class to some esoteric sequence alignment code.  That's why
>> I used the term "miscellaneous code" in the action plan.
>
>Spiffy, looking forward to seeing it!

Well, here's a sneak preview (see end of this email).  Sorry for 
the tabs.  My web IDE keeps placing them...  =(

>Possibly - I wouldn't bother with a "set" release date, personally.
>Instead - what capabilities should make up the next "formal release"
>(i.e. "it's ready to release when THESE capabilities are implemented
>and tested")?

Oh okay... any suggestions (about release criteria)?  

As for the issue of opening and closing the file stream, I guess
a safe compromise would be to let the user choose whether the
stream stays open until it is explicitly closed, or is closed
automatically after the data has been read.  Again, it depends
on how the user uses the parser class.
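
A minimal sketch of that compromise, just to make the idea concrete.
The class name LineReader, its $autoClose option and its methods are
hypothetical names made up for this illustration; they are not part
of the existing parser class, and the file contents are a toy record.

```php
<?php
// Sketch only: LineReader and $autoClose are hypothetical names.
class LineReader
{
    public $autoClose;        // TRUE: close the stream after each read
    private $path;
    private $handle = null;   // null while the stream is closed

    public function __construct($path, $autoClose = true)
    {
        $this->path = $path;
        $this->autoClose = $autoClose;
    }

    // Read the remaining lines, opening the stream first if needed.
    // Returns an array of lines, or FALSE if the file cannot be opened.
    public function readLines()
    {
        if ($this->handle === null) {
            $handle = fopen($this->path, 'r');
            if ($handle === false) {
                return false;
            }
            $this->handle = $handle;
        }
        $lines = array();
        while (($line = fgets($this->handle)) !== false) {
            $lines[] = rtrim($line, "\r\n");
        }
        if ($this->autoClose) {
            $this->close();
        }
        return $lines;
    }

    public function isOpen()
    {
        return $this->handle !== null;
    }

    // Explicit close, for callers that asked to keep the stream open.
    public function close()
    {
        if ($this->handle !== null) {
            fclose($this->handle);
            $this->handle = null;
        }
    }
}

// Usage with a throwaway file holding a made-up record.
$tmp = tempnam(sys_get_temp_dir(), 'enz');
file_put_contents($tmp, "ENTRY       EC 0.0.0.0\n///\n");

$reader = new LineReader($tmp);         // default: closes itself
$lines = $reader->readLines();

$keep = new LineReader($tmp, false);    // stays open until close()
$keep->readLines();
$keep->close();

unlink($tmp);
```

With $autoClose off, a second readLines() call continues from the
current position of the stream, so the caller decides when to close
or reopen it.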

Regards,

Serge

==================================================

class Enzyme
   {
   var $entry;
   var $name;
   var $class;
   var $sysname;
   var $reaction;
   var $substrate;
   var $product;
   var $comment;
   var $reference;
   var $pathway;
   var $ortholog;
   var $genes;
   var $disease;
   var $motif;
   var $structures;
   var $dblinks;
   }

function parse_enzyme_kegg($flines)
   {
   // Initialization of variables.

   $entry = "";

   $in_name_flag = FALSE;
   $aNames = array();
   $name_string = "";

   $in_class_flag = FALSE;
   $aClasses = array();

   $in_sysname_flag = FALSE;
   $sysname_string = "";
   $sysname = "";

   $in_react_flag = FALSE;
   $react_string = "";
   $aReactions = array();

   $in_sub_flag = FALSE;
   $sub_string = "";
   $aSubstrates = array();

   foreach ($flines as $lineno => $linestr)
      { // OPENS outermost loop

      // The field label occupies the first 12 columns of each line.
      $label = trim(substr($linestr, 0, 12));

      // Assume that ENTRY is always one line.
      if ($label == "ENTRY") $entry = trim(substr($linestr, 12));

      // NAME entry is made up of one or more names, the preferred
      // name is at the first line, other alternative names are in 
      // succeeding lines.  It is possible for a long name to occupy 
      // two or more lines.  But for now, assume one name/one line.
      if ($label == "NAME")
         {
         $aNames = array();
         $aNames[] = trim(substr($linestr,12));
         $in_name_flag = TRUE;
         }
      elseif ( (strlen($label) == 0) and ($in_name_flag) )
         {
         $aNames[] = trim(substr($linestr,12));
         }
      elseif ( (strlen($label) > 0) and ($in_name_flag) ) 
         {
         $in_name_flag = FALSE;
         }
			
      // CLASS field - first line is the class, 2nd is subclass,
      // 3rd is the sub-subclass.  Store as an array.	
      if ($label == "CLASS")
         {
         $aClasses = array();
         $aClasses[] = trim(substr($linestr,12));
         $in_class_flag = TRUE;
         }
      elseif ( (strlen($label) == 0) and ($in_class_flag) )
         {
         $aClasses[] = trim(substr($linestr,12));
         }
      elseif ( (strlen($label) > 0) and ($in_class_flag) ) 
         {
         $in_class_flag = FALSE;
         }							
			
      // SYSNAME field may be one or more lines.  SYSNAME follows conventions
      // for multiple lines as stated in the Ligand Manual (e.g. uses $ sign).
      if ($label == "SYSNAME") 
         {
         $sysname_string = trim(substr($linestr,12));
         $in_sysname_flag = TRUE;
         }
      elseif ( (strlen($label) == 0) and ($in_sysname_flag) )
         {
         // A leading '$' means the text continues directly (no space added);
         // otherwise the line is joined with a single space.
         if (substr(trim(substr($linestr,12)),0,1) == '$')
            $sysname_string .= trim(substr($linestr,13));
         else
            $sysname_string .= " " . trim(substr($linestr,12));
         }
      elseif ( (strlen($label) > 0) and ($in_sysname_flag) ) 
         {
         $sysname = trim($sysname_string);
         $in_sysname_flag = FALSE;
         }		
			
      // REACTION field may be one or more lines.  Format is different from
      // the REACTION field in 'COMPOUNDS' file.  Follows the '$' convention
      // for multiple lines.  When the $aReactions array is printed inside
      // <PRE>, each string may be shown on more than one line (\n), but the
      // value is not affected (it's still one string).
      if ($label == "REACTION") 
         {
         $react_string = substr($linestr,12);
         $in_react_flag = TRUE;
         }
      elseif ( (strlen($label) == 0) and ($in_react_flag) )
         {
         if (substr(trim(substr($linestr,12)),0,1) == '$')
            $react_string .= substr($linestr,13);
         else
            $react_string .= " " . substr($linestr,12);
         }
      elseif ( (strlen($label) > 0) and ($in_react_flag) ) 
         {
         $reaction = trim($react_string);
         // Split on ';' into individual reactions, then trim each one.
         $aReactions = preg_split("/;/", $reaction, -1, PREG_SPLIT_NO_EMPTY);
         $aReactions = array_map("trim", $aReactions);
         $in_react_flag = FALSE;
         }
			
      // SUBSTRATE field - contains entries found in the LEFT SIDE of the
      // REACTION equation (before the '=' sign).  Items are to be stored
      // in an array.  Parsing technique will be a combination of those
      // used for CLASS and REACTION (array of possibly multi-line entries,
      // following the "$ convention").

      /* Get one line.  Store in tempstring.
         Get next line.  Check if first char is '$'.  If it is, concatenate
         to tempstring.  If not, then it's a new entry, store tempstring
         to array of substrate entries.
      */
      if ($label == "SUBSTRATE")
         {
         $aSubstrates = array();
         $sub_string = substr($linestr,12);
         $in_sub_flag = TRUE;
         }
      elseif ( (strlen($label) == 0) and ($in_sub_flag) )
         {
         if (substr(trim(substr($linestr,12)),0,1) == '$')
            {
            // current line is a continuation of the previous line(s).
            $sub_string .= substr($linestr,13);
            }
         else
            {
            // current line starts a new entry: store the finished one first.
            $aSubstrates[] = trim($sub_string);
            $sub_string = substr($linestr,12);
            }
         }
      elseif ( (strlen($label) > 0) and ($in_sub_flag) ) 
         {
         $aSubstrates[] = trim($sub_string); 
         $in_sub_flag = FALSE;
         }
					
      if ($label == '///') break;
      } // CLOSES outermost loop

   $oEnzyme = new Enzyme();
   $oEnzyme->entry = $entry;
   $oEnzyme->name = $aNames;
   $oEnzyme->class = $aClasses;
   $oEnzyme->sysname = $sysname;	
   $oEnzyme->reaction = $aReactions;
   $oEnzyme->substrate = $aSubstrates;	

   return $oEnzyme;	
   }
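
The flag-per-field logic above repeats the same pattern for every
field, so one possible refactoring is to first group the lines by
label and then join the "$" continuations in a separate step.  This is
only a sketch under the same assumptions as the parser above (label in
the first 12 columns, blank label on continuation lines, "///" at the
end of the record).  The function names group_kegg_fields() and
join_dollar_lines() are hypothetical, and the sample record is made up
for the illustration, not real KEGG data.

```php
<?php
// Sketch only: both function names are hypothetical helpers.

// Group the data part of each line under the label that precedes it.
function group_kegg_fields(array $flines)
{
    $fields = array();   // label => list of data lines
    $current = null;     // label of the field being collected

    foreach ($flines as $linestr) {
        $label = trim(substr($linestr, 0, 12));
        if ($label == '///') {
            break;                       // end of record
        }
        if (strlen($label) > 0) {
            $current = $label;           // a new field starts here
            $fields[$current] = array();
        }
        if ($current === null) {
            continue;                    // nothing to attach this line to
        }
        $fields[$current][] = trim((string) substr($linestr, 12));
    }
    return $fields;
}

// Merge lines that start with '$' into the entry before them.
function join_dollar_lines(array $datalines)
{
    $entries = array();
    foreach ($datalines as $data) {
        $data = trim($data);
        if ($data !== '' && $data[0] == '$' && count($entries) > 0) {
            $entries[count($entries) - 1] .= substr($data, 1);
        } else {
            $entries[] = $data;
        }
    }
    return $entries;
}

// Made-up record; str_pad() keeps the labels exactly 12 columns wide.
$sample = array(
    str_pad('ENTRY', 12) . 'EC 0.0.0.0',
    str_pad('NAME', 12) . 'Example enzyme',
    str_pad('', 12) . 'Alternative name',
    str_pad('SUBSTRATE', 12) . 'First substrate with a long na',
    str_pad('', 12) . '$me',
    str_pad('', 12) . 'Second substrate',
    '///',
);

$fields = group_kegg_fields($sample);
$substrates = join_dollar_lines($fields['SUBSTRATE']);
print_r($substrates);
```

The trade-off is that the whole record is held in memory as grouped
lines before any field is interpreted, but each field then needs only
one short rule instead of its own flag and three branches.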

