[Biococoa-dev] peptides and proteins

Alexander Griekspoor mek at mekentosj.com
Tue Sep 7 18:38:49 EDT 2004


> I agree we should make BCProtease and BCRestrictionEnzyme work in a 
> similar way. But I'm not sure why we need an additional class that 
> executes the digest, as you propose.
> This is how I do the digest for proteins:
> 1. pass the sequence to a 'digest' object
> 2. tell the digest object which enzyme to use, it then retrieves that 
> info from a plist (see below)
> 3. get the NSString from the sequence, and make an NSScanner based on it
> 4. using the info in the plist, let the scanner find out where to cut 
> the sequence, and store the numbers
> 5. after the scanner is finished, make ranges based on the numbers, 
> and then subsequences
> 6. return an array with the subsequences, these are the peptides

I get it, and that indeed works fine. The problem is that the 
moment you get an array of fragments, you lose all info about the 
digest. A digest object could encapsulate that. But you're right that 
perhaps you're only interested in the sequences, so maybe we should 
offer both options. A BCDigest could also be implemented as an 
encapsulated message, storing the enzymes, the sequence, and the 
fragments. You would then have two options: either send the 
BCUtilDigester the individual items and get back an array of 
fragments, or send it a digest object and let the digester fill its 
fragment array. Or do you think keeping the digest info around is 
something we should hand off completely to the app developer?
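
To make the two options concrete, here is a rough sketch in Python
rather than Objective-C, purely for illustration. All names in it
(Enzyme, Digest, digest_sequence, fill_digest) are hypothetical and not
existing BioCocoa API, and the cleavage rule is a toy one that cuts
after any listed residue.

```python
from dataclasses import dataclass, field


@dataclass
class Enzyme:
    name: str
    cleave_after: str  # toy rule: cut after any of these residues


@dataclass
class Digest:
    """Keeps the sequence, the enzymes and the resulting fragments together."""
    sequence: str
    enzymes: list
    fragments: list = field(default_factory=list)


def cut_positions(sequence, enzymes):
    # Collect every position where at least one enzyme cuts.
    sites = set()
    for enzyme in enzymes:
        # Skip the last residue so no empty trailing fragment is produced.
        for i, residue in enumerate(sequence[:-1]):
            if residue in enzyme.cleave_after:
                sites.add(i + 1)
    return sorted(sites)


def digest_sequence(sequence, enzymes):
    """Option 1: pass the individual items, get back a plain list."""
    bounds = [0] + cut_positions(sequence, enzymes) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def fill_digest(digest):
    """Option 2: pass a digest object and let the digester fill it."""
    digest.fragments = digest_sequence(digest.sequence, digest.enzymes)
    return digest


toy = Enzyme("toy", "KR")
plain = digest_sequence("AAKBBRCC", [toy])
full = fill_digest(Digest("AAKBBRCC", [toy]))
```

Both paths share the same cutting code; the digest object only adds
the bookkeeping, so offering both costs very little.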

The other thing I would like to add to a BCSequence is the info on 
which enzyme produced the 5' end and which the 3' end. Therefore, I 
proposed the BCFragment class, which could be a subclass of BCSequence 
that stores these additional BCEnzyme variables (which can be nil 
by default if the end is untreated). The nice thing is that for the 
BCDigest story above nothing changes: you still get an array of 
BCSequences returned, but as a convenience the digest object fills in 
which enzyme produced the ends.
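
A minimal sketch of the fragment idea, again in Python for
illustration only. Sequence and Fragment stand in for BCSequence and
BCFragment, fragments_from_cuts is a hypothetical helper, and the
sketch assumes a single enzyme per digest to keep it short.

```python
class Sequence:
    def __init__(self, residues):
        self.residues = residues


class Fragment(Sequence):
    """A sequence that also remembers which enzyme made each end.

    Both ends default to None, meaning the end is untreated.
    """

    def __init__(self, residues, five_prime_enzyme=None,
                 three_prime_enzyme=None):
        super().__init__(residues)
        self.five_prime_enzyme = five_prime_enzyme
        self.three_prime_enzyme = three_prime_enzyme


def fragments_from_cuts(residues, cut_sites, enzyme):
    # cut_sites are positions between residues; the outer ends of the
    # original sequence were not produced by the enzyme and stay None.
    bounds = [0] + sorted(cut_sites) + [len(residues)]
    last = len(bounds) - 2
    fragments = []
    for i, (start, end) in enumerate(zip(bounds, bounds[1:])):
        fragments.append(Fragment(
            residues[start:end],
            five_prime_enzyme=enzyme if i > 0 else None,
            three_prime_enzyme=enzyme if i < last else None,
        ))
    return fragments


# EcoRI cuts G^AATTC, i.e. after position 3 in this sequence.
pieces = fragments_from_cuts("CCGAATTCGG", [3], "EcoRI")
```

Because Fragment is a Sequence, code that only cares about sequences
can ignore the extra fields entirely.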

An alternative that would solve the problem is to store this 
information in the (to be implemented) features/annotations of a 
BCSequence. But I like the ideas above much better.

> We could have additional parameters passed to the digest object, such 
> as to allow missed cleavages.
> an example of an entry in the plist (for trypsin) is:
> 	<key>Trypsin</key>
> 	<dict>
> 		<key>CleaveAt</key>
> 		<string>KR</string>
> 		<key>DontCleaveBefore</key>
> 		<string>P</string>
> 		<key>CleaveDirection</key>
> 		<string>C</string>
> 	</dict>
> I use this in my own app, and it works very nicely.
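
For illustration, here is how an entry like the one quoted above could
drive a digest. This is a Python sketch, not the code from either app.
It assumes that CleaveDirection "C" means cutting on the C-terminal
side of a CleaveAt residue, that any other value means the N-terminal
side, and that DontCleaveBefore blocks a cut when the next residue
matches. The function names and the missed_cleavages parameter are
hypothetical.

```python
import plistlib

PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>Trypsin</key>
	<dict>
		<key>CleaveAt</key>
		<string>KR</string>
		<key>DontCleaveBefore</key>
		<string>P</string>
		<key>CleaveDirection</key>
		<string>C</string>
	</dict>
</dict>
</plist>
"""


def cleavage_sites(sequence, rule):
    """Return the positions (between residues) where the rule cuts."""
    sites = []
    blockers = rule.get("DontCleaveBefore", "")
    for i, residue in enumerate(sequence):
        if residue not in rule["CleaveAt"]:
            continue
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if following and following in blockers:
            continue  # e.g. trypsin does not cut before proline
        site = i + 1 if rule["CleaveDirection"] == "C" else i
        if 0 < site < len(sequence):  # never cut at the very ends
            sites.append(site)
    return sites


def digest(sequence, rule, missed_cleavages=0):
    """Return peptides, optionally spanning up to N missed cut sites."""
    bounds = [0] + cleavage_sites(sequence, rule) + [len(sequence)]
    peptides = []
    for start in range(len(bounds) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(bounds):
                peptides.append(sequence[bounds[start]:bounds[end]])
    return peptides


trypsin = plistlib.loads(PLIST)["Trypsin"]
complete = digest("AKRPGKC", trypsin)
partial = digest("AKRPGKC", trypsin, missed_cleavages=1)
```

In this example the R is followed by P, so only the two K residues
produce cuts.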

Very nice indeed, the plists are definitely the way to go. Still, for 
restriction enzymes, instantiating 600 enzymes each time would be too 
expensive I think, so there it would be nice to instantiate them once 
from the plist and keep the objects around in a static dictionary.
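
A sketch of the instantiate-once idea, in Python for illustration. The
RestrictionEnzyme class, its named() lookup and the RecognitionSite key
are assumptions made for this example, and the plist is embedded only
so the sketch is self-contained.

```python
import plistlib

ENZYME_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
	<key>EcoRI</key>
	<dict>
		<key>RecognitionSite</key>
		<string>GAATTC</string>
	</dict>
	<key>BamHI</key>
	<dict>
		<key>RecognitionSite</key>
		<string>GGATCC</string>
	</dict>
</dict>
</plist>
"""


class RestrictionEnzyme:
    # Class-level ("static") dictionary, filled on the first lookup only.
    _registry = None

    def __init__(self, name, recognition_site):
        self.name = name
        self.recognition_site = recognition_site

    @classmethod
    def named(cls, name):
        if cls._registry is None:
            # Parse the plist and build every enzyme exactly once.
            entries = plistlib.loads(ENZYME_PLIST)
            cls._registry = {
                key: cls(key, entry["RecognitionSite"])
                for key, entry in entries.items()
            }
        return cls._registry[name]


first = RestrictionEnzyme.named("EcoRI")
second = RestrictionEnzyme.named("EcoRI")
```

Every later lookup returns the same shared object, so the cost of
building the full set is paid a single time per run.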

                     ** Alexander Griekspoor **
              The Netherlands Cancer Institute
              Department of Tumorbiology (H4)
         Plesmanlaan 121, 1066 CX, Amsterdam
                    Tel:  + 31 20 - 512 2023
                    Fax:  + 31 20 - 512 2029
                   AIM: mekentosj at mac.com
                    E-mail: a.griekspoor at nki.nl
                Web: http://www.mekentosj.com

	Claiming that the Macintosh is inferior to Windows
	because most people use Windows, is like saying
	that all other restaurants serve food that is
	inferior to McDonalds

