[Biococoa-dev] peptides and proteins

Koen van der Drift kvddrift at earthlink.net
Tue Sep 7 18:14:53 EDT 2004


On Sep 7, 2004, at 2:24 AM, Alexander Griekspoor wrote:

> Koen,
> Both Tom and I discussed this and we have our doubts whether this is the 
> way to go. We don't really see why we should add the intermediate 
> BCSequence class. I think this is an example of a manipulation you do 
> with a BCSequence (a digestion) which returns some sort of result that 
> by coincidence is a collection of BCSequences as well. Similar examples 
> are perhaps PCR reactions, and especially DNA enzyme digests. To take 
> the latter as an example, I would like to implement this by 
> modelling things on real life.
>
> I propose the following new classes:
>
> BCEnzyme
> |
> ---------------BCRestrictionEnzyme
> |
> ---------------BCProtease
> |
> ---------------etc
>
> These can hold variables like optimal temperature, buffer conditions 
> etc. Not completely sure, but it might again be very handy to have a 
> predefined set as singletons (instead of instantiating 600 restriction 
> enzymes for each digest, let the BCRestrictionEnzyme create a 
> singleton dictionary the first time and access them from there the 
> next time).
>
> Then a BCUtilDigester(DNA/Protein) or something that can do the digest 
> à la your masscalculator (which I would like to call 
> BCUtilMassCalculator, if that's OK, by the way).


I agree we should make BCProtease and BCRestrictionEnzyme work in a 
similar way. But I'm not sure why we need an additional class that 
executes the digest, as you propose.

This is how I do the digest for proteins:

1. pass the sequence to a 'digest' object

2. tell the digest object which enzyme to use; it then retrieves that 
info from a plist (see below)

3. get the NSString from the sequence, and make an NSScanner based on it

4. using the info in the plist, let the scanner find out where to cut 
the sequence, and store the cut positions

5. after the scanner is finished, make ranges from the stored positions, 
and then subsequences from those ranges

6. return an array with the subsequences, these are the peptides
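
To make the steps above concrete, here is a rough sketch in Python rather than Cocoa, so it is not the actual code from my app. The function name digest and its parameters are made up for the illustration; they only mirror the plist keys shown further down. I assume that DontCleaveBefore blocks a cut whenever the residue right after the recognized one is listed, and that CleaveDirection "C" means cutting on the C-terminal side of the recognized residue.

```python
def digest(sequence, cleave_at="KR", dont_cleave_before="P", cleave_direction="C"):
    """Cut `sequence` (a string of one-letter codes) and return the peptides."""
    cut_points = []
    for i, residue in enumerate(sequence):
        if residue not in cleave_at:
            continue
        # Assumption: a listed residue directly after the recognized one blocks the cut.
        following = sequence[i + 1:i + 2]
        if following and following in dont_cleave_before:
            continue
        # "C" cuts after the recognized residue, anything else cuts before it.
        cut = i + 1 if cleave_direction == "C" else i
        # Ignore cuts at the very ends, they would give empty peptides.
        if 0 < cut < len(sequence):
            cut_points.append(cut)
    # Turn the stored positions into ranges, and the ranges into subsequences.
    bounds = [0] + cut_points + [len(sequence)]
    return [sequence[start:end] for start, end in zip(bounds, bounds[1:])]


peptides = digest("AKRPGKTR")
```

With the default trypsin-like rule, the R followed by P is not cut, so the example sequence gives three peptides.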


It's fine with me if we use BCSequenceProtein for the peptides.



We could have additional parameters passed to the digest object, such 
as to allow missed cleavages.
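
One possible way to handle missed cleavages, again only as a Python sketch and not the code from my app: take the peptides of a complete digest, in sequence order, and join neighbouring ones. The name with_missed_cleavages and the max_missed parameter are hypothetical.

```python
def with_missed_cleavages(peptides, max_missed=1):
    """Join up to `max_missed` + 1 adjacent peptides of a complete digest.

    `peptides` must be in the order in which they occur in the sequence.
    """
    result = []
    for start in range(len(peptides)):
        for missed in range(max_missed + 1):
            end = start + missed + 1
            if end > len(peptides):
                break
            # Joining neighbours is the same as skipping the cuts between them.
            result.append("".join(peptides[start:end]))
    return result


partial = with_missed_cleavages(["AK", "RPGK", "TR"], max_missed=1)
```

With max_missed=0 this simply returns the complete digest unchanged.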



An example of an entry in the plist (for trypsin) is:

	<key>Trypsin</key>
	<dict>
		<key>CleaveAt</key>
		<string>KR</string>
		<key>DontCleaveBefore</key>
		<string>P</string>
		<key>CleaveDirection</key>
		<string>C</string>
	</dict>
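
For illustration, this is roughly how such an entry could be read outside of Cocoa, using Python's standard plistlib module. The surrounding plist header and top-level dict are my assumption about how the complete file would look; only the Trypsin entry itself is taken from the example above.

```python
import plistlib

# Assumed wrapper around the entry above: one top-level dict keyed by enzyme name.
ENZYMES_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Trypsin</key>
    <dict>
        <key>CleaveAt</key>
        <string>KR</string>
        <key>DontCleaveBefore</key>
        <string>P</string>
        <key>CleaveDirection</key>
        <string>C</string>
    </dict>
</dict>
</plist>
"""

# plistlib.loads parses the bytes into ordinary dicts and strings.
enzymes = plistlib.loads(ENZYMES_PLIST)
trypsin = enzymes["Trypsin"]
print(trypsin["CleaveAt"], trypsin["DontCleaveBefore"], trypsin["CleaveDirection"])
```

Each enzyme then becomes a small dictionary of rules that the digest object can look up by name.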


I use this in my own app, and it works very nicely.


- Koen.



