[Biococoa-dev] peptides and proteins
Alexander Griekspoor
mek at mekentosj.com
Tue Sep 7 18:38:49 EDT 2004
Koen,
> I agree we should make BCProtease and BCRestrictionEnzyme work in a
> similar way. But I'm not sure why we need an additional class that
> executes the digest, as you propose.
>
> This is how I do the digest for proteins:
>
> 1. pass the sequence to a 'digest' object
>
> 2. tell the digest object which enzyme to use, it then retrieves that
> info from a plist (see below)
>
> 3. get the NSString from the sequence, and make an NSScanner based on it
>
> 4. using the info in the plist, let the scanner find out where to cut
> the sequence, and store the numbers
>
> 5. after the scanner is finished, make ranges based on the numbers,
> and then subsequences
>
> 6. return an array with the subsequences, these are the peptides
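For reference, steps 3 to 6 above could be sketched roughly as follows. This is only an illustration in Python rather than Objective-C, it uses a plain loop where the original uses an NSScanner, and the function names are hypothetical. The trypsin values (cut after K or R, but not before P) are taken from the plist entry quoted further down.

```python
def cut_positions(sequence, cleave_at, dont_cleave_before=""):
    """Step 4: collect the indices at which the sequence is cut.

    Assumes C-terminal cleavage: the cut falls directly after a residue
    listed in cleave_at, unless the next residue is in dont_cleave_before.
    """
    positions = []
    # The last residue is skipped: a cut after it would give an empty peptide.
    for i, residue in enumerate(sequence[:-1]):
        if residue in cleave_at and sequence[i + 1] not in dont_cleave_before:
            positions.append(i + 1)
    return positions


def digest(sequence, cleave_at, dont_cleave_before=""):
    """Steps 5 and 6: turn the cut positions into ranges, then peptides."""
    if not sequence:
        return []
    bounds = [0] + cut_positions(sequence, cleave_at, dont_cleave_before)
    bounds.append(len(sequence))
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


peptides = digest("AKRPGKLR", cleave_at="KR", dont_cleave_before="P")
print(peptides)  # ['AK', 'RPGK', 'LR']
```

Note that the R at position 3 is not cut because it is followed by P.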
I get it, and that indeed works fine. The problem is that the
moment you get an array of fragments, you lose all info about the
digest. A digest object could encapsulate that. But you're right,
perhaps you're only interested in the sequences. So maybe we should
offer both options. A BCDigest could also be implemented as an
encapsulated message, storing the enzymes, the sequence, and the
fragments. You would then have two options: either send the
BCUtilDigester the individual items and get back an array of
fragments, or send it a digest object and let the digester fill its
fragment array. Or do you think keeping the digest info around is
something we should hand off completely to the app developer?
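To make the two options concrete, here is a rough sketch in Python (not Objective-C). The names Enzyme, Digest, digest_sequence and fill_digest are hypothetical stand-ins for BCEnzyme, BCDigest and the two entry points of BCUtilDigester, and the cleavage rule is deliberately simplified to "cut after any listed residue".

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Enzyme:
    name: str
    cleave_at: str  # simplified rule: cut after any of these residues


@dataclass
class Digest:
    """Encapsulated message: keeps enzymes, sequence and fragments together."""
    sequence: str
    enzymes: list
    fragments: list = field(default_factory=list)


def digest_sequence(sequence, enzymes):
    """Option 1: pass the individual items, get back a plain list of fragments."""
    sites = set("".join(e.cleave_at for e in enzymes))
    fragments, start = [], 0
    for i, residue in enumerate(sequence):
        # Do not cut after the final residue (that would add an empty fragment).
        if residue in sites and i + 1 < len(sequence):
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


def fill_digest(digest):
    """Option 2: pass a digest object and let the digester fill its fragments."""
    digest.fragments = digest_sequence(digest.sequence, digest.enzymes)
    return digest


trypsin = Enzyme("Trypsin", "KR")
print(digest_sequence("AKGRLL", [trypsin]))  # ['AK', 'GR', 'LL']

d = fill_digest(Digest("AKGRLL", [trypsin]))
print(d.enzymes[0].name, d.fragments)  # Trypsin ['AK', 'GR', 'LL']
```

With the second form the caller still knows afterwards which enzymes produced the fragments, which is exactly the info that gets lost with a bare array.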
The other thing I would like to add to a BCSequence is the info of
which enzyme produced the 5' end and which the 3' end. Therefore, I
proposed the BCFragment class, which could be a subclass of BCSequence
that stores these additional BCEnzyme variables (which can be nil
by default if the end is untreated). The nice thing is that for the
BCDigest story above nothing changes: you still get an array of
BCSequences returned, but as a convenience the digest object fills in
which enzyme produced the ends.
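A minimal sketch of the fragment idea, again in Python and with hypothetical names (Sequence and Fragment stand in for BCSequence and BCFragment, and make_fragments for whatever the digest object would do internally). It only looks at the top strand and takes the cut positions as given, so it says nothing about how the sites are found.

```python
class Sequence:
    def __init__(self, residues):
        self.residues = residues


class Fragment(Sequence):
    """A sequence that also remembers which enzyme produced each end.

    None plays the role of nil: the end is untreated.
    """

    def __init__(self, residues, five_prime_enzyme=None, three_prime_enzyme=None):
        super().__init__(residues)
        self.five_prime_enzyme = five_prime_enzyme
        self.three_prime_enzyme = three_prime_enzyme


def make_fragments(sequence, cuts):
    """cuts is a sorted list of (position, enzyme_name) pairs."""
    bounds = [(0, None)] + list(cuts) + [(len(sequence), None)]
    return [
        Fragment(sequence[a:b], left, right)
        for (a, left), (b, right) in zip(bounds, bounds[1:])
    ]


# EcoRI cuts G^AATTC and BamHI cuts G^GATCC (top strand only).
fragments = make_fragments("GGAATTCCGGATCCAA", [(2, "EcoRI"), (9, "BamHI")])
for f in fragments:
    print(f.residues, f.five_prime_enzyme, f.three_prime_enzyme)
# GG None EcoRI
# AATTCCG EcoRI BamHI
# GATCCAA BamHI None
```

Because Fragment is a subclass, code that only expects sequences keeps working and can simply ignore the two extra variables.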
An alternative that would solve the problem is to store this
information in the (to be implemented) features/annotations of a
BCSequence. But I like the ideas above much better.
> We could have additional parameters passed to the digest object, such
> as to allow missed cleavages.
>
> an example of an entry in the plist (for trypsin) is:
>
> <key>Trypsin</key>
> <dict>
> <key>CleaveAt</key>
> <string>KR</string>
> <key>DontCleaveBefore</key>
> <string>P</string>
> <key>CleaveDirection</key>
> <string>C</string>
> </dict>
>
> I use this in my own app, and it works very nicely.
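As an illustration of how such an entry could drive a digest, including the missed cleavages mentioned above, here is a rough Python sketch using the standard plistlib module. The plist wrapper around the Trypsin entry and the function names are assumptions, and only the "C" direction shown in the entry is handled, since the meaning of other values is not spelled out here.

```python
import plistlib

# The Trypsin entry from above, wrapped in a minimal plist (wrapper assumed).
PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>Trypsin</key>
    <dict>
        <key>CleaveAt</key>
        <string>KR</string>
        <key>DontCleaveBefore</key>
        <string>P</string>
        <key>CleaveDirection</key>
        <string>C</string>
    </dict>
</dict>
</plist>
"""


def digest_with_rule(sequence, rule):
    """Cut after each CleaveAt residue unless a DontCleaveBefore residue follows."""
    if rule.get("CleaveDirection", "C") != "C":
        raise ValueError("only C-terminal cleavage is sketched here")
    blocked = rule.get("DontCleaveBefore", "")
    cuts = [
        i + 1
        for i, residue in enumerate(sequence[:-1])
        if residue in rule["CleaveAt"] and sequence[i + 1] not in blocked
    ]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])] if sequence else []


def with_missed_cleavages(peptides, max_missed):
    """Also return peptides spanning up to max_missed uncut sites."""
    result = []
    for start in range(len(peptides)):
        for extra in range(max_missed + 1):
            end = start + extra + 1
            if end <= len(peptides):
                result.append("".join(peptides[start:end]))
    return result


rules = plistlib.loads(PLIST)
peptides = digest_with_rule("AKRPGKLR", rules["Trypsin"])
print(peptides)                            # ['AK', 'RPGK', 'LR']
print(with_missed_cleavages(peptides, 1))  # ['AK', 'AKRPGK', 'RPGK', 'RPGKLR', 'LR']
```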
Very nice indeed, plists are definitely the way to go. Still, for
restriction enzymes, instantiating 600 enzymes each time would be too
expensive I think, so there it would be nice to instantiate them once
from the plist and keep the objects around in a static dictionary.
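Roughly what I have in mind, sketched in Python rather than Objective-C. The plist layout for restriction enzymes (the RecognitionSite and CutOffset keys) and all class and method names are made up for the example; the point is only that the plist is parsed once and the same objects are handed out afterwards.

```python
import plistlib

# Hypothetical restriction enzyme plist with two of the ~600 entries.
ENZYME_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>EcoRI</key>
    <dict>
        <key>RecognitionSite</key>
        <string>GAATTC</string>
        <key>CutOffset</key>
        <integer>1</integer>
    </dict>
    <key>BamHI</key>
    <dict>
        <key>RecognitionSite</key>
        <string>GGATCC</string>
        <key>CutOffset</key>
        <integer>1</integer>
    </dict>
</dict>
</plist>
"""


class RestrictionEnzyme:
    _cache = None    # the "static dictionary", filled on first use
    load_count = 0   # only here to show that the plist is parsed once

    def __init__(self, name, site, cut_offset):
        self.name = name
        self.site = site
        self.cut_offset = cut_offset

    @classmethod
    def all_enzymes(cls):
        if cls._cache is None:
            cls.load_count += 1
            raw = plistlib.loads(ENZYME_PLIST)
            cls._cache = {
                name: cls(name, entry["RecognitionSite"], entry["CutOffset"])
                for name, entry in raw.items()
            }
        return cls._cache

    @classmethod
    def named(cls, name):
        return cls.all_enzymes()[name]


eco = RestrictionEnzyme.named("EcoRI")
same = RestrictionEnzyme.named("EcoRI")
print(eco.site, eco is same, RestrictionEnzyme.load_count)  # GAATTC True 1
```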
Alex
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
Claiming that the Macintosh is inferior to Windows
because most people use Windows, is like saying
that all other restaurants serve food that is
inferior to McDonalds
*********************************************************