[Biococoa-dev] peptides and proteins
Koen van der Drift
kvddrift at earthlink.net
Wed Sep 8 23:38:49 EDT 2004
On Sep 8, 2004, at 2:23 AM, Alexander Griekspoor wrote:
>> BCProtease *protease = [[BCProtease alloc] initWithSequence: aSeq];
> Why BCProtease? This should be a BCDigest subclass
> (BCDigestDNA/RNA/Protein) right?
> BCDigest *digest = [[BCDigest alloc] initWithSequence: aSeq];
> Then the next step would be to instantiate the enzyme:
>> BCProtease *protease = [BCProtease enzymeWithName: @"trypsin"];
> In principle BCProtease would be a subclass of BCEnzyme, just like
> BCRestrictionEnzyme. All of them would have class methods to call for
> predefined enzymes (from a singleton dictionary), and methods to
> instantiate new ones from scratch/plist (like the BCSymbol
>> [protease setEnzyme: @"trypsin"];
> The reason for seeing things as digests instead of proteases would be
> to allow cleavage with multiple enzymes, as is commonly the case
> with restriction enzymes. Therefore, the enzymes should be an array
> that you can add enzymes to and remove them from.
> This line would then become:
> [digest addEnzyme: protease];
Good idea. I will see how that fits in my code. I hope we can make a
general BCDigest class without subclassing, although I am not sure yet
how to implement multiple enzymes. Should they be handled one by one,
or all at the same time (by 'summing their cleavage sites')?
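A minimal sketch of the "summing their cleavage sites" option, assuming each enzyme has already produced an array of NSNumber cut positions located on the uncut sequence. The function name BCSketchMergedCutPositions is hypothetical and not part of BioCocoa. Under that assumption a complete digest should give the same fragments whether the enzymes are handled one by one or together; cases where one cut destroys another enzyme's recognition site are not covered.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not BioCocoa API): merges the cut positions of several
// enzymes into one ascending list without duplicates.
// perEnzymePositions is an NSArray of NSArrays of NSNumber.
static NSArray *BCSketchMergedCutPositions(NSArray *perEnzymePositions)
{
    // An index set keeps its members unique and sorted.
    NSMutableIndexSet *merged = [NSMutableIndexSet indexSet];
    for (NSArray *positions in perEnzymePositions) {
        for (NSNumber *n in positions) {
            [merged addIndex:[n unsignedIntegerValue]];
        }
    }

    // Copy the indexes back into an NSArray of NSNumber, in ascending order.
    NSMutableArray *result = [NSMutableArray array];
    NSUInteger i = [merged firstIndex];
    while (i != NSNotFound) {
        [result addObject:[NSNumber numberWithUnsignedInteger:i]];
        i = [merged indexGreaterThanIndex:i];
    }
    return result;
}
```

A site that two enzymes share is kept only once, so the later fragment step never produces an empty fragment for it.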
>> [protease digest]
> [digest performDigestion];
> would be a convenient way to start the digestion on cue, but we can
> also let the internal methods give the cue automatically if you ask
> for the digest results. In addition, if the object is kept around,
> adding and removing enzymes while a previous result is present should
> trigger a redigest.
Sounds like a good plan.
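The caching idea could be sketched roughly as follows. Everything here is an assumption for illustration: BCDigestSketch is a hypothetical stand-in for BCDigest, each enzyme is reduced to an NSCharacterSet of residues after which it cuts, and the code assumes ARC. The redigest is done lazily, on the next request for the result, instead of immediately when the enzyme list changes.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the proposed BCDigest (assumes ARC).
@interface BCDigestSketch : NSObject
{
    NSString *sequenceString;
    NSMutableArray *enzymes;     // NSCharacterSet objects standing in for BCEnzyme
    NSArray *cachedResult;       // nil means "needs (re)digestion"
    NSUInteger digestionCount;   // only here to make the caching observable
}
- (id)initWithSequenceString:(NSString *)s;
- (void)addEnzyme:(NSCharacterSet *)enzyme;
- (void)removeEnzyme:(NSCharacterSet *)enzyme;
- (NSArray *)digestResult;
- (NSUInteger)digestionCount;
@end

@implementation BCDigestSketch
- (id)initWithSequenceString:(NSString *)s
{
    if ((self = [super init])) {
        sequenceString = [s copy];
        enzymes = [[NSMutableArray alloc] init];
    }
    return self;
}
- (void)addEnzyme:(NSCharacterSet *)enzyme
{
    [enzymes addObject:enzyme];
    cachedResult = nil;          // enzyme list changed: drop the stale result
}
- (void)removeEnzyme:(NSCharacterSet *)enzyme
{
    [enzymes removeObject:enzyme];
    cachedResult = nil;
}
- (void)performDigestion
{
    NSMutableArray *fragments = [NSMutableArray array];
    NSUInteger start = 0, length = [sequenceString length];
    for (NSUInteger i = 0; i < length; i++) {
        unichar c = [sequenceString characterAtIndex:i];
        BOOL cut = NO;
        for (NSCharacterSet *enzyme in enzymes) {
            if ([enzyme characterIsMember:c]) { cut = YES; break; }
        }
        // Cut just behind residue i, unless that is the end of the sequence.
        if (cut && i + 1 < length) {
            [fragments addObject:[sequenceString substringWithRange:NSMakeRange(start, i + 1 - start)]];
            start = i + 1;
        }
    }
    if (start < length)
        [fragments addObject:[sequenceString substringFromIndex:start]];
    cachedResult = [fragments copy];
    digestionCount++;
}
- (NSArray *)digestResult
{
    if (cachedResult == nil)     // digest on demand, then reuse the result
        [self performDigestion];
    return cachedResult;
}
- (NSUInteger)digestionCount { return digestionCount; }
@end
```

Asking for digestResult twice in a row digests only once; adding or removing an enzyme in between makes the next request digest again.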
> Something we have to watch out for is that the sequence object
> contained in the object is a mutable one, so it can potentially be
> changed underneath us. Unless we copy it instead of storing a
> pointer. This however might be expensive.
If we just store the sequenceString, which makes the use of an
NSScanner very easy, then we can store it as an NSString:
    - (id)initWithSequence:(BCSequence *)seq
    {
        if ((self = [super init]))
            [self setSequenceString: [seq sequenceString]];
        return self;
    }
    - (void)setSequenceString:(NSString *)s
    {
        sequenceString = [s copy];  // copy, so a mutable source cannot change underneath us
    }
> So perhaps this is one of the examples where it would be handy to have
> both mutable and immutable variants of the BCSequence class. Unless
> any of you can shed more light on the issue.
See snippet above.
>> NSArray *thePeptides = [digest digestResult];
> That would be the idea. This means that the result is cached by the
> digest object right?
>> Yes, that's taken care of in the plist using the CleaveDirection key.
>> We have to add some code like:
>> [newPeptide setCleavedAt: cleavedAtN]; // or 5' or 3' or
> Well that's not exactly what I meant. When you cut vector DNA with
> for instance EcoRI and BamHI, you would get for example:
> Fragment 1: EcoRI---------------BamHI
> Fragment 2: BamHI--------------EcoRI
> So what I thought was to store in a new BCSequenceDNA subclass, called
> BCFragmentDNA two variables like
> [fragment1 set5EndEnzyme: ecori]; // ecori and bamhi are of class
> BCEnzyme, or BCRestrictionEnzyme to be more precise
> [fragment1 set3EndEnzyme: bamhi];
> indeed set by the digest object.
> for peptides that would be
> [peptide setCarboxyEnzyme: nil];
> [peptide setAminoEnzyme: trypsin];
> Although I already hate the name set5EndEnzyme, so if anyone can come
> up with a better one, ideally spanning all sequence types, please do.
> Finally, besides the enzymes, the fragment class also needs to store
> the position it represents within the uncut sequence (see below)
>>> Therefore, I proposed the BCFragment class, which could be a
>>> subclass of BCSequence that stores these additional BCEnzyme
>>> variables (which can also be nil by default if the end is
The fragments are just sequences, and once created they know nothing
about where they originate from (just as in a Petri dish). Why not keep
that data in the BCDigest class that did the actual cutting? But I am
open to more discussion, because below I suggested a BCPeptide class :)
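As one possible way to keep the origin data in the digest object, here is a small sketch that derives the range each fragment occupies in the uncut sequence from the cut positions. The function name BCSketchFragmentRanges is hypothetical, and the cut positions are assumed to be ascending NSNumber values that lie inside the sequence.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not BioCocoa API): turns ascending cut positions into the
// ranges that the fragments occupy in the uncut sequence. A cut position p means
// "cut between residue p-1 and residue p". Ranges are wrapped in NSValue so the
// digest object can keep them in an NSArray next to the fragments themselves.
static NSArray *BCSketchFragmentRanges(NSUInteger sequenceLength, NSArray *cutPositions)
{
    NSMutableArray *ranges = [NSMutableArray array];
    NSUInteger start = 0;
    for (NSNumber *n in cutPositions) {
        NSUInteger cut = [n unsignedIntegerValue];
        [ranges addObject:[NSValue valueWithRange:NSMakeRange(start, cut - start)]];
        start = cut;
    }
    // Whatever remains after the last cut is the final fragment.
    if (start < sequenceLength)
        [ranges addObject:[NSValue valueWithRange:NSMakeRange(start, sequenceLength - start)]];
    return ranges;
}
```

The fragment at index i of the digest result would then map back to the range at index i, so the fragments themselves can stay plain sequences.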
Or we make a BCDigest return a dictionary that looks something like:
There - all info stored together :)
> Like a BCDigest, you could think in the direction of a very analogous
> BCMap which would return instead of an array of fragments, an array of
> positions. You would feed BCMap, a single sequence, enzyme(s), and it
> would return all cut positions.
This is already how I code my digest class. First create an array of
cut positions using the NSScanner, then feed those numbers to the actual
digest, which returns the fragments.
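The two steps could look roughly like this. It is only a sketch under stated assumptions, not the actual digest class: the function names are hypothetical, the enzyme is reduced to an NSCharacterSet of residues that are cut on their C-terminal side (K and R for trypsin), and exceptions such as a following proline are ignored.

```objc
#import <Foundation/Foundation.h>

// Step 1 (hypothetical helper): collect the cut positions with an NSScanner.
// A cut position p means "cut between residue p-1 and residue p".
static NSArray *BCSketchCutPositions(NSString *sequence, NSCharacterSet *cleaveAfter)
{
    NSMutableArray *positions = [NSMutableArray array];
    NSScanner *scanner = [NSScanner scannerWithString:sequence];
    [scanner setCharactersToBeSkipped:nil];   // do not skip whitespace silently

    while (![scanner isAtEnd]) {
        // Move to the next cleavable residue (or to the end of the string).
        [scanner scanUpToCharactersFromSet:cleaveAfter intoString:NULL];
        if ([scanner isAtEnd]) break;
        // Step over that one residue; the cut lies just behind it.
        NSUInteger cut = [scanner scanLocation] + 1;
        [scanner setScanLocation:cut];
        if (cut < [sequence length])          // a cut at the very end adds no fragment
            [positions addObject:[NSNumber numberWithUnsignedInteger:cut]];
    }
    return positions;
}

// Step 2 (hypothetical helper): feed the ascending cut positions to the digest.
static NSArray *BCSketchFragments(NSString *sequence, NSArray *cutPositions)
{
    NSMutableArray *fragments = [NSMutableArray array];
    NSUInteger start = 0;
    for (NSNumber *n in cutPositions) {
        NSUInteger cut = [n unsignedIntegerValue];
        [fragments addObject:[sequence substringWithRange:NSMakeRange(start, cut - start)]];
        start = cut;
    }
    if (start < [sequence length])
        [fragments addObject:[sequence substringFromIndex:start]];
    return fragments;
}
```

For the sequence MKAAARPLK and the set "KR", the first function returns the positions 2 and 6, and the second returns MK, AAAR and PLK. Keeping the positions as a separate result is also what would let one class serve both as a map and as a digest.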
> The question is whether the mapping and digestion requires two
> separate classes, I think we can fuse them into one, as long as we
> provide sufficient methods to let them act in both ways.
Yes - see comment above.