[Biococoa-dev] BCSequence

Alexander Griekspoor mek at mekentosj.com
Wed Aug 25 18:06:05 EDT 2004


>> Let's start a little new discussion then ;-)
>> In principle these modifications can be seen as features right? So 
>> now we have three names/kinds around in two pairs:
>> - Modifications (example: methylgroup, phosphate etc)
>> - Features (example: alpha-helix, nuclear localization signal etc)
>> Or
>> - Features (example: methylgroup, phosphate etc)
>> - Annotations (example: alpha-helix, nuclear localization signal etc)
>
>
> I think we should treat the modifications as an array of BCSymbols.
Hmm, yes and no. Modifications are indeed a kind of symbol, so they 
could have BCSymbol as their superclass. But whereas symbols have no clue 
of their location (that's determined by the array they are in), 
modifications should be kept in a kind of dictionary, with the location 
as key for instance. In that case the hierarchy below would not make 
much sense. Alternatively we could adhere to your proposal, but that 
would mean that the modifications should become a real subclass of 
BCSymbol (BCModification, I suggest) and have methods to set/get their 
location.
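
To make the two options concrete, here is a rough sketch. BCModification 
is only the name suggested above; the properties and the two lookup 
helpers are assumptions made up for illustration, not existing BioCocoa 
code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical sketch: BCModification is only a proposed name in this thread,
// and the properties below are assumptions, not existing BioCocoa API.
@interface BCModification : NSObject
@property (nonatomic, copy) NSString *name;
@property (nonatomic, assign) NSUInteger location; // only needed for option B
@end

@implementation BCModification
@end

// Option A: the container owns the position (location -> modification).
static BCModification *BCModificationAtLocation(NSDictionary *modsByLocation,
                                                NSUInteger location) {
    NSNumber *key = [NSNumber numberWithUnsignedInteger:location];
    return [modsByLocation objectForKey:key];
}

// Option B: each modification carries its own position, so a plain array
// works, at the cost of a linear search and of keeping the stored locations
// in sync whenever the sequence is edited.
static BCModification *BCModificationInArrayAtLocation(NSArray *mods,
                                                       NSUInteger location) {
    for (BCModification *mod in mods) {
        if (mod.location == location) {
            return mod;
        }
    }
    return nil;
}
```

Either way the lookup by position is possible; the difference is mainly 
who is responsible for updating positions when symbols are inserted or 
removed.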

> We could even make a BCModificationsArray if we add an intermediate 
> class called BCSymbolArray as follows:
>
> BCSymbolArray
> |
> |
> ------BCSequence
> |
> |
> ------BCModificationsArray
>
>
> The BCSymbolArray can actually take care of some of the code that is 
> currently in BCSequence.

I was just wondering how you envision features in this setup, then. Your 
setup groups modifications and symbols together, with features as 
something else. Conceptually that's a good idea. I first had the 
idea to group modifications and features together, more distant 
from symbols. That perhaps has more advantages programming-wise: 
for both of these we have to keep track of locations and 
synchronization, so having them as subclasses of one superclass would 
probably prevent a lot of duplication. I guess there's plenty to say for 
both options here.

> If we calculate the mass of a molecule, we can just iterate over the 
> BCModificationsArray to add the masses of the modifications.
Whichever we choose, that's indeed the idea.
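
Roughly like this, as a sketch only: the function name and the use of 
plain NSNumber arrays are assumptions to keep the example small. In 
BioCocoa the values would come from the symbol and modification objects 
themselves.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: sums per-symbol masses and, optionally, the mass
// deltas of the modifications. Both arrays hold NSNumber values here.
static double BCTotalMass(NSArray *symbolMasses, NSArray *modificationDeltas,
                          BOOL addModifications) {
    double total = 0.0;
    for (NSNumber *mass in symbolMasses) {
        total += [mass doubleValue];
    }
    if (addModifications) {
        // A delta may be negative when a modification removes more mass
        // than it adds; plain summing handles that without special cases.
        for (NSNumber *delta in modificationDeltas) {
            total += [delta doubleValue];
        }
    }
    return total;
}
```

With made-up symbol masses of 100 and 50 and modification deltas of +10 
and -2.5, this gives 157.5 with modifications and 150 without.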

> For me features and annotations relate more to secondary structures 
> and author's comments (as found in a swissprot or ncbi file). But 
> that's just a name game.
I see your point. Indeed, when the NCBI file lists phosphorylation it 
means that a particular sequence is annotated as a (potential) 
phosphorylation SITE, not as being actually phosphorylated. This is 
where I made the mistake. In that respect you're right: the site is an 
annotation, an actual phospho-group on an amino acid is a modification.

>> Modifications and features are very alike, and a modification could 
>> be seen as a special feature and thus a subclass (inheriting the 
>> add/removing/editing/syncing etc methods, but add weight, pi etc ). 
>> Also the question arises whether we should keep them in two arrays 
>> (features and modifications) or in one (features). If you display all 
>> features of a sequence, it perhaps would be nice to see the 
>> modifications as well.
>
> I think we should keep them separate. Modifications are per BCSymbol, 
> features can span a whole range of BCSymbols.
Yep

> Also maybe we should move the mass calculations into a separate class 
> that accepts a sequence to calculate the mass? The same for features, 
> pI, etc.
>
>
> For example we have the class MassCalculator with the following methods
>
> - (id)initWithSequence:(BCSequence *)seq;
> - (id)initWithSubSequence:(BCSequence *)seq 
> inRange:(NSRange)aRange;
> - (id)initWithString:(NSString *)seq;
> - (id)initWithSubString:(NSString *)seq 
> inRange:(NSRange)aRange;
>
> - (float)getMassUsingMassType:(BCMassType)type 
> addModifications:(BOOL)mods;
>
> The getMass method iterates over all symbols and adds the mass, just 
> as we do now in the molecularWeight method.
>
>
> Then we use it as follows:
>
> MassCalculator *calculator = [[MassCalculator alloc] 
> initWithSequence:mySequence];
>
> float totalMass = [calculator getMassUsingMassType:BCAverage 
> addModifications:YES];
>
> [calculator release];

I like the idea, looks very nice! The only thing I doubt is whether we 
should implement a string version of all methods as well. First of all, 
the implementation will be completely different (it won't support 
modifications for instance; at least I would certainly not advise 
implementing string-compatible ways to keep track of modifications). 
Second, if we keep all methods string-compatible, why bother using 
sequences at all? Again, we should simply force people to see strings as 
a first- or last-step conversion only; in between it's BCSequence only. 
Other than that, it looks very promising, Koen!

> <nitpicking mode on>
> I prefer to use the word 'mass' instead of 'weight'. See eg the 
> description in <http://en.wikipedia.org/wiki/Weight>. If we want to 
> keep a method molecularWeight around that's fine with me, we could 
> just have it return the result of getMass using the averageMass type, 
> which is the same value.
> <nitpicking mode off>

Hey, you're the mass spec guy, who are we to do it otherwise? ;-)
Mass is perfectly fine by me. Keep the molecularWeight around and just 
make it a convenience method indeed.
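
Something along these lines, as a sketch under assumptions: the exact 
selector spelling, the BCMonoisotopic value and the placeholder return 
values are made up for illustration. Only MassCalculator, BCSequence, 
BCMassType, BCAverage and molecularWeight come from this thread.

```objc
#import <Foundation/Foundation.h>

// BCAverage is named in the thread; BCMonoisotopic is an assumed second value.
typedef enum { BCMonoisotopic, BCAverage } BCMassType;

@class BCSequence;

@interface MassCalculator : NSObject
- (id)initWithSequence:(BCSequence *)seq;
- (float)getMassUsingMassType:(BCMassType)type addModifications:(BOOL)mods;
@end

@interface BCSequence : NSObject
- (float)molecularWeight;
@end

@implementation MassCalculator {
    BCSequence *_sequence; // the sequence whose mass is calculated
}

- (id)initWithSequence:(BCSequence *)seq {
    self = [super init];
    if (self) {
        _sequence = seq;
    }
    return self;
}

- (float)getMassUsingMassType:(BCMassType)type addModifications:(BOOL)mods {
    // Placeholder numbers only; a real version would iterate over the
    // symbols (and modifications, if requested) of _sequence.
    return (type == BCAverage) ? 2.0f : 1.0f;
}
@end

@implementation BCSequence
// Convenience wrapper: same value as the average mass, modifications included.
// Memory management is left out of this sketch (assumes ARC); with manual
// retain/release the calculator would need a release before returning.
- (float)molecularWeight {
    MassCalculator *calculator = [[MassCalculator alloc] initWithSequence:self];
    return [calculator getMassUsingMassType:BCAverage addModifications:YES];
}
@end
```

That way existing callers of molecularWeight keep working, while all the 
actual mass logic lives in one place.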

>> 1 The molecular weight method should take modifications into account. 
>> A methylgroup adds weight. Thus modifications should have an 
>> addedWeight: mode: method that can accept negative values as well 
>> (if the modification removes more weight than it adds).
>
> Just put a negative value in the plist, and it will subtract it when 
> summing all modifications.
That's what I meant, I just wanted to remind you that you can have 
subtraction as well...

>> I think that's a pretty easy choice, as we don't know of anyone using 
>> BioCocoa at the moment, I think we should just start from scratch.
>
> I agree, but I think it would be fair to Peter to let him have his 
> say as well. He started BioCocoa and the IO classes are all his code 
> and I don't want to throw that away :)

You're right, perhaps I was a bit fast here, sorry for that. But given 
Peter's reaction the last time we had a similar question, I thought I'd 
be on the safe side... Guess Peter will let us know ;-)

*********************************************************
                     ** Alexander Griekspoor **
*********************************************************
               The Netherlands Cancer Institute
               Department of Tumorbiology (H4)
          Plesmanlaan 121, 1066 CX, Amsterdam
                    Tel:  + 31 20 - 512 2023
                   Fax:  + 31 20 - 512 2029
                   AIM: mekentosj at mac.com
                  E-mail: a.griekspoor at nki.nl
               Web: http://www.mekentosj.com

               4Peaks - For Peaks, Four Peaks.
        2004 Winner of the Apple Design Awards
                Best Mac OS X Student Product
              http://www.mekentosj.com/4peaks

*********************************************************



