[Biococoa-dev] more ramblings
Koen van der Drift
kvddrift at earthlink.net
Fri Dec 3 15:23:33 EST 2004
On Dec 3, 2004, at 11:47 AM, Alexander Griekspoor wrote:
>> Well let me add some points here. Although I never liked the idea of
>> subclassing BCSequence, I think you guys are right that if we use
>> one-liners it is better to call them from the appropriate subclass.
>> But I still like the idea of having the wrapper test the sequence type
>> first before continuing. It might be nonsense to do that, but the
>> result won't be - because there are no results. Just returning nil
>> will be sufficient, no need to start throwing exceptions around ;)
>
> True, the question is how we organize the wrapper (you mean the
> BCAnnotatedSequence right, or whatever we decide to name it).
No, I meant the general wrappers that do something with a sequence
(translate, pI, search, etc).
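
To illustrate the "test the type first, return nil" idea for such a wrapper, here is a minimal sketch in Python rather than Objective-C. The Sequence class, its kind field and the translate function are hypothetical stand-ins, not BioCocoa API; only the standard genetic code table is a fixed fact.

```python
from itertools import product
from typing import Optional

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


class Sequence:
    """Hypothetical stand-in for a BCSequence carrying a type tag."""

    def __init__(self, symbols: str, kind: str):
        self.symbols = symbols.upper()
        self.kind = kind  # assumed values: "dna" or "protein"


def translate(seq: Sequence) -> Optional[str]:
    """Wrapper that checks the sequence type before doing any work."""
    if seq.kind != "dna":
        return None  # nonsense request: no result, no exception
    s = seq.symbols
    # Translate complete codons only; unknown codons become "X".
    return "".join(CODONS.get(s[i:i + 3], "X") for i in range(0, len(s) - 2, 3))
```

A caller only has to check for None: translate(Sequence("ATGGCC", "dna")) gives "MA", while passing a protein sequence gives None.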
> There are basically two choices either let the developer separate the
> sequence from the annotations/features part and do all manipulations
> purely on the BCSequence, or make all methods accept besides
> BCSequences also the BCAnnotatedSequences. This latter his some clear
> advantages (such as the possibility to take features into account
> while calculating the MW for example), but it also has some clear
> problems. One thing it would mean is that we are almost forced to have
> also three types of BCAnnotatedSequence subclasses around (Koen might
> remark the benefit of a single BCSequence class here probably).
LOL - actually, yes I would ;). But I would suggest the following.
BCSequence *only* takes care of managing the symbol list, more or less
like the SymbolList class they have in BioJava. Then we have
BCAnnotatedSequence as a subclass of BCSequence. So now we have a
symbollist + all the additional info that makes it a real molecule.
Then, only for convenience, we subclass BCAnnotatedSequence to
BCSequenceDNA, BCSequenceProtein, etc.
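
The layering described here could look roughly like the following, sketched in Python rather than Objective-C. The class names come from this thread; the attributes (symbols, annotations, features, alphabet) are assumptions made only for illustration.

```python
class BCSequence:
    """Only manages the symbol list, nothing else."""

    def __init__(self, symbols):
        self.symbols = list(symbols)

    def __len__(self):
        return len(self.symbols)


class BCAnnotatedSequence(BCSequence):
    """Symbol list plus the extra info that makes it a real molecule."""

    def __init__(self, symbols, annotations=None, features=None):
        super().__init__(symbols)
        self.annotations = dict(annotations or {})  # e.g. creator, date
        self.features = list(features or [])        # range-bound features


class BCSequenceDNA(BCAnnotatedSequence):
    """Convenience subclass; the alphabet attribute is hypothetical."""
    alphabet = frozenset("ACGT")


class BCSequenceProtein(BCAnnotatedSequence):
    """Convenience subclass; the alphabet attribute is hypothetical."""
    alphabet = frozenset("ACDEFGHIKLMNPQRSTVWY")
```

With this layout, any method written against BCSequence also accepts the annotated and typed subclasses.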
> Notes and annotations like creator, date etc are easy, they don't
> change (and are what I would call a BCAnnotation). Features (BCFeature
> objects) are much more of a problem, they are coupled to sequence
> ranges (i.e. a helix from aminoacid 10 to 15), and should be kept in
> sync while editing the sequence. The big problem here is, what
> architecture would be the smartest way of doing this. Any suggestions?
The BioPerl docs I mentioned recently use a separate Location object. I
need to look more closely at it, to see how useful it is. One thing we
have to watch for is that features need to have a 1-based numbering,
not 0-based as we have so far. One possibility could be to couple
features with individual BCSymbols. So we tell a BCSymbol that a
feature XX starts there. However, what happens if in the example you
mentioned above (helix from aminoacid 10 to 15), the user edits the
sequence and removes AA 8-12? Then the startpoint of the feature is
gone. So, I guess that might not be a good solution, although this
problem (if any) will also manifest itself with other solutions.
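
For comparison, here is a minimal sketch of the range-based alternative, in Python rather than Objective-C. The Feature class and the after_deletion function are hypothetical names, and truncating a partly deleted feature is just one possible policy, not a decision made in this thread. Positions are 1-based and inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def after_deletion(f: Feature, del_start: int, del_end: int) -> Optional[Feature]:
    """Adjust a feature after residues del_start..del_end are removed.

    Returns None when the whole feature was deleted.
    """
    removed = del_end - del_start + 1
    if del_end < f.start:
        # Deletion lies entirely before the feature: shift it left.
        return Feature(f.name, f.start - removed, f.end - removed)
    if del_start > f.end:
        # Deletion lies entirely after the feature: nothing changes.
        return f
    # Deletion overlaps the feature: keep only the surviving residues.
    lost = min(f.end, del_end) - max(f.start, del_start) + 1
    new_len = (f.end - f.start + 1) - lost
    if new_len == 0:
        return None
    new_start = min(f.start, del_start)
    return Feature(f.name, new_start, new_start + new_len - 1)
```

For the helix example, after_deletion(Feature("helix", 10, 15), 8, 12) yields a helix from 8 to 10: the original residues 13 to 15 survive and move up by five positions.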
>>
>> However, I don't like the idea that was suggested in another recent
>> mail, to also make subclasses for DNAStrict, proteinstrict, etc.
> I copy that, definitely not, but the general BCSequence class could
> have a simple strict boolean that can be set.
For what?
> Also, we can introduce the strict BCSequenceTypes for passing as
> arguments...
Sounds good.
BTW, what's the difference between 'strict', 'skippingnonbases' and
'unambiguous' ?
- Koen.