[Biococoa-dev] more ramblings

Fri Dec 3 15:23:33 EST 2004

On Dec 3, 2004, at 11:47 AM, Alexander Griekspoor wrote:

>> Well let me add some points here. Although I never liked the idea of 
>> subclassing BCSequence, I think you guys are right that if we use 
>> one-liners it is better to call them from the appropriate subclass. 
>> But I still like the idea of having the wrapper test the sequence type 
>> first before continuing. It might be nonsense to do that, but the 
>> result won't be - because there are no results.  Just returning nil 
>> will be sufficient, no need to start throwing exceptions around ;)
>
> True, the question is how we organize the wrapper (you mean the 
> BCAnnotatedSequence right, or whatever we decide to name it).

No, I meant the general wrappers that do something with a sequence 
(translate, pI, search, etc).
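
As a minimal sketch of such a wrapper (written in Python for brevity, not actual BioCocoa code): it tests the sequence type first and returns nothing instead of raising an exception. The SequenceType enum, the translate function and the tiny codon table are all hypothetical, invented only for this illustration.

```python
from enum import Enum


class SequenceType(Enum):
    # Hypothetical type tags; BioCocoa's real constants may differ.
    DNA = "dna"
    RNA = "rna"
    PROTEIN = "protein"


# Deliberately tiny, incomplete codon table, for illustration only.
CODONS = {"ATG": "M", "GCC": "A", "TGG": "W", "TAA": "*"}


def translate(symbols, seq_type):
    """Return the translated string, or None when translation makes no sense."""
    if seq_type is not SequenceType.DNA:
        # Wrong sequence type: there is no result, so return None
        # rather than throwing an exception.
        return None
    # Walk the sequence in complete codons, ignoring a trailing partial one.
    codons = (symbols[i:i + 3] for i in range(0, len(symbols) - 2, 3))
    # Codons missing from the toy table are reported as "X".
    return "".join(CODONS.get(codon, "X") for codon in codons)
```

The caller then only has to check for a missing result, e.g. `if translate(seq, kind) is None: ...`.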

> There are basically two choices either let the developer separate the 
> sequence from the annotations/features part and do all manipulations 
> purely on the BCSequence, or make all methods accept besides 
> BCSequences also the BCAnnotatedSequences. The latter has some clear 
> advantages (such as the possibility to take features into account 
> while calculating the MW for example), but it also has some clear 
> problems. One thing it would mean is that we are almost forced to have 
> also three types of BCAnnotatedSequence subclasses around (Koen might 
> remark the benefit of a single BCSequence class here probably).

LOL - actually, yes I would ;). But I would suggest the following. 
BCSequence *only* takes care of managing the symbol list, more or less 
like the SymbolList class they have in BioJava. Then we have 
BCAnnotatedSequence as a subclass of BCSequence. So now we have a 
symbollist + all the additional info that makes it a real molecule. 
Then, only for convenience, we subclass BCAnnotatedSequence to 
BCSequenceDNA, BCSequenceProtein, etc.
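
The layering could be sketched roughly as follows (in Python for brevity, not as real BioCocoa code). The class names are the ones used above; the attribute names (symbols, annotations, features) and constructor arguments are assumptions made only for the illustration.

```python
class BCSequence:
    """Only manages the symbol list, nothing else."""

    def __init__(self, symbols):
        self.symbols = list(symbols)

    def __len__(self):
        return len(self.symbols)


class BCAnnotatedSequence(BCSequence):
    """Symbol list plus the additional info that makes it a real molecule."""

    def __init__(self, symbols, annotations=None, features=None):
        super().__init__(symbols)
        # Assumed storage: a dict for notes such as creator or date,
        # and a list for range-based features.
        self.annotations = dict(annotations or {})
        self.features = list(features or [])


class BCSequenceDNA(BCAnnotatedSequence):
    """Convenience subclass; DNA-specific one-liners would live here."""


class BCSequenceProtein(BCAnnotatedSequence):
    """Convenience subclass; protein-specific one-liners would live here."""
```

With this arrangement, code that only needs the symbols can accept any BCSequence, while the typed subclasses carry the annotations along for free.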

> Notes and annotations like creator, date etc are easy, they don't 
> change (and are what I would call a BCAnnotation). Features (BCFeature 
> objects) are much more of a problem, they are coupled to sequence 
> ranges (e.g. a helix from amino acid 10 to 15), and should be kept in 
> sync while editing the sequence. The big problem here is, what 
> architecture would be the smartest way of doing this. Any suggestions?

The BioPerl docs I mentioned recently use a separate Location object. I 
need to look more closely at it, to see how useful it is. One thing we 
have to watch for is that features need to have a 1-based numbering, 
not 0-based as we have so far. One possibility could be to couple 
features with individual BCSymbols. So we tell a BCSymbol that a 
feature XX starts there. However, what happens if in the example you 
mentioned above (helix from amino acid 10 to 15), the user edits the 
sequence and removes AA 8-12? Then the startpoint of the feature is 
gone. So, I guess that might not be a good solution, although this 
problem (if any) will also manifest itself with other solutions.
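
To make the editing problem concrete, here is a minimal sketch (Python, not BioCocoa code) of one possible policy: keep each feature as a 1-based, inclusive range and truncate or shift it when a stretch of the sequence is deleted. The function name and the truncation policy are assumptions chosen for illustration; other policies (dropping the feature, or splitting it) are just as defensible.

```python
def adjust_feature(start, end, del_start, del_end):
    """Return the feature's new (start, end) after a deletion, or None.

    All positions are 1-based and inclusive. Assumed policy: a feature that
    partly overlaps the deleted stretch is truncated to what survives.
    """
    removed = del_end - del_start + 1
    if end < del_start:
        # Feature lies entirely before the deletion: unchanged.
        return (start, end)
    if start > del_end:
        # Feature lies entirely after the deletion: shift left.
        return (start - removed, end - removed)
    if del_start <= start and end <= del_end:
        # Feature is completely deleted.
        return None
    # Partial overlap: keep whatever part of the feature survives.
    new_start = start if start < del_start else del_start
    new_end = end - removed if end > del_end else del_start - 1
    return (new_start, new_end)
```

For the helix at 10 to 15 with amino acids 8-12 removed, this policy keeps the three surviving residues (formerly 13 to 15) as a feature at 8 to 10, so the lost start point is replaced by the first surviving position.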

>>
>> However, I don't like the idea that was suggested in another recent 
>> mail, to also make subclasses for DNAStrict, proteinstrict, etc.
> I copy that, definitely not, but the general BCSequence class could 
> have a simple strict boolean that can be set.

For what?

> Also, we can introduce the strict BCSequenceTypes for passing as 
> arguments...

Sounds good.

BTW, what's the difference between 'strict', 'skippingnonbases' and 
'unambiguous' ?

- Koen.