[Biococoa-dev] more ramblings
Alexander Griekspoor
mek at mekentosj.com
Fri Dec 3 16:27:23 EST 2004
Koen,
On 3 Dec 2004, at 21:23, Koen van der Drift wrote:
>
> On Dec 3, 2004, at 11:47 AM, Alexander Griekspoor wrote:
>
>>> Well let me add some points here. Although I never liked the idea of
>>> subclassing BCSequence, I think you guys are right that if we use
>>> one-liners it is better to call them from the appropriate subclass.
>>> But I still like the idea of having the wrapper test the sequence type
>>> first before continuing. It might be nonsense to do that, but the
>>> result won't be - because there are no results. Just returning nil
>>> will be sufficient, no need to start throwing exceptions around ;)
>>
>> True, the question is how we organize the wrapper (you mean the
>> BCAnnotatedSequence right, or whatever we decide to name it).
>
> No, I meant the general wrappers that do something with a sequence
> (translate, pI, search, etc).
Ok, I get it: in general you want those classes to be able to handle a
general BCSequence object as well, and not only a specific subclass per
se.
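
As a rough sketch of what such a wrapper could look like (Python only
for brevity; the Sequence class, the type tags and reverse_complement
are hypothetical stand-ins, not existing BioCocoa API): the wrapper
accepts any sequence, checks the type first, and simply returns nothing
when the operation makes no sense for that type.

```python
from typing import Optional

# Hypothetical type tags, standing in for whatever BCSequenceType ends up being.
DNA, PROTEIN = "dna", "protein"


class Sequence:
    """Stand-in for a general BCSequence: symbols plus a type tag."""

    def __init__(self, symbols: str, seq_type: str):
        self.symbols = symbols
        self.seq_type = seq_type


COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: Sequence) -> Optional[Sequence]:
    """Return the reverse complement, or None if seq is not DNA."""
    if seq.seq_type != DNA:
        # No result for a nonsensical request; no exception is raised.
        return None
    # Symbols without a known complement become "N" (an assumption of this sketch).
    flipped = "".join(COMPLEMENT.get(s, "N") for s in reversed(seq.symbols))
    return Sequence(flipped, DNA)
```

The caller then only has to test the result for None (nil in our case)
before using it.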
>
>
>> There are basically two choices: either let the developer separate the
>> sequence from the annotations/features part and do all manipulations
>> purely on the BCSequence, or make all methods accept
>> BCAnnotatedSequences as well as BCSequences. The latter has some clear
>> advantages (such as the possibility to take features into account
>> while calculating the MW for example), but it also has some clear
>> problems. One thing it would mean is that we are almost forced to
>> have also three types of BCAnnotatedSequence subclasses around (Koen
>> might remark the benefit of a single BCSequence class here probably).
>
> LOL - actually, yes I would ;). But I would suggest the following.
> BCSequence *only* takes care of managing the symbol list, more or less
> like the SymbolList class they have in BioJava. Then we have
> BCAnnotatedSequence as a subclass of BCSequence. So now we have a
> symbollist + all the additional info that makes it a real molecule.
> Then, only for convenience, we subclass BCAnnotatedSequence to
> BCSequenceDNA, BCSequenceProtein, etc.
Hmm, not sure; it feels like the layer at which we then subclass is the
wrong one. But that might also be the only problem.
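
To make the proposed layering concrete, here is a minimal sketch of it
(Python only for brevity; the class names come from the proposal above,
but every method and attribute shown is an assumption, not actual
BioCocoa code):

```python
class BCSequence:
    """Only manages the symbol list."""

    def __init__(self, symbols):
        self.symbols = list(symbols)

    def __len__(self):
        return len(self.symbols)


class BCAnnotatedSequence(BCSequence):
    """Symbol list plus the extra info that makes it a real molecule."""

    def __init__(self, symbols, annotations=None, features=None):
        super().__init__(symbols)
        self.annotations = dict(annotations or {})  # e.g. creator, date
        self.features = list(features or [])        # range-bound features


class BCSequenceDNA(BCAnnotatedSequence):
    """Convenience subclass for DNA."""


class BCSequenceProtein(BCAnnotatedSequence):
    """Convenience subclass for proteins."""
```

Written out like this, the type-specific subclasses sit below the
annotation layer, which is exactly the part I'm unsure about.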
>
>> Notes and annotations like creator, date etc are easy, they don't
>> change (and are what I would call a BCAnnotation). Features
>> (BCFeature objects) are much more of a problem, they are coupled to
>> sequence ranges (i.e. a helix from amino acid 10 to 15), and should be
>> kept in sync while editing the sequence. The big problem here is,
>> what architecture would be the smartest way of doing this. Any
>> suggestions?
>
> The BioPerl docs I mentioned recently use a separate Location object.
> I need to look more closely at it, to see how useful it is. One thing
> we have to watch for is that features need to have a 1-based
> numbering, not 0-based as we have so far. One possibility could be to
> couple features with individual BCSymbols. So we tell a BCSymbol that
> a feature XX starts there. However, what happens if in the example you
> mentioned above (helix from amino acid 10 to 15), the user edits the
> sequence and removes AA 8-12? Then the startpoint of the feature is
> gone. So, I guess that might not be a good solution, although this
> problem (if any) will also manifest itself with other solutions.
Exactly, what we have to emulate is an attributed string, that handles
exactly the same problem(s). I think in general we don't need a
location object, we need a range object and I don't see why NSRange
wouldn't be good enough (even if our system is 1-based).
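
A small sketch of the bookkeeping this implies, using the helix example
from above (Python only for brevity; range_after_deletion and
to_one_based are hypothetical helpers, and the rule of shrinking a
partly deleted feature is one possible policy, not a decision):

```python
from typing import Optional, Tuple

# (location, length), 0-based, in the style of NSRange.
Range = Tuple[int, int]


def range_after_deletion(feature: Range, deleted: Range) -> Optional[Range]:
    """Return the feature's range after `deleted` is cut from the sequence.

    Returns None when the whole feature was deleted.
    """
    f_start, f_len = feature
    d_start, d_len = deleted
    f_end, d_end = f_start + f_len, d_start + d_len  # exclusive ends

    # Feature positions that fall inside the deleted stretch.
    overlap = max(0, min(f_end, d_end) - max(f_start, d_start))
    new_len = f_len - overlap
    if new_len == 0:
        return None

    # Deleted positions in front of the feature shift it to the left.
    removed_before = max(0, min(d_end, f_start) - d_start)
    return (f_start - removed_before, new_len)


def to_one_based(r: Range) -> Tuple[int, int]:
    """Convert a 0-based (location, length) to 1-based inclusive (first, last)."""
    location, length = r
    return (location + 1, location + length)


# Helix at residues 10-15 (1-based) is (9, 6); deleting residues 8-12 is (7, 5).
helix = range_after_deletion((9, 6), (7, 5))
```

Here the helix ends up as (7, 3), i.e. residues 8-10 in 1-based
numbering: the start point is not lost, the feature just shrinks and
shifts. The 1-based numbering only matters at the conversion step.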
>
>>>
>>> However, I don't like the idea that was suggested in another recent
>>> mail, to also make subclasses for DNAStrict, proteinstrict, etc.
>> I copy that, definitely not, but the general BCSequence class could
>> have a simple strict boolean that can be set.
>
> For what?
For preserving the knowledge that a sequence uses a strict symbol set
(the other option would be to have a symbolset property inside the
BCSequence object).
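
Roughly, the second option could look like this (Python only for
brevity; the Sequence class and the constant names are hypothetical,
and silently skipping symbols outside the set is an assumption of this
sketch). The strict flag then no longer needs to be stored, it follows
from the symbol set:

```python
# Unambiguous DNA bases, and the same set extended with IUPAC ambiguity codes.
DNA_STRICT = frozenset("ACGT")
DNA_AMBIGUOUS = DNA_STRICT | frozenset("RYSWKMBDHVN")


class Sequence:
    """Stand-in for BCSequence carrying its own symbol set."""

    def __init__(self, symbols, symbol_set=DNA_AMBIGUOUS):
        self.symbol_set = symbol_set
        # Symbols outside the set are skipped rather than raising an error.
        self.symbols = "".join(s for s in symbols.upper() if s in symbol_set)

    @property
    def strict(self):
        """True when only unambiguous symbols are allowed."""
        return self.symbol_set <= DNA_STRICT
```

With a plain boolean we would have to keep the flag and the actual
contents in agreement ourselves; with a symbol set the object can answer
the question on its own.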
>
>> Also, we can introduce the strict BCSequenceTypes for passing as
>> arguments...
>
> Sounds good.
>
> BTW, what's the difference between 'strict', 'skippingnonbases' and
> 'unambiguous' ?
Basically they're the same thing, and yes, we should rename them to be
similar I think...
Cheers,
Alex
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
Windows vs Mac
65 million years ago, there were more
dinosaurs than humans.
Where are the dinosaurs now?
*********************************************************