[Biococoa-dev] a new design to please everybody

Charles PARNOT charles.parnot at stanford.edu
Sun Jan 9 02:47:27 EST 2005


>At 2:20 PM +0100 1/8/05, Alexander Griekspoor wrote:
>>Again, the reason why I came up with the idea of some public headers for placeholder classes for typed sequences was to propose the user BOTH OPTIONS! (but maybe we should not).
>That's perhaps the most important question we have to answer first indeed.

This indeed seems to be the problem raised by Alex, Peter and John. There is a sense that strongly typed sequences are wanted, and maybe even needed. I stated somewhere that a design decision should also be guided by what the user of the framework wants. And I also realize that several of these users will be you (and maybe the only ones for a while), so we should try to please ourselves, as potential users, too. As both Alex and John clearly would not feel comfortable working without strongly typed sequence classes, it would be bad to prevent their use.

On the other hand, a good-for-all BCSequence object that will blindly respond to almost any request (to some variable extent) is wanted, or at least seen as a good thing, by several of us too (think WebView): probably Alex, Koen, Peter and me. There is also some concern, which I share, that this approach may have its limits. And then there is a debate over what kind of response would be appropriate for irrelevant messages sent to such a generic object: runtime error, return nil, BCError, empty object (or self if appropriate)?

Now, a little bit of history (already!) on the recent 'class cluster' discussion, viewed from my point of view:
* It was triggered on my side by the feeling that the current code was getting a bit schizophrenic. It currently allows instantiating, via BCSequenceFactory, a weakly-typed object that will respond to the methods in BCSequence.h (at least from the compiler's point of view). If this list of methods is very restrictive (only methods relevant to all types, no -complement, no -hydrophobicity,...), then you get a quite useless object. If this list of methods is large, then you have a problem: due to inheritance, the compiler will assume all the subclasses can respond to all the messages; now the typed subclasses are useless, because the compiler thinks they can respond to anything (hence no compiler warnings);
* then I (somewhat stupidly) assumed that the latter case was the one favored by the current design, i.e. a one-for-all class; this is when I thought of the class cluster being a better design in this context; I still think it is for such a one-for-all class, because the sequence types are still different enough that they deserve their own class;
* at the same time, to please some (yet virtual) users willing to stick to strong typing, I came up with the idea of an additional set of placeholder classes with restricted sets of methods in their headers; in the context of a class cluster, other ideas are possible; and actually, these ideas may apply to the current design too; I was thinking this could be added later anyway to please these not-yet-existing users;
* then I realized yesterday that such users actually existed, and I could even see their point; so now my opinion is that we should indeed give BOTH options to the user, which will please all of us (see above why).


Going back one step, forgetting the class cluster idea for a minute, looking at the current design, and trying to think of what could be done to offer both options, a new idea came up, which looks very simple, easy to grasp for the developer and the user, and with minimal code duplication. Well, OK, maybe I exaggerate a bit; I should let you decide for yourselves how good it is, and what pitfalls I am missing or subconsciously hiding.

Here it is.

We keep mostly the same implementation as now. The superclass BCSequence is public (but abstract, see why below), and all the current subclasses are concrete and public (BCSequenceDNA, BCSequenceRNA,...). The superclass handles as much as possible of the code that can be factored out (including annotations, though an intermediate subclass is also possible, see earlier discussions). The subclasses step in when necessary to replace the superclass methods (for optimizations, specific handling,...). Note that ALL methods should return something, regardless of the relevance (e.g. BCSequenceProtein should return something in response to -complement, which can be done by the superclass anyway); you will see why below. This can be achieved by having the superclass implement ALL the methods, always returning something not too stupid (-complement actually is already quite smart and in the superclass).

Now about the headers. They are all public, because the classes are all public. We only keep in the superclass BCSequence the methods that apply to all subclasses, i.e. the restrictive set of methods (no -complement, no -hydrophobicity,...). We add the appropriate methods in the appropriate subclasses (-complement in BCSequenceDNA, -hydrophobicity in BCSequenceProtein,...)

And THEN, we add another subclass, for example called BCSequenceGeneric. In the header of this subclass, we put all the methods. This will be for the user a concrete subclass with this one-for-all look and feel (hence 'generic'). And under the hood, this class is like a class cluster (ah! ah! the minute has elapsed; see above). At runtime, you don't get a BCSequenceGeneric instance, but an instance of one of the other subclasses, BCSequenceDNA, ... So no additional code is needed; it is already provided by the other classes. This new generic sequence can be used by the lovers of the one-for-all class, and will automatically benefit from the implementation of the other subclasses.

As a result, if you use the generic one-for-all class, you can call any method you want and always get something back, without the need to know what is going on (it is in the hands of the user of the final app). However, if you use a typed class, you get appropriate compiler warnings (no runtime error, though). Note that you should never use BCSequence directly (in theory you could, but you would not benefit from potential optimizations in the subclasses).
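
To make the runtime side of this proposal concrete, here is a minimal sketch. It is written in Python rather than Objective-C, purely for illustration, so it cannot show the compiler warnings that would come from the headers. It only shows the two runtime properties: the superclass answers every message with a harmless default, and asking for a BCSequenceGeneric actually hands back an instance of a typed subclass. The class names come from the proposal; the type-guessing rule, the default return values (empty sequence, 0.0) and the hydrophobicity scores are assumptions made up for the example.

```python
class BCSequence:
    """Abstract superclass: implements every method with a harmless default."""

    def __init__(self, symbols):
        self.symbols = symbols

    def length(self):
        return len(self.symbols)

    def complement(self):
        # Assumed default where the message is irrelevant:
        # an empty sequence of the receiver's own class.
        return type(self)("")

    def hydrophobicity(self):
        # Assumed default where the message is irrelevant.
        return 0.0


class BCSequenceDNA(BCSequence):
    _PAIRS = str.maketrans("ACGT", "TGCA")

    def complement(self):
        # DNA-specific override of the superclass default.
        return BCSequenceDNA(self.symbols.translate(self._PAIRS))


class BCSequenceProtein(BCSequence):
    # Illustrative per-residue scores only; unknown residues count as 0.0.
    _SCORES = {"A": 1.8, "G": -0.4, "K": -3.9, "L": 3.8}

    def hydrophobicity(self):
        if not self.symbols:
            return 0.0
        total = sum(self._SCORES.get(s, 0.0) for s in self.symbols)
        return total / len(self.symbols)


class BCSequenceGeneric(BCSequence):
    """One-for-all entry point: never instantiated itself."""

    def __new__(cls, symbols):
        # Hypothetical guessing rule: only A, C, G, T means DNA,
        # anything else is treated as protein.
        is_dna = set(symbols) <= set("ACGT")
        concrete = BCSequenceDNA if is_dna else BCSequenceProtein
        # The returned object is not a BCSequenceGeneric, so Python
        # does not call __init__ a second time.
        return concrete(symbols)


dna = BCSequenceGeneric("ATGC")
protein = BCSequenceGeneric("MKL")
print(type(dna).__name__, dna.complement().symbols)
print(type(protein).__name__, protein.complement().length())
```

Both objects accept -complement and -hydrophobicity without any runtime error: the DNA instance uses its own -complement, while the protein instance falls back on the superclass default. In the real framework, what a caller is allowed to send without a warning would be decided by which header (typed or generic) the variable is declared with.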

I am still not sure how to fit the mutable/immutable design in this, but it seems you can't avoid NSMutableSequenceDNA et al. if you are going to have some strong typing.

What do you think?

Charles

NB: that's all for today, it is bedtime

-- 
Charles Parnot
charles.parnot at stanford.edu

Help science go fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford/

Room  B157 in Beckman Center
279, Campus Drive
Stanford University
Stanford, CA 94305 (USA)

Tel +1 650 725 7754
Fax +1 650 725 8021


