symbol class and sequence class (was: [Biococoa-dev] Annotation)

Koen van der Drift kvddrift at earthlink.net
Mon Feb 21 19:49:51 EST 2005


On Feb 21, 2005, at 1:10 PM, Charles PARNOT wrote:

>>> Also, Koen, I have one question about the symbolSet: it seems that 
>>> all instances of one sequence type use the same symbol set. Is that 
>>> right?
>> No, each sequence should have its own symbolset, just like each 
>> sequence has its own symbolArray.
>
> Oops, I got confused. I actually meant: do you foresee that each 
> sequence class will most of the time use a constant BCSymbol subclass? 
> I am not sure what the BCSymbolSet will be used for.

The BCSymbolSet was introduced to serve a function similar to that of 
the alphabets in BioPerl and BioJava. It can be used instead of, or 
alongside, BCSequenceType. Basically, a sequence has a symbol set that 
defines which BCSymbols are allowed for that particular type of 
sequence. In most cases just the standard symbol sets will be used, but 
they can be extended if needed, e.g. to incorporate user-defined 
symbols. One advantage is that we don't need to check every time 
whether a BCSymbol in a sequence is of the right type. If we use, e.g., 
the BCSymbolSet for DNA, then we know that an 'A' in that sequence must 
be adenine and not alanine. So far, only the BCSymbolSet class itself 
has been written; it has not yet been integrated into the rest of the 
framework. I hope it is clearer now?
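
To make the idea concrete, here is a minimal sketch of a symbol set as 
a lookup table from a one-letter code to a symbol. It is not the 
BioCocoa code: SketchSymbol, SketchSymbolSet and symbolForCode: are 
hypothetical names made up for this illustration, and the code assumes 
a current Objective-C compiler with ARC.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for a BCSymbol: a one-letter code plus a name.
@interface SketchSymbol : NSObject
@property (nonatomic, readonly) unichar code;
@property (nonatomic, readonly, copy) NSString *name;
- (instancetype)initWithCode:(unichar)code name:(NSString *)name;
@end

@implementation SketchSymbol
- (instancetype)initWithCode:(unichar)code name:(NSString *)name {
    self = [super init];
    if (self) {
        _code = code;
        _name = [name copy];
    }
    return self;
}
@end

// Hypothetical stand-in for a BCSymbolSet: the symbols allowed in one
// type of sequence, indexed by their one-letter code.
@interface SketchSymbolSet : NSObject
- (instancetype)initWithSymbols:(NSArray *)symbols;
// Returns nil when the code is not part of this symbol set.
- (SketchSymbol *)symbolForCode:(unichar)code;
@end

@implementation SketchSymbolSet {
    NSMutableDictionary *_symbolsByCode;
}
- (instancetype)initWithSymbols:(NSArray *)symbols {
    self = [super init];
    if (self) {
        _symbolsByCode = [NSMutableDictionary dictionary];
        for (SketchSymbol *symbol in symbols) {
            _symbolsByCode[@(symbol.code)] = symbol;
        }
    }
    return self;
}
- (SketchSymbol *)symbolForCode:(unichar)code {
    return _symbolsByCode[@(code)];
}
@end

int main(void) {
    @autoreleasepool {
        // Two tiny, incomplete symbol sets that share the letter 'A'.
        SketchSymbolSet *dna = [[SketchSymbolSet alloc] initWithSymbols:@[
            [[SketchSymbol alloc] initWithCode:'A' name:@"adenine"],
            [[SketchSymbol alloc] initWithCode:'C' name:@"cytosine"]]];
        SketchSymbolSet *protein = [[SketchSymbolSet alloc] initWithSymbols:@[
            [[SketchSymbol alloc] initWithCode:'A' name:@"alanine"]]];

        // The same letter resolves differently depending on the set.
        NSLog(@"DNA 'A': %@", [dna symbolForCode:'A'].name);
        NSLog(@"Protein 'A': %@", [protein symbolForCode:'A'].name);
        // A letter outside the set yields nil.
        NSLog(@"Protein 'C': %@", [protein symbolForCode:'C'].name);
    }
    return 0;
}
```

Because the sequence carries its symbol set, the ambiguity of 'A' is 
resolved once, by the set, instead of being checked for every symbol.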


>
> I am asking this because I am still annoyed by the fact that the 
> BCAbstractSequence class does not have a designated initializer. This 
> issue showed up as recently as yesterday, when Koen added the 
> annotations code. You set the annotations to nil in 
> 'initWithString...' but not in 'initWithSymbolArray'... and you would 
> not have had to if the latter were calling the former, in other words 
> if it were calling the 'designated initializer'. The problem is that 
> those two initializers are implemented separately and in parallel, and 
> are also reimplemented in all the subclasses, where the code is 
> repeated verbatim except for the BCSymbol subclass used.
>
> I suggest we do the following:
> * each subclass implements "+ (Class)defaultSymbolClass" to return the 
> BCSymbol subclass to use;

That is exactly the purpose for which we introduced the BCSymbolSet, so 
I don't think defaultSymbolClass is needed. What is needed is that I 
add the code that uses BCSymbolSet in the rest of the framework ;-)
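
As a rough sketch of how a designated initializer and a symbol set 
could fit together: every initializer funnels through one method, so 
state such as the annotations is set up in exactly one place. This is 
not the BCAbstractSequence code. SketchSequence and its method names 
are hypothetical, NSCharacterSet stands in for a BCSymbolSet, 
one-character NSStrings stand in for BCSymbols, and ARC is assumed.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical sequence class used only for this illustration.
@interface SketchSequence : NSObject
@property (nonatomic, readonly, copy) NSArray *symbolArray;
@property (nonatomic, readonly, strong) NSCharacterSet *symbolSet;
@property (nonatomic, readonly, strong) NSMutableDictionary *annotations;

// Designated initializer: the only place where instance state is set.
- (instancetype)initWithSymbolArray:(NSArray *)symbols
                          symbolSet:(NSCharacterSet *)symbolSet;
// Convenience initializer: parses the string, then calls the
// designated initializer.
- (instancetype)initWithString:(NSString *)string
                     symbolSet:(NSCharacterSet *)symbolSet;
@end

@implementation SketchSequence
- (instancetype)initWithSymbolArray:(NSArray *)symbols
                          symbolSet:(NSCharacterSet *)symbolSet {
    self = [super init];
    if (self) {
        _symbolArray = [symbols copy];
        _symbolSet = symbolSet;
        // Annotations are initialized here and nowhere else.
        _annotations = [NSMutableDictionary dictionary];
    }
    return self;
}

- (instancetype)initWithString:(NSString *)string
                     symbolSet:(NSCharacterSet *)symbolSet {
    NSMutableArray *symbols = [NSMutableArray array];
    for (NSUInteger i = 0; i < string.length; i++) {
        unichar c = [string characterAtIndex:i];
        // Characters outside the symbol set are skipped in this sketch.
        if ([symbolSet characterIsMember:c]) {
            [symbols addObject:[NSString stringWithCharacters:&c length:1]];
        }
    }
    return [self initWithSymbolArray:symbols symbolSet:symbolSet];
}
@end

int main(void) {
    @autoreleasepool {
        NSCharacterSet *dna =
            [NSCharacterSet characterSetWithCharactersInString:@"ACGT"];
        SketchSequence *seq =
            [[SketchSequence alloc] initWithString:@"ACGTXA" symbolSet:dna];
        NSLog(@"%lu symbols kept, annotations: %@",
              (unsigned long)seq.symbolArray.count, seq.annotations);
    }
    return 0;
}
```

With this layout a subclass would only have to supply a different 
symbol set; it would not need to repeat either initializer.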


- Koen.



