[Biococoa-dev] More on BCSymbolSets

Charles PARNOT charles.parnot at stanford.edu
Thu Mar 3 21:47:45 EST 2005


>I think it would be a good idea if we allow the user to pass a symbolset, defining the type of sequence. In fact you not only make a filter for whatever string or array is supplied to create the sequence, but you also have immediately an identifier of the sequence.

So we would need to provide an initializer with a symbolSet argument, e.g. 'initWithSymbolArray:symbolSet'. OK, we agree :-)
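To make the "filter" part concrete, here is a rough sketch in plain C of what such an initializer could do with the symbolSet it receives. This is not BioCocoa code: the SymbolSet struct and the filter_symbols function are names I made up for illustration, and a real BCSymbolSet would hold symbol objects rather than characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a BCSymbolSet: just the allowed characters. */
typedef struct {
    const char *allowed;
} SymbolSet;

/* Copy into `out` only the characters of `input` that belong to `set`.
 * `out` must have room for strlen(input) + 1 bytes.
 * Returns the number of symbols kept. */
static size_t filter_symbols(const SymbolSet *set, const char *input, char *out)
{
    size_t kept = 0;
    for (const char *p = input; *p != '\0'; ++p) {
        if (strchr(set->allowed, *p) != NULL) {
            out[kept++] = *p;   /* symbol is part of the set: keep it */
        }
    }
    out[kept] = '\0';
    return kept;
}
```

With a strict DNA set of "ACGT", the input "ACGTNX-AC" would be reduced to "ACGTAC". Whether symbols outside the set should be dropped silently or reported as an error is a separate design question.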

What do you mean by an identifier?


>>
>>About sequenceTypes:
>>
>>* Should we extend the number of sequence types to take into account the different symbol sets?
>>Proposed by Koen.
>
>I am not in favor of extending the number of sequence types. It was more a question based on the comments made by John. Actually, I would propose not to use the sequencetype at all, but only use symbolsets, since they also act as identifiers (see above).

It is still unclear to me how they would be identifiers of the sequence (as in a unique id?).
OK, we basically agree that the sequence type as it is now is not very useful, except as a shortcut for the sequence class.
I still think that a BCSequenceType has a use. A symbolSet should not be allowed to hold symbols of different types/classes, so a symbolSet would have a type. And a symbolSet should be allowed to be associated with a sequence only if it has the right type.
Instead of checking the class all the time, it is probably better to use an enum like BCSequenceType.
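
As a rough illustration of the enum idea, here is a sketch in plain C. Everything in it is an assumption made for the example: the enum constants (which borrow the BCSequenceTypeDNA naming suggested further down), the SymbolSet and Sequence structs, and the sequence_set_symbol_set function are not existing BioCocoa API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum; the constant names are illustrative only. */
typedef enum {
    BCSequenceTypeDNA,
    BCSequenceTypeRNA,
    BCSequenceTypeProtein
} BCSequenceType;

/* A symbol set holds symbols of one type only, so it carries that type. */
typedef struct {
    BCSequenceType type;
    const char *symbols;
} SymbolSet;

/* A sequence has a fixed type and, optionally, an attached symbol set. */
typedef struct {
    BCSequenceType type;
    const SymbolSet *symbolSet;   /* NULL until a set is attached */
} Sequence;

/* Attach `set` to `seq` only if the types match; a cheap enum comparison
 * replaces a class check. Returns true when the set was attached. */
static bool sequence_set_symbol_set(Sequence *seq, const SymbolSet *set)
{
    if (set == NULL || set->type != seq->type) {
        return false;             /* wrong type: leave the sequence unchanged */
    }
    seq->symbolSet = set;
    return true;
}
```

In this sketch, a DNA sequence would refuse a protein symbol set but accept any DNA set, whether strict, ambiguous, or with gaps.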


>>* Will all instances of one given sequence class always have the same sequenceType? e.g. all instances of BCDNASequence will be of type 'BCDNASequence'.
>
>Probably not. A BCSequenceDNA can have ambiguous symbols, but can also be strict. It can allow for gaps in an alignment, etc. Assigning it a sequence type still doesn't tell anything about the possible symbols. Therefore a symbolset will be much more useful. Another thing that bugs me is that the sequence is BCSequenceDNA but the type is BCDNASequence. Very confusing :)

I agree that symbolSets will be different for each instance. But if we keep the sequenceType in addition to the symbolSet (for the reason above), it will always be the same for all instances of a class.
Regarding the naming conventions, BCSequenceDNA for the class, BCDNASequence for the type, it is indeed quite confusing; how about BCSequenceTypeDNA et al.?

charles

-- 
Help science go fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford/

Charles Parnot
charles.parnot at stanford.edu

Room  B157 in Beckman Center
279, Campus Drive
Stanford University
Stanford, CA 94305 (USA)

Tel +1 650 725 7754
Fax +1 650 725 8021


