[Biococoa-dev] More on BCSymbolSets
Koen van der Drift
kvddrift at earthlink.net
Thu Mar 3 21:56:05 EST 2005
On Mar 3, 2005, at 9:47 PM, Charles PARNOT wrote:
>
>> I think it would be a good idea if we allow the user to pass a
>> symbolset, defining the type of sequence. In fact you not only make a
>> filter for whatever string or array is supplied to create the
>> sequence, but you also have immediately an identifier of the
>> sequence.
>
> So we would need to provide an initializer with a symbolSet argument,
> e.g. 'initWithSymbolArray:symbolSet:'. OK, we agree :-)
>
> What do you mean by an identifier?
I mean the sequence type.
> OK, we basically agree that sequence type as it is now is not super
> useful, except as a shortcut for the sequence class.
> I still think that a BCSequenceType has a use. A symbolSet should not
> be allowed to hold symbols of different types/classes. So symbolSet
> would have a type.
This will be taken care of when the symbolset is created, see the
BCSymbol class. The dnaSymbolSet only holds nucleotides, the
proteinSymbolSet holds only amino acids.
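As a rough illustration of the idea (a Python sketch for brevity, not BioCocoa code; every name and the exact contents of each set are assumptions made for this example), a symbol set can both filter the input and identify the kind of sequence:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolSet:
    """Hypothetical stand-in for a BCSymbolSet: one symbol kind, fixed members."""
    kind: str             # e.g. "dna" or "protein"
    symbols: frozenset    # the one-letter codes this set accepts

    def filter(self, raw):
        # Keep only the symbols that belong to this set, preserving order.
        return [s for s in raw if s in self.symbols]


# Assumed contents, for illustration only.
DNA_STRICT = SymbolSet("dna", frozenset("ACGT"))
DNA_AMBIGUOUS = SymbolSet("dna", frozenset("ACGTRYKMSWBDHVN"))


class Sequence:
    """Hypothetical analogue of a sequence created with a symbol set."""

    def __init__(self, raw, symbol_set):
        self.symbol_set = symbol_set
        # The symbol set acts as the filter for the supplied string or array.
        self.symbols = symbol_set.filter(raw)

    @property
    def kind(self):
        # The symbol set doubles as the identifier of the sequence type.
        return self.symbol_set.kind
```

With these assumed sets, the same input yields different sequences depending on the set passed in: the strict set drops an 'N', the ambiguous one keeps it, and both report the kind "dna".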
> And a symbolSet should be allowed to be associated with a sequence
> only if it has the right type.
> Instead of checking the class all the time, it is probably better to
> use an enum like BCSequenceType.
This check won't happen that often, probably only during creation, so I
don't think checking the class instead of the sequenceType will cause
much slowdown.
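To make the point concrete, here is a minimal sketch of a class-based check that runs once, when the sequence is created (Python rather than Objective-C, and all class names are hypothetical, not the actual BioCocoa API):

```python
class Symbol:
    def __init__(self, code):
        self.code = code


class Nucleotide(Symbol):
    pass


class AminoAcid(Symbol):
    pass


class SymbolSet:
    """Hypothetical symbol set whose members all share one symbol class."""

    def __init__(self, symbol_class, codes):
        self.symbol_class = symbol_class
        self.symbols = {c: symbol_class(c) for c in codes}


class Sequence:
    symbol_class = Symbol  # subclasses narrow this down

    def __init__(self, text, symbol_set):
        # Single class check at creation time; nothing is re-checked later.
        if not issubclass(symbol_set.symbol_class, self.symbol_class):
            raise TypeError(
                f"{type(self).__name__} cannot use a set of "
                f"{symbol_set.symbol_class.__name__} symbols"
            )
        self.symbol_set = symbol_set
        self.symbols = [symbol_set.symbols[c] for c in text
                        if c in symbol_set.symbols]


class DNASequence(Sequence):
    symbol_class = Nucleotide
```

In this sketch, creating a DNASequence with a nucleotide set succeeds, while passing an amino acid set raises a TypeError. The cost is one comparison per created sequence, which is why a separate enum is unlikely to buy much speed here.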
>
>>> * Will all instances of one given sequence class always have the same
>>> sequenceType? e.g. all instances of BCDNASequence will be of type
>>> 'BCDNASequence'.
>>
>> Probably not. A BCSequenceDNA can have ambiguous symbols, but can
>> also be strict. It can allow for gaps in an alignment, etc.
>> Assigning it a sequence type still doesn't tell you anything about the
>> possible symbols. Therefore a symbolset will be much more useful.
>> Another thing that bugs me is that the sequence is BCSequenceDNA but
>> the type is BCDNASequence. Very confusing :)
>
> I agree that symbolSets will be different for each instance. But the
> sequenceType, if we keep it in addition to the symbolSet (for the
> reason above), will always be the same for all instances of a
> class.
> Regarding the naming conventions, BCSequenceDNA for the class,
> BCDNASequence for the type, it is indeed quite confusing; how about
> BCSequenceTypeDNA et al.?
*If* we decide to keep it, that would indeed be better, yes.
- Koen.