[Biococoa-dev] More on BCSymbolSets
Koen van der Drift
kvddrift at earthlink.net
Mon Feb 28 20:03:40 EST 2005
On Feb 28, 2005, at 6:34 PM, Charles PARNOT wrote:
> Anyway, the question at this point is: what do we want to do with
> symbolSet? If they are just a way to provide a refinement on the
> sequenceType, then we may not need a full class, but just an enum. And
> if we don't enforce the sequence contents to be consistent with the
> symbolSet, then it is useless.
The idea of the symbolSet originates from a similar approach in BioPerl
and BioJava, where they are called 'Alphabets'. Basically they can be
used when a user adds a 'T' to a sequence and wants to be sure it is a
thymidine in DNA or a threonine in a protein. See also
http://www.biojava.org/tutorials/chap1.html for more background.
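
To make the 'T' ambiguity concrete, here is a minimal sketch in Python. It
is not BioCocoa, BioPerl or BioJava code: the names DNA_ALPHABET,
PROTEIN_ALPHABET and resolve are made up for illustration, and the protein
table only lists a few residues.

```python
# Hypothetical alphabets: each maps a one-letter code to the symbol it denotes.
DNA_ALPHABET = {
    "A": "deoxyadenosine",
    "C": "deoxycytidine",
    "G": "deoxyguanosine",
    "T": "thymidine",
}

# Deliberately incomplete: just enough residues to show the clash on 'T'.
PROTEIN_ALPHABET = {
    "A": "alanine",
    "C": "cysteine",
    "G": "glycine",
    "T": "threonine",
}


def resolve(code, alphabet):
    """Return the symbol that a one-letter code denotes in the given alphabet."""
    return alphabet[code.upper()]


print(resolve("T", DNA_ALPHABET))      # thymidine
print(resolve("T", PROTEIN_ALPHABET))  # threonine
```

The same character resolves to a different symbol depending on which
alphabet the sequence carries.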
Although we all agree that the BioJava approach is cumbersome, I still
like the idea of using a symbolset to define which symbols are allowed
in a sequence. So it is not necessarily a sequence identifier, but more
a filter which defines which symbols are allowed in a specific type of
sequence. Another possible reason at that time was that the symbolset
could act as a sequence identifier, and thereby remove the need to
subclass BCSequence. But that idea was not much appreciated here ;-)
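
A rough sketch of the 'filter' idea, again in Python rather than
Objective-C. The SymbolSet class below is a hypothetical stand-in, not the
actual BCSymbolSet interface, and its method names (contains, filter) are
assumptions chosen for the example.

```python
class SymbolSet:
    """Hypothetical stand-in for a symbol set: a named set of allowed one-letter codes."""

    def __init__(self, name, codes):
        self.name = name
        self.codes = frozenset(codes.upper())

    def contains(self, code):
        """True if the one-letter code is allowed by this set (case-insensitive)."""
        return code.upper() in self.codes

    def filter(self, text):
        """Keep only the characters this set allows, preserving their order."""
        return "".join(ch for ch in text.upper() if ch in self.codes)


STRICT_DNA = SymbolSet("strict DNA", "ACGT")
# A deliberately broad set: effectively no restriction on letters.
ANY_LETTER = SymbolSet("any letter", "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

print(STRICT_DNA.filter("acgtXnu"))  # ACGT
print(ANY_LETTER.filter("acgtXnu"))  # ACGTXNU
```

With this shape the sequence class does not need to know the rules itself;
it only asks its symbol set which characters to keep, and a very broad set
gives the 'no restriction' case.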
> So, what do you think symbolSet should be used for? The way I see it
> now is as a filter to restrict the symbols used in a given sequence.
> In fact, the more I think about it, the more 'filtering' seems like
> what it should do. And if we don't want any restriction, then one can
> always create very broad symbol sets.
Exactly, see my point above.
> I don't know what Koen had in mind when creating the symbol set class,
> because I see a 'complementSet' method there.
This was actually introduced by Alex. He wrote the interface, I filled
in some of the implementation. And indeed I had no idea what to do with
complementSet and a few other methods, so I have left those empty :-)
ps you guys are going *way* too fast with all those emails. I only have
a limited time each day to read them, understand them and possibly
reply to them. Sorry if I don't address all issues :(