Koen van der Drift
kvddrift at earthlink.net
Thu Oct 6 20:07:42 EDT 2005
On Oct 6, 2005, at 5:45 PM, Charles Parnot wrote:
> At this point, given that I don't know that much about the fine
> details of sequence clusters and sequence groups, could you, Peter,
> take some time to explain exactly what the concept is, and also
> maybe come up with some examples of what it can be used for and how
> a user of the framework would want to use it. This way, we can
> define a header that does the job, and then worry about the
> implementation. In fact, I should have asked that question in the
> first place instead of pretending I understood what it was all about!
me too :)
> Sorry maybe this is a quite wide question. We don't have to go too
> deep at this point, as we merely want some I/O to work. However, in
> 'I/O', there is 'O' for output, so the question is: after loading a
> sequence from disk, what information will the user want to
> retrieve? Or will she just want to perform some operations on the
> sequence group and then move on?
I agree, let's go ahead with a very basic BCSequenceGroup class that
only maintains an array of BCSequences. Actually, BCSequenceReader
already returns an NSArray to take care of this, so having it return a
BCSequenceGroup instead would be only a slight modification of the
code. I just realized that e.g. a FASTA file can also contain multiple
sequences. However, when a file contains only one sequence (and the
user knows that), the name BCSequenceGroup could be misleading. Once
we have the basic class in place, we can create subclasses of
BCSequenceGroup that take care of additional info and sequence
relations.
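To make the idea concrete, here is a rough sketch of such a minimal group, written in Python only to keep it short (the real class would of course be Objective-C). Everything in it is an assumption for illustration: the method names, the stand-in BCSequence with just a name and a residue string, and the read_fasta function, which is not the actual BCSequenceReader code. It only shows a reader handing back a group instead of a bare array, and that a one-sequence file simply gives a group of length 1.

```python
# Hypothetical sketch: these names and methods are illustrative only
# and are not taken from the actual BioCocoa sources.

class BCSequence:
    """Stand-in for a sequence object: just a name and a residue string."""

    def __init__(self, name, residues):
        self.name = name
        self.residues = residues


class BCSequenceGroup:
    """Minimal group: only maintains an ordered list of sequences."""

    def __init__(self, sequences=None):
        self._sequences = list(sequences or [])

    def add_sequence(self, sequence):
        self._sequences.append(sequence)

    def __len__(self):
        return len(self._sequences)

    def __getitem__(self, index):
        return self._sequences[index]

    def __iter__(self):
        return iter(self._sequences)


def read_fasta(text):
    """Parse FASTA-formatted text into a BCSequenceGroup (assumed behaviour)."""
    group = BCSequenceGroup()
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header line closes the previous record, if there was one.
            if name is not None:
                group.add_sequence(BCSequence(name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    # Flush the last record in the file.
    if name is not None:
        group.add_sequence(BCSequence(name, "".join(chunks)))
    return group
```

Subclasses that carry additional info or sequence relations could then extend BCSequenceGroup without changing what the reader returns.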
I just looked at BioPerl, and they do something similar with their
SeqIO class (see http://bioperl.org/HOWTOs/html/SeqIO.html). But they
use a separate class for I/O of sequence trees (for phylogenetics).
I'm not sure that is something we should do, unless that file
format is completely different from all the others.