[Biococoa-dev] BCSequenceCluster

Thu Oct 6 20:07:42 EDT 2005

On Oct 6, 2005, at 5:45 PM, Charles Parnot wrote:

> At this point, given that I don't know that much about the fine  
> details of sequence clusters and sequence groups, could you, Peter,  
> take some time to explain exactly what the concept is, and also  
> maybe come up with some examples of what it can be used for and how  
> a user of the framework would want to use it. This way, we can  
> define a header that does the job, and then worry about the  
> implementation. In fact, I should have asked that question in the  
> first place instead of pretending I understood what it was all about!
>

  me too :)

> Sorry maybe this is a quite wide question. We don't have to go too  
> deep at this point, as we merely want some I/O to work. However, in  
> 'I/O', there is 'O' for output, so the question is: after loading a  
> sequence from disk, what information will the user want to  
> retrieve? Or will she just want to perform some operations on the  
> sequence group and then move on?
>

I agree, let's go ahead with a very basic BCSequenceGroup class that
only maintains an array of BCSequences. Actually, BCSequenceReader
already returns an NSArray to take care of this, so it would only be a
slight modification of the code to have it return a BCSequenceGroup.
I just realized that a FASTA file, for example, can also contain
multiple sequences. However, when a file contains only one sequence
(and the user knows that), the name BCSequenceGroup could be
misleading. Once we have that in place, we can create subclasses of
BCSequenceGroup that take care of additional info and sequence
relations.
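
To make the idea concrete, here is a rough sketch of what I have in
mind: a reader that always hands back a group, even when the file holds
a single sequence. It is written in Python only to keep it short; it is
not BioCocoa code. Sequence, SequenceGroup and read_fasta are
hypothetical stand-ins for BCSequence, BCSequenceGroup and the FASTA
path of BCSequenceReader, and the parsing is deliberately minimal.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sequence:
    """Hypothetical stand-in for BCSequence: a name plus its residues."""
    name: str
    residues: str


@dataclass
class SequenceGroup:
    """Hypothetical stand-in for BCSequenceGroup: only keeps an ordered list."""
    sequences: List[Sequence] = field(default_factory=list)

    def __len__(self):
        return len(self.sequences)

    def __getitem__(self, index):
        return self.sequences[index]


def read_fasta(text: str) -> SequenceGroup:
    """Parse FASTA text and always return a group, even for one record."""
    group = SequenceGroup()
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the record collected so far.
            if name is not None:
                group.sequences.append(Sequence(name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)  # residue lines may span several rows
    if name is not None:
        group.sequences.append(Sequence(name, "".join(chunks)))
    return group


# A caller who knows the file holds one sequence simply takes group[0].
group = read_fasta(">seq1\nACGT\nTTGA\n>seq2\nGGCC\n")
first = group[0]
```

Subclasses that carry extra info or relations between sequences could
then extend the group without touching the reader.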

I just looked at BioPerl, and they do something similar with their
SeqIO class (see http://bioperl.org/HOWTOs/html/SeqIO.html). But they
use a separate class for the I/O of trees of sequences (for
phylogenetics). I am not sure that is something we should do, unless
that sequence format is completely different from all the others.

cheers,

- Koen.