[Biococoa-dev] Subclassing BCSequence really needed?

Koen van der Drift kvddrift at earthlink.net
Sun Oct 24 20:11:44 EDT 2004

On Oct 24, 2004, at 5:33 PM, Alexander Griekspoor wrote:

> Just a few ideas in between the praying that our website will 
> withstand being slashdotted ;-)
>> 1. Should we have just one BCSequenceReader class, which takes care 
>> of all various file formats, or should we subclass for each format? 
>> I'm not sure yet what the best solution is.
> Given the large number of different formats, I'm a bit afraid that we 
> will end up with way too many subclasses. It might also be a problem to 
> have a general "find-out-what-format-it-is-and-handle-it" method, 
> although this could be handled in a superclass. I'm not sure on this 
> one either.

My gut feeling says to use a general read-sequence class that can be 
passed a format identifier:

	BCSequenceReader*	myReader = [[BCSequenceReader alloc] init];

	BCSequence *mySequence = [myReader readSequenceFromFile: aFile
	                                            usingFormat: @"fasta"];

(this is the bioperl approach)

and additionally:

	BCSequence *mySequence = [myReader readSequenceFromFastaFile: aFile];

(this is the biojava approach)

or even:

	BCSequence *mySequence = [myReader readSequenceFromFile: aFile];

The current BCReader class has some code to determine the file format 
which we could use.
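
As a rough illustration of how the three call styles above could share one class, here is a minimal sketch. Everything in it is an assumption made for illustration: SketchSequence and SketchSequenceReader are hypothetical stand-ins (not the real BCSequence or BCReader), the methods take text rather than a file so the example stays self-contained, and the format detection only checks for a leading '>'.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSequence: it only holds the residue string.
@interface SketchSequence : NSObject
@property (nonatomic, copy) NSString *residues;
@end

@implementation SketchSequence
@end

// Hypothetical reader: one class, three entry points.
@interface SketchSequenceReader : NSObject
- (NSString *)guessFormatOfText:(NSString *)text;
- (SketchSequence *)readSequenceFromText:(NSString *)text usingFormat:(NSString *)format;
- (SketchSequence *)readSequenceFromFastaText:(NSString *)text;
- (SketchSequence *)readSequenceFromText:(NSString *)text;
@end

@implementation SketchSequenceReader

// Very rough detection (an assumption): FASTA text starts with '>',
// anything else is treated as raw residues.
- (NSString *)guessFormatOfText:(NSString *)text {
    return [text hasPrefix:@">"] ? @"fasta" : @"raw";
}

// Format-specific entry point: skip the header line and join the
// residue lines of the first record only.
- (SketchSequence *)readSequenceFromFastaText:(NSString *)text {
    NSArray<NSString *> *lines =
        [text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    NSMutableString *residues = [NSMutableString string];
    for (NSUInteger i = 1; i < lines.count; i++) {
        NSString *line = lines[i];
        if ([line hasPrefix:@">"]) break; // next record starts here
        [residues appendString:
            [line stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]];
    }
    SketchSequence *seq = [[SketchSequence alloc] init];
    seq.residues = residues;
    return seq;
}

// General entry point: dispatch on a format identifier.
- (SketchSequence *)readSequenceFromText:(NSString *)text usingFormat:(NSString *)format {
    if ([format isEqualToString:@"fasta"]) {
        return [self readSequenceFromFastaText:text];
    }
    if ([format isEqualToString:@"raw"]) {
        SketchSequence *seq = [[SketchSequence alloc] init];
        seq.residues = [text stringByTrimmingCharactersInSet:
            [NSCharacterSet whitespaceAndNewlineCharacterSet]];
        return seq;
    }
    return nil; // unknown format identifier
}

// Shortest entry point: guess the format, then dispatch.
- (SketchSequence *)readSequenceFromText:(NSString *)text {
    return [self readSequenceFromText:text usingFormat:[self guessFormatOfText:text]];
}

@end
```

In this sketch the format-specific method does the real work and the other two only route to it, so supporting another format would mean adding one method and one branch rather than a new subclass.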

>> 2. Some formats can contain multiple sequences (e.g. fasta). Should we 
>> return an NSArray of BCSequences only in those cases, or, to be more 
>> consistent, for all formats? I suggest the latter.
> I love consistency, but we can have some convenience methods to get 
> only the first or a range of sequences.

Sounds good.
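
A minimal sketch of that idea, assuming FASTA text as input: one function always returns an NSArray, and two convenience functions return the first sequence or a range of sequences. The function names are hypothetical, and plain NSString objects stand in for BCSequence to keep the example self-contained.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of BioCocoa): always returns an array,
// with one residue string per FASTA record, even for a single record.
static NSArray<NSString *> *SketchReadFastaSequences(NSString *text) {
    NSMutableArray<NSString *> *sequences = [NSMutableArray array];
    NSMutableString *current = nil;
    NSArray<NSString *> *lines =
        [text componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]];
    for (NSString *line in lines) {
        if ([line hasPrefix:@">"]) {
            // A header line closes the previous record and opens a new one.
            if (current != nil) [sequences addObject:[current copy]];
            current = [NSMutableString string];
        } else if (current != nil) {
            [current appendString:
                [line stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]];
        }
    }
    if (current != nil) [sequences addObject:[current copy]];
    return sequences;
}

// Convenience: the first sequence, or nil if there is none.
static NSString *SketchFirstSequence(NSString *text) {
    return SketchReadFastaSequences(text).firstObject;
}

// Convenience: a range of sequences, clipped to what is available.
static NSArray<NSString *> *SketchSequencesInRange(NSString *text, NSRange range) {
    NSArray<NSString *> *all = SketchReadFastaSequences(text);
    if (range.location >= all.count) return @[];
    NSUInteger length = MIN(range.length, all.count - range.location);
    return [all subarrayWithRange:NSMakeRange(range.location, length)];
}
```

Callers that expect a single sequence use the convenience function, while the return type of the main function stays the same for every format.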

>> 3. Do we need an additional class that holds the BCSequence plus all 
>> other info, or should everything go into the BCSequence class?
> Good question, I would tend to go with everything in BCSequence to 
> avoid another wrapper, after all it is a property of the BCSequence...

Hmm, the additional info is probably what we call annotations and 
features. But it varies a lot depending on the file format. I 
would keep the BCSequence class small; its only task is to maintain a 
sequence. We could use 'decorator' objects for these situations.
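
One way to read the 'decorator' suggestion is sketched below, under stated assumptions: SketchPlainSequence is a hypothetical stand-in for a small BCSequence, and SketchAnnotatedSequence is a hypothetical wrapper that holds the format-dependent extras and forwards sequence questions to the object it wraps.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for a small BCSequence: it only maintains residues.
@interface SketchPlainSequence : NSObject
@property (nonatomic, copy, readonly) NSString *residues;
- (instancetype)initWithResidues:(NSString *)residues;
- (NSUInteger)length;
@end

@implementation SketchPlainSequence
- (instancetype)initWithResidues:(NSString *)residues {
    if ((self = [super init])) {
        _residues = [residues copy];
    }
    return self;
}
- (NSUInteger)length {
    return self.residues.length;
}
@end

// Hypothetical decorator: wraps a sequence and adds annotations,
// so the sequence class itself never has to know about them.
@interface SketchAnnotatedSequence : NSObject
@property (nonatomic, strong, readonly) SketchPlainSequence *sequence;
- (instancetype)initWithSequence:(SketchPlainSequence *)sequence;
- (void)setAnnotation:(NSString *)value forKey:(NSString *)key;
- (NSString *)annotationForKey:(NSString *)key;
- (NSUInteger)length;
@end

@implementation SketchAnnotatedSequence {
    NSMutableDictionary<NSString *, NSString *> *_annotations;
}
- (instancetype)initWithSequence:(SketchPlainSequence *)sequence {
    if ((self = [super init])) {
        _sequence = sequence;
        _annotations = [NSMutableDictionary dictionary];
    }
    return self;
}
- (void)setAnnotation:(NSString *)value forKey:(NSString *)key {
    _annotations[key] = value;
}
- (NSString *)annotationForKey:(NSString *)key {
    return _annotations[key]; // nil when the format did not provide it
}
// Forwarded to the wrapped sequence.
- (NSUInteger)length {
    return [self.sequence length];
}
@end
```

A reader for a format with rich metadata could return the wrapper, while a reader for a bare format returns the plain sequence; the keys used for annotations are left open here.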

- Koen.
