[Biococoa-dev] BCSequence implementation

Philipp Seibel biococoa at bioworxx.com
Wed Feb 23 08:57:10 EST 2005

OK, I see ...
I'm a newbie and I don't know everything yet ..... ;-)
Let me try to make another suggestion ..... OK, OK, possibly it has
come up before ... but I think it's worth the risk ;-)

What about making several frameworks from the beginning, the way the
Cocoa framework itself is organized?

I noticed there is a plan for a BCFoundation / BCAppKit framework.

For my HMM plans I think a BCHMM.framework would be nice ....
HMMs need so many elementary classes that putting them all in the
BCFoundation framework would be confusing.

What do you think?


On 23.02.2005 at 14:39, Alexander Griekspoor wrote:

>> You are completely right. It was my fault. I think it's nice to have
>> some categories in one header file, but not because of performance
>> issues (you are right).
> No problem, indeed it's a good idea to organize the code in categories 
> inside a single header file, it nicely groups all the related code.
>> I took a look at the BCAbstract Sequence and saw that the object
>> stores the sequence in an NSArray of BCSymbols. I don't think that's
>> really good. Imagine handling complete genome sequences or other
>> large data. I think we need to store it in an NSString or even a
>> simple char array. There could of course be accessor methods for
>> BCSymbols .... but we really need to care about memory and
>> performance issues, especially in the Foundation framework.
> Again, this discussion predates your arrival at the framework;
> perhaps you can take a look in the archives...
> The basic thought here is that we decided very carefully to go for
> our own BCSequence and BCSymbol classes (although the design has
> recently changed quite dramatically with the arrival of Charles ;-).
> The reason is pretty simple: despite many similarities, an NSString
> is not the same as a sequence. The characters are different, and
> many features are different.
> Of course we could go for char arrays but that will basically get rid 
> of all the benefits Cocoa (and object oriented design) has to offer us 
> from the start (and thus basically kill the reason to build the 
> framework in the first place).
> By having our own BCSymbol and BCSequence we think we have an
> object-oriented design mimicking NSString (but better), which is way
> more powerful than basic C arrays can ever be. Well, I hear you think,
> "that's nice, but my computer will never work with a genome of objects
> (memory- and speed-wise)!" That's correct, which is why we use a
> simple trick. All symbols are so-called shared instances, meaning that
> only a single instance of each symbol is allocated in memory, and all
> a sequence array consists of are pointers to these shared instances.
> Yes, this will take up more memory than a char (around 4 times?), but
> that's more than worth the benefits Cocoa will give us. And to
> reassure you further: yes, there are char and NSString accessors that
> will give you the desired representation if you need it. But just to
> be clear, it's a deliberate choice that ALL internal representations
> of sequences should be in the form of our own BCSequence objects
> wherever possible. Ideally this includes alignments. I don't think
> it's a good thing if a method converts a BCSequence to a string, does
> the manipulation, and converts the string back to a BCSequence. All
> this should be done natively in the BCSequence format, and if that
> gives one trouble, we should rethink/extend the BCSequence class.
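
The shared-instance trick described above can be sketched roughly as
follows. This is only an illustration in plain C so that it stands
alone; it is not BioCocoa's actual code. Symbol, Sequence, and every
function name here are hypothetical stand-ins for BCSymbol and
BCSequence, and the four-base table is an assumption made for the
example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BCSymbol: one immutable record per base. */
typedef struct {
    char code;         /* one-letter code, e.g. 'A' */
    const char *name;  /* descriptive name */
} Symbol;

/* The shared instances: each symbol exists exactly once in memory. */
static const Symbol SYMBOLS[] = {
    {'A', "adenine"}, {'C', "cytosine"}, {'G', "guanine"}, {'T', "thymine"},
};

/* Look up the shared instance for a code; NULL if the code is unknown. */
static const Symbol *symbol_for_char(char c) {
    for (size_t i = 0; i < sizeof SYMBOLS / sizeof SYMBOLS[0]; i++) {
        if (SYMBOLS[i].code == c) return &SYMBOLS[i];
    }
    return NULL;
}

/* Hypothetical stand-in for BCSequence: an array of pointers to the
 * shared symbols. No Symbol is ever copied into the sequence. */
typedef struct {
    const Symbol **symbols;
    size_t length;
} Sequence;

static void sequence_free(Sequence *seq) {
    if (seq) {
        free(seq->symbols);  /* frees the pointer array, not the symbols */
        free(seq);
    }
}

/* Build a sequence from a C string; NULL on bad input or allocation failure. */
static Sequence *sequence_from_string(const char *s) {
    size_t n = strlen(s);
    Sequence *seq = malloc(sizeof *seq);
    if (!seq) return NULL;
    seq->length = n;
    seq->symbols = malloc((n ? n : 1) * sizeof *seq->symbols);
    if (!seq->symbols) { free(seq); return NULL; }
    for (size_t i = 0; i < n; i++) {
        seq->symbols[i] = symbol_for_char(s[i]);
        if (!seq->symbols[i]) { sequence_free(seq); return NULL; }
    }
    return seq;
}

/* The "char accessor": a freshly allocated C string the caller must free. */
static char *sequence_to_string(const Sequence *seq) {
    assert(seq != NULL);
    char *out = malloc(seq->length + 1);
    if (!out) return NULL;
    for (size_t i = 0; i < seq->length; i++) out[i] = seq->symbols[i]->code;
    out[seq->length] = '\0';
    return out;
}
```

In this sketch, sequence_from_string("GATTACA") stores seven pointers,
and every 'A' position holds the same address. The per-position cost is
sizeof(const Symbol *) instead of sizeof(char), which is typically 4
bytes against 1 on a 32-bit system (and 8 against 1 on a 64-bit one).
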
> Now, I do realize that with the arrival of more people, they are
> obviously going to ask themselves and the list (no offense, please
> do!) the same questions we asked ourselves during the initial design
> of the setup we are now implementing. Therefore, I think that once
> the basic BCSequence system is up and running (BCSequence et al.,
> annotations & features, and SeqIO), documentation will become the
> number one priority. I want to spend a few more (PR) words on
> BioCocoa on our website anyway, to generate more awareness and
> traffic, so I'll see if I can combine that with some more explanation
> of the basic architecture of the BCSequence setup. Until the 1.0
> release of BioCocoa, and if anyone agrees with the idea, it can become
> the temporary (developer) homepage of the new BioCocoa framework,
> leaving the current one intact as long as we're still in the beta (or
> alpha ;-) phase. Peter, any thoughts on this one?
>> Another Question:
>> What about a BCMutableSequence ... I'd like to implement one for the
>> Alignment classes
> At the moment we've decided to go for a class that's mutable from the
> beginning, mainly for performance and technical reasons. Perhaps
> Charles and John can say a bit more about this. I also remember a
> discussion about this issue, so there must be a thread in the
> archives... It would be nice to have an optimized immutable version in
> the future, but again Charles can explain better why that's not so
> easy with the current implementation, since he designed the class
> cluster approach.
> So all in all, please don't feel offended by the answers. All your
> comments are more than welcome, and the lack of documentation doesn't
> really help newcomers. Feel free to ask us to explain the rationale
> behind the different design choices; for one, it will help with
> writing the documentation and FAQs!
> Cheers,
> Alex
> PS: Finally, I don't want to sound too motherly, but as a general
> "rule", let's all tell the list what we plan to do before committing
> anything to CVS (for instance the BCAnnotableSequence.h/m files I
> noticed). This applies especially when you would like to see folders
> and/or files added (simply outline your proposed work in a post);
> then at least we know not to remove them ;-)
> Right now my focus is the BCAnnotation/Feature part, and their
> implementation in BCSequence and BCSequenceReader/Writer.
> *********************************************************
>                       ** Alexander Griekspoor **
> *********************************************************
>                 The Netherlands Cancer Institute
>                 Department of Tumorbiology (H4)
>           Plesmanlaan 121, 1066 CX, Amsterdam
>                     Tel:  + 31 20 - 512 2023
>                     Fax:  + 31 20 - 512 2029
>                    AIM: mekentosj at mac.com
>                     E-mail: a.griekspoor at nki.nl
>                 Web: http://www.mekentosj.com
>           LabAssistant - Get your life organized!
>           http://www.mekentosj.com/labassistant
> *********************************************************
