[Biococoa-dev] BCSequence implementation
Alexander Griekspoor
mek at mekentosj.com
Wed Feb 23 09:52:13 EST 2005
Well, as long as we add a BCHMM folder both in the Xcode project and in
the file hierarchy, I have no objection to having many elementary
classes in the current framework. I would like to see as few
frameworks as possible... Maybe others don't share this opinion
though... Look at the BCSequence folder, that starts to expand nicely
as well ;-)
Cocoa is made up of only two frameworks, Foundation and AppKit,
that's more or less why we kept the separation similar...
Cheers,
Alex
On 23-feb-05, at 14:57, Philipp Seibel wrote:
> OK, I see...
> I'm a newbie and I don't know everything yet... ;-)
> Let me try to make another suggestion... OK, OK, possibly it has also
> come up before... but it's worth the risk, I think ;-)
>
> What about making several frameworks from the beginning, the way
> Cocoa itself is organized?
>
> I noticed there is a plan for a BCFoundation / BCAppKit framework.
>
> For my HMM plans I think a BCHMM.framework would be nice...
> Simply because there are so many elementary classes for HMMs, it would
> be confusing to put them all in the BCFoundation framework.
>
> What do you think?
>
> Phil
>
> On 23.02.2005 at 14:39, Alexander Griekspoor wrote:
>
>>> You are completely right. It was my fault. I think it's nice to have
>>> some categories in one header file, but not for performance
>>> reasons (you are right).
>> No problem, indeed it's a good idea to organize the code in
>> categories inside a single header file, it nicely groups all the
>> related code.
>>
>>> I took a look at BCAbstractSequence and noticed that the
>>> object stores the sequence in an NSArray of BCSymbols. That's not
>>> really good, I think. Imagine handling complete genome sequences or
>>> other large data. I think we need to store it in an NSString or even
>>> a simple char array. There could of course be accessor methods for
>>> BCSymbols... but we really need to care about memory and
>>> performance issues. Especially in the Foundation framework.
>> Again this discussion also predates your arrival at the framework,
>> perhaps you can take a look in the archives...
>> The basic thought here is that we have made the decision very
>> carefully to go for our own BCSequence and BCSymbol class (although
>> the design has recently changed quite dramatically with the arrival
>> of Charles ;-). The reason is pretty simple: despite many
>> similarities, an NSString is not the same as a sequence. The
>> characters are different, many features are different.
>> Of course we could go for char arrays but that will basically get rid
>> of all the benefits Cocoa (and object oriented design) has to offer
>> us from the start (and thus basically kill the reason to build the
>> framework in the first place).
>>
>> By having our own BCSymbol and BCSequence we think we have an
>> object-oriented design mimicking NSString (but better) which is way
>> more powerful than basic C arrays can ever be. Well, I hear you think,
>> "that's nice, but my computer will never work with a genome of
>> objects (memory and speedwise)!" That's correct, which is why we are
>> using a simple trick. All symbols are so-called shared instances,
>> meaning that only a single instance is allocated and in memory, and
>> all a sequence array consists of are pointers to this one instance.
>> Yes, this will take up more memory than a char (around 4 times?) but
>> that's more than worth the benefits Cocoa will give us. And to
>> relieve you further, yes there are char and NSString accessors that
>> will give you the desired variable if you need them. But just to make
>> sure, it's a deliberate choice that ALL internal representations of
>> sequences should be in the form of our own BCSequence objects
>> wherever possible. Ideally this includes alignments. I don't think
>> it's a good thing for a method to convert a BCSequence to a string,
>> do the manipulation, and then convert the string back to a
>> BCSequence. All this should be done natively in the BCSequence
>> format, and if that gives one trouble, we should rethink/extend the
>> BCSequence class.
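
To illustrate the shared-instance trick described above, here is a
minimal sketch in Python rather than Objective-C. The names Symbol,
Sequence, shared, reversed_sequence and as_string are hypothetical and
are not the BioCocoa API; the sketch only assumes that each distinct
symbol is allocated once and that a sequence holds references to those
shared objects.

```python
class Symbol:
    """Hypothetical stand-in for BCSymbol (not the real BioCocoa class)."""

    _shared = {}  # one cached instance per distinct character

    def __init__(self, char):
        self.char = char

    @classmethod
    def shared(cls, char):
        """Return the single shared instance for this character."""
        key = char.upper()
        if key not in cls._shared:
            cls._shared[key] = cls(key)
        return cls._shared[key]


class Sequence:
    """Hypothetical stand-in for BCSequence: a list of references to shared symbols."""

    def __init__(self, text=""):
        self.symbols = [Symbol.shared(c) for c in text]

    def reversed_sequence(self):
        """Manipulate the sequence natively, without a round trip through a string."""
        result = Sequence()
        result.symbols = self.symbols[::-1]
        return result

    def as_string(self):
        """String accessor for callers that really need plain text."""
        return "".join(s.char for s in self.symbols)


seq = Sequence("GATTACA" * 1000)
# The sequence holds many references, but only a handful of symbol objects exist.
distinct = {id(s) for s in seq.symbols}
```

The per-position cost is then one reference instead of one char. On a
32-bit machine that would be 4 bytes against 1, which is presumably
where the "around 4 times" estimate comes from.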
>>
>> Now, I do realize that with the arrival of more people, it's obvious
>> that they are gonna ask themselves and the list (no offense, please
>> do!) the questions that we asked ourselves as well during the initial
>> design of the setup we now implement. Therefore, I think that once
>> the basic BCSequence system is up and running (BCSequence et al,
>> annotations & features, and SeqIO) documentation will become the
>> number one priority. As I want to spend a few more (PR) words on
>> BioCocoa on our website anyway, to generate more awareness and
>> traffic, I'll see if I can combine that with some more explanation of
>> the basic architecture of the BCSequence setup. Until the 1.0 release
>> of BioCocoa, and if anyone agrees with the idea, it can become the
>> temporary (developer) homepage of the new BioCocoa framework,
>> leaving the current one intact as long as we're still in beta (or
>> alpha ;-) phase. Peter, any thoughts on this one?
>>
>>> Another Question:
>>>
>>> What about a BCMutableSequence ... id like to implement one for the
>>> Alignment classes
>> At the moment we've decided to go for a class that's mutable from the
>> beginning, mainly for both performance and technical reasons. Perhaps
>> Charles and John can talk a bit more about this, and I remember a
>> discussion about this issue, so there must be a thread in the
>> archives... It would be nice to have an optimized immutable version
>> in the future, but again Charles can better explain why that's
>> not so easy with the current implementation, since he designed the
>> class cluster approach.
>>
>> So all in all, please don't feel offended by the answers, all your
>> comments are more than welcome and the lack of documentation doesn't
>> really help newcomers. Feel free to ask us to explain the rationale
>> behind different design choices, it will help writing the
>> documentation and FAQs for one!
>> Cheers,
>> Alex
>>
>> PS. Finally, I don't want to sound too motherly, but as a general
>> "rule" let's all first tell the list what we plan to do before
>> committing anything to CVS (for instance the
>> BCAnnotableSequence.h/m files I noticed), especially when you
>> would like to add folders and/or files (simply outline your
>> proposed work in a post). Then at least we know not to remove them
>> ;-) Right now my current focus is the BCAnnotation/Feature part,
>> their implementation in BCSequence, and BCSequenceReader/Writer.
>>
>> *********************************************************
>> ** Alexander Griekspoor **
>> *********************************************************
>> The Netherlands Cancer Institute
>> Department of Tumorbiology (H4)
>> Plesmanlaan 121, 1066 CX, Amsterdam
>> Tel: + 31 20 - 512 2023
>> Fax: + 31 20 - 512 2029
>> AIM: mekentosj at mac.com
>> E-mail: a.griekspoor at nki.nl
>> Web: http://www.mekentosj.com
>>
>> LabAssistant - Get your life organized!
>> http://www.mekentosj.com/labassistant
>>
>> *********************************************************