[Biococoa-dev] New Structure for BioCocoa
Koen van der Drift
kvddrift at earthlink.net
Sat Jul 2 20:59:39 EDT 2005
First quick reaction: WTF - is this going to throw away all our
efforts up until now? Should I stop adding stuff to the framework until
the new structure is in place?
Second reaction: using an internal string does make a lot of sense,
especially because a lot of manipulations can be done much more easily. It
probably also makes it easier to read/write text files and to read from databases.
Now I need some time to really think about the new development ideas.
Are there more surprises from WWDC?
On Jul 2, 2005, at 7:39 PM, Charles Parnot wrote:
> Thanks Phil!
> I like the parser idea, particularly if it is already written by you.
> I won't be of any help with C++, though!
> The structure you outline looks fine to me, and I am not sure why we
> should stop implementing stuff now. Clearly, if we agree to use a
> parser, we should not write code for the IO until it is ready (though
> to test the parser, the best is to use it, so the IO would probably
> grow at the same time as the parser). But the modifications in the
> sequence structure can be implemented now. I think we should simply
> define goals, have everybody make clear what they want to
> contribute to, and set up several lines of development that
> do not depend too much on each other and can proceed
> independently. Here is a possible roadmap, made up in 5 minutes (needs
> some refinement!):
> * get the IO to work (at least read sequences)
> * modify the sequence structure (read below) and make sure we have
> some methods that can be used by the parser to create the sequence
> (the internals of BCSequence should be encapsulated as much as
> possible and not directly accessed by the parser)
> * get the annotations up and running; the annotation issue should not
> prevent the IO from being implemented; in a first phase, the IO can
> parse the annotations but not use them; classes and methods to
> manipulate annotations can be later added to the sequence object, and
> the parser modified to add these calls.
> Now, Koen rightly complained that he did not get a report of the WWDC
> meeting (and the others who were absent did not get one either). Here is a
> (complete?) list of the decisions/discussions we had:
> * change the internal structure of the sequence string in BCSequence
> (read below)
> * think about annotations
> * look at the internals of NSAttributedString in GNUstep to see how the
> attributes are done, because the structure of NSAttributedString is
> very similar to sequence annotations
> * probably not worry about performance issues with annotations;
> manipulating annotations will not happen that often, mostly when
> modifying a sequence, and generating a subsequence; the bottom line is
> we can probably stick to NSMutableDictionary (I discussed that in a
> previous email)
> * still think even more about annotations
> * better define the purpose of BioCocoa, and the programmer niche we
> are trying to target (the niche is probably us, at this point!)
> * write some code
> Regarding the sequence structure Phil mentions, I will try to explain
> it now for those of us that were not part of the discussion.
> Short version
> Replace the NSArray of BCSymbol with a char [ ]...
> Long version
> * The sequence will be stored internally as an array of char, which
> will make the performance discussions moot. A lot of the sequence
> manipulations are particularly easy to handle as strings. I don't know
> if we have decided to use an NSMutableData ivar, or do the malloc
> ourselves. Using NSData is probably a better idea, as it will already
> be optimized for
> * The public interface will expose arrays of BCSymbols. Because a
> BCSequence always has a BCSymbolSet associated with it, it is easy to
> convert between chars and BCSymbol objects on demand. All the methods
> for that are already available. The NSArray can even be cached (and
> reconstructed as needed once the sequence is modified).
> * The public interface could probably also have a method that returns the
> array of chars as an autoreleased object. This is very easy,
> e.g. by creating an autoreleased NSData populated with a copy of the
> sequence bytes (and returning either the char * or the NSData itself). The
> copy of the bytes (necessary for mutable sequences) will be fast, much
> faster than copying the NSArray (with all the useless retain/release
> of the singleton BCSymbols). So we don't have to worry about the issue
> of returning the internal array used by the sequence when the sequence
> is mutable (we only have mutable sequences at this point, but I plan
> to add immutable ones; I know, I am obsessed with that issue).
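A minimal sketch of the char-array idea described above, in plain C rather
than Objective-C. The Sequence struct and every function name here are
hypothetical and not part of BioCocoa; the sketch assumes an uppercase DNA
alphabet and uses malloc directly instead of NSData. It only illustrates
why string-style storage makes manipulations simple and why handing out a
copy of the bytes is cheap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for BCSequence: residues live in one char buffer. */
typedef struct {
    char  *bytes;   /* one char per symbol, NUL-terminated for convenience */
    size_t length;  /* number of symbols, excluding the terminator */
} Sequence;

/* Create a sequence by copying a C string; returns NULL if allocation fails. */
Sequence *sequence_create(const char *symbols) {
    assert(symbols != NULL);
    Sequence *seq = malloc(sizeof *seq);
    if (seq == NULL) return NULL;
    seq->length = strlen(symbols);
    seq->bytes = malloc(seq->length + 1);
    if (seq->bytes == NULL) { free(seq); return NULL; }
    memcpy(seq->bytes, symbols, seq->length + 1);
    return seq;
}

void sequence_free(Sequence *seq) {
    if (seq != NULL) { free(seq->bytes); free(seq); }
}

/* Return a private copy of the bytes, so callers never see the internal
   buffer of a mutable sequence. The caller frees the result. */
char *sequence_copy_bytes(const Sequence *seq) {
    assert(seq != NULL);
    char *copy = malloc(seq->length + 1);
    if (copy != NULL) memcpy(copy, seq->bytes, seq->length + 1);
    return copy;
}

/* Complement of one uppercase DNA base; anything else maps to 'N'. */
static char complement(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return 'N';
    }
}

/* Reverse-complement in place: swap symbols from both ends toward the
   middle, complementing each one as it moves. */
void sequence_reverse_complement(Sequence *seq) {
    assert(seq != NULL);
    size_t i = 0, j = seq->length;
    while (i < j) {
        --j;
        char left  = complement(seq->bytes[i]);
        char right = complement(seq->bytes[j]);
        seq->bytes[i] = right;
        seq->bytes[j] = left;
        ++i;
    }
}
```

With this layout, a copy taken through sequence_copy_bytes before a call
to sequence_reverse_complement keeps the original symbols, which is the
property the mutable-sequence discussion above relies on.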
> On Jul 2, 2005, at 8:45 AM, Philipp Seibel wrote:
>> Hi all,
>> I want to continue on the mailing list the discussion we already
>> started at WWDC.
>> In my view, the BioCocoa project needs a modular and
>> flexible structure. The attached PDF shows my suggestion for a
>> possible new structure.
>> The next thing we have to discuss is the implementation of the
>> data structures in the BCFoundation framework. Our WWDC discussion
>> led to a new string-based sequence structure.
>> I think we should spend quite some time planning the future structure
>> of BioCocoa and stop implementation until the new structure is
>> decided. We all want a 1.0 version of the framework, and there are at
>> least two people from WWDC who want to use BioCocoa in their
>> projects, so we should go for it. :-) (I should teach professional
>> motivation practices :-)).
>> The discussion is open .......
>> BTW: I have already started the BCParser.framework mentioned in the
>> attached document. I am thinking of a very flexible high-level parser
>> framework with event-driven parsers like NSXMLParser.
>> This allows easy implementation of various file formats for different
>> data structures. Not everybody is satisfied with a BioCocoa sequence;
>> some want their own structure. The parser API allows parsing
>> the files into any data structure, and of course also into our future
>> BCFoundation structures. The API is based on the C++ Boost.Spirit
>> parser APIs and is developed as an Objective-C++ framework, without any
>> dynamic linking dependencies. Just tell me what you think about it.
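To make the event-driven idea concrete, here is a minimal sketch in plain
C. It is not the BCParser API and does not use Boost.Spirit; the
ParserEvents struct, parse_fasta and the FASTA-like input format are all
assumptions chosen for illustration. The parser only reports events
through callbacks, so the caller decides which data structure gets built.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical event callbacks, playing the role of a parser delegate. */
typedef struct {
    void (*did_start_record)(void *context, const char *name, size_t name_len);
    void (*found_symbols)(void *context, const char *symbols, size_t count);
} ParserEvents;

/* Walk a FASTA-like text line by line and fire one event per line.
   Lines starting with '>' begin a record; other non-empty lines carry
   symbols. Only '\n' line endings are handled in this sketch. */
void parse_fasta(const char *text, const ParserEvents *events, void *context) {
    assert(text != NULL && events != NULL);
    const char *line = text;
    while (*line != '\0') {
        size_t len = strcspn(line, "\n");
        if (len > 0 && line[0] == '>') {
            if (events->did_start_record)
                events->did_start_record(context, line + 1, len - 1);
        } else if (len > 0) {
            if (events->found_symbols)
                events->found_symbols(context, line, len);
        }
        line += len;
        if (*line == '\n') ++line;
    }
}

/* Example consumer: counts records and symbols instead of building a
   sequence object. Any other structure could be filled the same way. */
typedef struct {
    size_t records;
    size_t symbols;
} Counts;

void count_record(void *context, const char *name, size_t name_len) {
    (void)name;
    (void)name_len;
    ((Counts *)context)->records++;
}

void count_symbols(void *context, const char *symbols, size_t count) {
    (void)symbols;
    ((Counts *)context)->symbols += count;
}
```

A caller fills a ParserEvents with its own functions and passes a context
pointer; swapping the callbacks is enough to target a different data
structure without touching the parser.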
>> Biococoa-dev mailing list
>> Biococoa-dev at bioinformatics.org
> Help science move fast forward:
> Charles Parnot
> charles.parnot at gmail.com