From kvddrift at earthlink.net Fri Jul 1 06:21:21 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Fri, 1 Jul 2005 06:21:21 -0400
Subject: [Biococoa-dev] SequenceIO
In-Reply-To:
References: <0f0c72a4058c0e08d806a15929ad91be@earthlink.net>
Message-ID: <60995cf2563f30376bd9d29c8078a3c8@earthlink.net>

On Jul 1, 2005, at 4:42 AM, Peter Schols wrote:

>> Therefore I propose that BCSequenceReader simply returns an array of
>> objects. We can either store BCSequence objects in the array or
>> create some kind of wrapper for each sequence, eg a new SequenceIO
>> class. Annotations and features are now handled in the BCSequence
>> class, so can be added in the IO code.
>
> I agree. Using an NSArray of sequences is better given that we now
> deal with BCSequences. However, it would still be better if the IO
> methods could return an NSSequenceSet or something. This class would
> of course mainly consist of an NSArray containing sequences, but it
> could also add support for annotations that apply to more than one
> BCSequence (in phylogeny, you often have annotations that apply to a
> group of sequences).
>

I think that Alex and John were also talking about something similar
for alignments. So that could be a good addition.

- Koen.

From biococoa at bioworxx.com Sat Jul 2 11:45:54 2005
From: biococoa at bioworxx.com (Philipp Seibel)
Date: Sat, 2 Jul 2005 17:45:54 +0200
Subject: [Biococoa-dev] New Structure for BioCocoa
Message-ID:

Hi all,

I want to continue on the mailing list the discussion we already
started at the WWDC.
In my view the BioCocoa project needs a modular and flexible
structure. The attached PDF shows my suggestion for a possible new
structure.
The next thing we have to discuss is the implementation of the data
structures in the BCFoundation framework. Our WWDC discussion led to a
new string-based sequence structure.
I think we should spend quite some time planning the future structure
of BioCocoa and stop implementation until the new structure is
decided. We all want a 1.0 version of the framework, and there are at
least two people from the WWDC who want to use BioCocoa in their
projects, so we should go for it. :-) (I should teach professional
motivation practices :-)).

The discussion is open .......

BTW: I have already started the BCParser.framework mentioned in the
attached document. I am thinking of a very flexible high-level parser
framework with event-driven parsers like NSXMLParser.
This allows easy implementation of various file formats for different
data structures. Not everybody is satisfied with a BioCocoa sequence;
some want their own structure. The parser API allows the files to be
parsed into any data structure, and of course also into our future
BCFoundation structures. The API is based on the C++ Boost.Spirit
parser APIs and is developed as an Objective-C++ framework, without
any dynamic linking dependencies. Just tell me what you think about
it ....

cheers,

Phil
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BCFrameworks.pdf
Type: application/applefile
Size: 8126 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: BCFrameworks.pdf
Type: application/pdf
Size: 809578 bytes
Desc: not available
URL:
-------------- next part --------------

From charles.parnot at gmail.com Sat Jul 2 19:39:45 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Sat, 2 Jul 2005 16:39:45 -0700
Subject: [Biococoa-dev] New Structure for BioCocoa
In-Reply-To:
References:
Message-ID:

Thanks Phil!

I like the parser idea, particularly if it is already written by you ;-)
I won't be of any help with C++, though!

The structure you outline looks fine to me, and I am not sure why we
should stop implementing stuff now.
Clearly, if we agree to use a parser, we should not write code for the
IO until it is ready (though the best way to test the parser is to use
it, so the IO would probably grow at the same time as the parser). But
the modifications to the sequence structure can be implemented now. I
think we should simply define goals, have everybody make clear what
they want to contribute to, and have several lines of development that
do not depend too much on each other and can proceed independently.
Here is a possible roadmap, made up in 5 minutes (needs some
refinement!):
* get the IO to work (at least read sequences)
* modify the sequence structure (read below) and make sure we have
some methods that can be used by the parser to create the sequence
(the internals of BCSequence should be encapsulated as much as
possible and not directly accessed by the parser)
* get the annotations up and running; the annotation issue should not
prevent the IO from being implemented; in a first phase, the IO can
parse the annotations but not use them; classes and methods to
manipulate annotations can be added to the sequence object later, and
the parser modified to add these calls.

Now, Koen rightly complained he did not get a report of the WWDC
meeting (and the others who were absent did not get one either). Here
is a (complete?)
list of the decisions and discussions we had:
* change the internal structure of the sequence string in BCSequence
(read below)
* think about annotations
* look at the internals of NSAttributedString in GNUstep to see how
the attributes are done, because the structure of NSAttributedString
is very similar to sequence annotations
* probably not worry about performance issues with annotations;
manipulating annotations will not happen that often, mostly when
modifying a sequence and generating a subsequence; the bottom line is
we can probably stick to NSMutableDictionary (I discussed that in a
previous email)
* still think even more about annotations
* better define the purpose of BioCocoa, and the programmer niche we
are trying to target (the niche is probably us, at this point!)
* write some code

Regarding the sequence structure Phil mentions, I will try to explain
it now for those of us who were not part of the discussion.

Short version
-------------
Replace the NSArray of BCSymbol with a char [ ]...

Long version
------------

* The sequence will be stored internally as an array of char, which
will make the performance discussions moot. A lot of the sequence
manipulations are particularly easy to handle as strings. I don't know
if we have decided to use an NSMutableData ivar, or do the malloc
ourselves. Using NSData is probably a better idea, as it will already
be optimized for

* The public interface will expose arrays of BCSymbols. Because a
BCSequence always has a BCSymbolSet associated with it, it is easy to
convert between chars and BCSymbol objects on demand. All the methods
for that are already available. The NSArray can even be cached (and
reconstructed as needed as soon as the sequence is modified).

* The public interface could probably have a method to return the
array of chars as well, as an autoreleased object. This is very easy,
e.g.
creating an autoreleased NSData populated with a copy of the sequence
bytes (and returning either the char * or the NSData itself). The copy
of the bytes (necessary for mutable sequences) will be fast, much
faster than copying the NSArray (with all the useless retain/release
of the singleton BCSymbols). So we don't have to worry about the issue
of returning the internal array used by the sequence when the sequence
is mutable (we only have mutable sequences at this point, but I plan
to add immutable ones; I know, I am obsessed with that issue).

> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev

--
Xgrid-at-Stanford
Help science move fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford

Charles Parnot
charles.parnot at gmail.com

From kvddrift at earthlink.net Sat Jul 2 20:59:39 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 2 Jul 2005 20:59:39 -0400
Subject: [Biococoa-dev] New Structure for BioCocoa
In-Reply-To:
References:
Message-ID: <5fb2f5b58537d4ff32aa9d7357aae86b@earthlink.net>

First quick reaction: WTF - is this going to throw away all our
efforts up until now? Should I stop adding stuff to the framework
until the new structure is in place?

Second reaction: using an internal string does make a lot of sense,
especially because a lot of manipulations can be done much more
easily. It probably also makes it easier to read/write text files, and
to read from databases and XML.

Now I need some time to really think about the new development ideas.
Are there more surprises from WWDC?

cheers,

- Koen.

From charles.parnot at gmail.com Sun Jul 3 00:36:28 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Sat, 2 Jul 2005 21:36:28 -0700
Subject: Fwd: [Biococoa-dev] New Structure for BioCocoa
References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com>
Message-ID:

On Jul 2, 2005, at 5:59 PM, Koen van der Drift wrote:

> First quick reaction: WTF - is this going to throw away all our
> efforts up until now? Should I stop adding stuff to the framework
> until the new structure is in place?
>

Actually, the changes will not be that big. The whole BCSymbolSet
concept will work really well with that structure. To the outside, the
classes will still expose mostly objects.

> Second reaction: using an internal string does make a lot of sense,
> especially because a lot of manipulations can be done much easier.
> It probably makes it also easier to read/write text files, from
> databases and xml.
>

And also integration with CoreData will be smoother. The string
attribute will do.

> Now I need some time to really think about the new development
> ideas. Are there more surprises from wwdc?
>

I tried to list all the things we talked about! Maybe I forgot
something???
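
The char-backed storage discussed in this thread, with symbol objects
still exposed to the outside, can be sketched in plain C. This is only
an illustration under stated assumptions, not BioCocoa code: Symbol,
Sequence, DNA_SYMBOLS and every function name below are hypothetical
stand-ins for BCSymbol, BCSequence and a BCSymbolSet.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a singleton BCSymbol: one shared record per character. */
typedef struct {
    char code;
    const char *name;
} Symbol;

/* Hypothetical stand-in for a BCSymbolSet holding the plain DNA alphabet. */
static const Symbol DNA_SYMBOLS[] = {
    {'A', "adenine"}, {'C', "cytosine"}, {'G', "guanine"}, {'T', "thymine"},
};

/* Hypothetical stand-in for BCSequence: the symbols are stored as raw chars. */
typedef struct {
    char *bytes;    /* owned buffer, NUL-terminated for convenience */
    size_t length;  /* number of symbols, excluding the terminator */
} Sequence;

/* Look up the shared symbol for a char; returns NULL if it is not in the set. */
const Symbol *symbol_for_char(char c) {
    size_t count = sizeof DNA_SYMBOLS / sizeof DNA_SYMBOLS[0];
    for (size_t i = 0; i < count; i++) {
        if (DNA_SYMBOLS[i].code == c) {
            return &DNA_SYMBOLS[i];
        }
    }
    return NULL;
}

/* Initialise a sequence from a C string; returns 0 on success, -1 on failure. */
int sequence_init(Sequence *seq, const char *text) {
    assert(seq != NULL && text != NULL);
    seq->bytes = NULL;
    seq->length = 0;
    size_t length = strlen(text);
    for (size_t i = 0; i < length; i++) {
        if (symbol_for_char(text[i]) == NULL) {
            return -1;  /* character outside the symbol set */
        }
    }
    seq->bytes = malloc(length + 1);
    if (seq->bytes == NULL) {
        return -1;
    }
    memcpy(seq->bytes, text, length + 1);
    seq->length = length;
    return 0;
}

/* Convert one position to its shared symbol on demand; NULL if out of range. */
const Symbol *sequence_symbol_at(const Sequence *seq, size_t index) {
    assert(seq != NULL);
    if (index >= seq->length) {
        return NULL;
    }
    return symbol_for_char(seq->bytes[index]);
}

/* Return a private copy of the bytes; the caller must free() it. */
char *sequence_copy_bytes(const Sequence *seq) {
    assert(seq != NULL && seq->bytes != NULL);
    char *copy = malloc(seq->length + 1);
    if (copy != NULL) {
        memcpy(copy, seq->bytes, seq->length + 1);
    }
    return copy;
}

/* Release the buffer owned by the sequence. */
void sequence_free(Sequence *seq) {
    assert(seq != NULL);
    free(seq->bytes);
    seq->bytes = NULL;
    seq->length = 0;
}
```

Handing out a copy of the bytes keeps the internal buffer private, so a
caller that modifies the copy cannot put a mutable sequence out of
sync; the symbol lookup returns the same shared record for every
occurrence of a character, much like the singleton symbols.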

--
Xgrid-at-Stanford
Help science move fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford

Charles Parnot
charles.parnot at gmail.com

From biococoa at bioworxx.com Sun Jul 3 06:12:22 2005
From: biococoa at bioworxx.com (Philipp Seibel)
Date: Sun, 3 Jul 2005 12:12:22 +0200
Subject: [Biococoa-dev] New Structure for BioCocoa
In-Reply-To:
References:
Message-ID:

Hey Charles,

On 03.07.2005 at 01:39, Charles Parnot wrote:

> Thanks Phil!
>
> I like the parser idea, particularly if it is already written by
> you ;-)
> I won't be of any help with C++, though!

This is no problem, because there is not much need for C++ apart from
expression templates.
The parser isn't already written by me, sorry ;-), but I'm working on
it.

I'd like to know which sequence formats I should implement, so that I
can finish the parsers for the sequence IO first.

> The structure you outline looks fine to me, and I am not sure why
> we should stop implementing stuff now. Clearly, if we agree to use
> a parser, we should not write code for the IO until it is ready
> (though to test the parser, the best is to use it, so the IO would
> probably grow at the same time as the parser). But the
> modifications in the sequence structure can be implemented now. I
> think we should simply define goals and have everybody make it
> clear what they want to contribute to, and have several
> independent lines of development that do not depend too much on
> each other and that can be done independently. Here is a possible
> roadmap, made up in 5 minutes (needs some refinement!):
> * get the IO to work (at least read sequences)
> * modify the sequence structure (read below) and make sure we have
> some methods that can be used by the parser to create the sequence
> (the internals of BCSequence should be as much as possible
> encapsulated and not directly accessed by the parser)
> * get the annotations up and running; the annotation issue should
> not prevent the IO from being implemented; in a first phase, the IO
> can parse the annotations but not use them; classes and methods to
> manipulate annotations can be later added to the sequence object,
> and the parser modified to add these calls.

Sounds good to me; I just wanted to make sure that we don't do any
work we can't use in the future.

> Now, Koen rightly complained he did not get a report of the WWDC
> meeting (and the other absent did not get it too). Here is a
> (complete?) list of the decisions/discussions we had.:
> * change the internal structure of the sequence string in
> BCSequence (read below)
> * think about annotations
> * look at the internals of NSAttributedString in GNUstep to see how
> the attributes are done, because the structure of
> NSAttributedString is very similar to sequence annotations
> * probably not worry about performance issues with annotations;
> manipulating annotations will not happen that often, mostly when
> modifying a sequence, and generating a subsequence; the bottom line
> is we can probably stick to NSMutableDictionary (I discussed that
> in a previous email)
> * still think even more about annotations
> * better define the purpose of BioCocoa, and the programmer niche
> we are trying to target (the niche is probably us, at this point!)
> * write some code
>
> Regarding the sequence structure Phil mentions, I will try to
> explain it now for those of us that were not part of the discussion.
>
> Short version
> -------------
> Replace the NSArray of BCSymbol with a char [ ]...
>
> Long version
> ------------
>
> * The sequence will be stored internally as an array of char, which
> will make the performance discussions moot. A lot of the sequence
> manipulations are particularly easy to handle as strings. I don't
> know if we have decided to use an NSMutableData ivar, or do the
> malloc ourselves. Using NSData is probably a better idea, as it
> will already be optimized for
>
> * The public interface will expose arrays of BCSymbols. Because a
> BCSequence has always a BCSymbolSet associated with it, it is easy
> to convert between chars and BCSymbol objects on demand. All the
> methods for that are already available. The NSArray can even be
> cached (and reconstructed as needed as soon as the sequence is
> modified).
>
> * The public interface could probably have a method to return the
> array of chars as well as an autoreleased object. This is very easy
> e.g. creating an autoreleased NSData populated with a copy of the
> sequence bytes (and return either the *char or the NSData itself).
> The copy of the bytes (necessary for mutable sequences) will be
> fast, much faster than copying the NSArray (with all the useless
> retain/release of the singleton BCSymbols). So we don't have to
> worry about the issue of returning the internal array used by the
> sequence when the sequence is mutable (we only have mutable
> sequences at this point, but I plan to add immutable ones, I know,
> I am obsessed with that issue).

Very good summary of the discussion.
I think we should try to implement the string thing in different ways
and test the performance, to see which one is the best.

cheers,

Phil

From kvddrift at earthlink.net Sun Jul 3 07:45:12 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sun, 3 Jul 2005 07:45:12 -0400
Subject: [Biococoa-dev] New Structure for BioCocoa
In-Reply-To:
References:
Message-ID:

On Jul 3, 2005, at 6:12 AM, Philipp Seibel wrote:

>> I like the parser idea, particularly if it is already written by you
>> ;-)
>> I won't be of any help with C++, though!
> this is no problem, because there is no much need of c++ except from
> expression templates.
> The parser isn't already written by me sorry ;-), but i'm working on
> it.
>
> I'd like to know the different sequence formats i should implement,
> that i can finish the parsers for the sequence io first.

Check out BCSequenceReader and BCReader (deprecated, but still in the
project for reference) for some formats; just note that these methods
still need a lot of work: annotations and features are still not
completely implemented. Note that there are already various parsers
out there that could be of help, e.g. lucegene, BioPerl and more.

What I don't understand yet is why the sequence parser framework is at
the bottom of your picture. This implies to me that it is the basis
for all the other frameworks. Why not put BCFoundation with BCSymbol
and BCSequence there? That seems much more logical to me.

> Very good summary of the discussion.
> I think we should try to implement the string thing in different ways
> and test the performance, to see which one is the best.
>

Maybe you can already post some sample code here, so we can get an
idea of what you have in mind.

cheers,

- Koen.

From mek at mekentosj.com Sun Jul 3 07:46:38 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 3 Jul 2005 13:46:38 +0200
Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II
In-Reply-To:
References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com>
Message-ID:

I hope you are still hanging in there. Now, how do we implement our
"char array containing sequence objects"? There are a number of things
that pop up in my mind, which I haven't really thought about much or
have a solution for. But I'll just bring them up. I suggest everyone
do the same so we get an idea of where the problems are, and what we
have to think about.

- The singleton system remains almost unaltered; I think it works
really nicely and has the properties that we want to present to the
outside world, right?
- What about symbol sets and the creation of new sequences?
- The class cluster design of sequence objects. Is it still necessary,
is it correct, flexible? As said, Phil and I had problems with it, and
Phil was not sure if it was implemented fully correctly, but he'll
chime in I guess. Purely because of its complexity for new users (and
people like me who had to get back in again ;-), I would like to see
it disappear if possible.
- Do we need subclasses for certain sequence types like proteins, DNA,
RNA? I guess yes. Do we need a class cluster for this or does simple
subclassing do?
- How to handle the main char array? An NSData object like Charles
suggested, or do-it-yourself management?
- How accessible is the C array from the outside? Read only? It must
be readable at least, so that for instance an alignment object can use
the string for its work (à la NSString's getCString method).
I would not encourage changing the char array directly from within
another object, for instance to prevent out-of-sync problems with
annotation positions. For that we have things like
-insertSequence:atPosition: etc.
- Is this the time to also have a mutable and an immutable version?
Charles will say yes I guess ;-) All editing methods on the array
could be added only to the mutable form. Editing in general is
something we need more C experts for: how to insert symbols, for
instance, and handle it memory-wise?
- What about sequence sets, already brought up for alignments: wrapper
objects around multiple objects, containing metadata (annotations and
friends) for the group. Even more powerful would be column editing
(editing all sequences at once, coordinated by this object) and
position-specific (column-specific) annotations. Another thing would
be file export, for instance turning a sequence set into one FASTA
file.
- Speaking of file IO, I guess that has to be implemented anyway, and
depends on how we do the initialization of our BCSequence objects.

The interface to BCSequence should not necessarily be very different
from what it is now. Only internally are things organized differently.

Finally, the topic of recent days:
- Phil's diagram. Yes, great structure! One of the other decisions
discussed was indeed to have a modular structure, with a simple core
(centered around BCSequence and the IO part) and all other things
added as optional frameworks, like Phil indicated. I like the parser
very much, but agree with Charles that we don't have to stop
implementing things; the idea of Cocoa is these black boxes that you
can change internally without altering the interface others see. But I
do think we have to decide on the new BCSequence implementation first,
as it can have really big effects (like potentially disposing of the
class cluster). Again, we don't have to throw away all the things we
did!
- BCObject vs BCStructuralObject.
The latter is a nice suggestion, Koen; BCObject indeed suggests a
mother-of-all object, which it is not here. Another suggestion would
be BCPhysicalObject (they have a mass, weight etc).
- Annotations. Together with the basic sequence and the file IO, this
is the most important part of the framework. These three will make up
the new BioCocoa core framework on which everything else will be
built. It's indeed a difficult thing to design right, but I'm really
enthusiastic about the discussions! Don't pay too much attention to my
initial attempt, but it's a start.

Thanks for keeping BioCocoa going, Koen, great job! Again, I'm sorry
not to have jumped in earlier, and I hope you don't have the feeling
of being left out of everything; all the above is not a definitive
thing at all. Looking forward to everyone's reaction!
Cheers,
Alex

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam

Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

4Peaks - For Peaks, Four Peaks.
2004 Winner of the Apple Design Awards
Best Mac OS X Student Product
http://www.mekentosj.com/4peaks
*********************************************************

From mek at mekentosj.com Sun Jul 3 07:55:39 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 3 Jul 2005 13:55:39 +0200
Subject: [Biococoa-dev] New Structure for BioCocoa part I
In-Reply-To:
References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com>
Message-ID:

Hi guys,

My apologies for not having jumped in earlier; certainly towards Koen
and John I'm sorry, I should have given a summary of the WWDC meeting
much earlier. I understand that all of this comes out of the blue, but
understand us well, this is all still open for debate.
In fact, I hope we will discuss things in more detail on the list in order to come up with the best implementation. I'll try to summarise the topics discussed at the WWDC and the thoughts behind them below.

The story begins while Phil and I were preparing the slides for our small presentation planned for Wednesday evening. I have to admit that I had spent very little time on BioCocoa in the month before and did not have the exact structure in my head anymore. When I started to look at our implementation again, it was far from obvious how the sequence class cluster was set up, and Phil also had problems getting the exact idea. John did a wonderful job explaining many of the ideas in the document he created just before the WWDC, but I still think the design needs reconsideration. If even developers of the framework can't get it easily, imagine new users.

So we decided not to spend much time on the implementation during the presentation, both because it's still a moving target and because we thought it would not be of particular interest to the audience. We did decide to talk about our BioJava-like approach of singleton BCSymbol objects. That pattern is easy to explain and easy to get. Our main focus, however, was on the things we had in mind for the framework, its potential use, and our request for feedback and input. What needs can it fulfil and what are people looking for?

Partly due to the rescheduled Apple Design Awards (hooray for Peter!) we had a fairly small group of listeners, but the discussion with the group alone was worth coming together for, I think. It was clear that most "new" people were from fields that focus on large-scale genomics projects, clearly a different "target audience" than the one our framework aims at. If I may be so blunt, I think it's safe to say that initially we aim at developers like ourselves, who create fairly small applications with many standard (and fairly simple) sequence editing routines on small sequences.
Of course, we should aim at raising that level much higher, but that's not our initial goal, right? One of the guys in the audience explained that the philosophy behind BioJava was actually the opposite: aimed at large-scale, genome-sized sequences and mainly focused on annotations. It was even difficult to convince the guy that there was a need for something like what we do! I told him that there clearly is a need for programs like Vector NTI, which he agreed with in the end.

Not surprisingly, the main topic of discussion quickly turned to performance, with the basic question: where do we draw the line between objects and structures? We want a Cocoa-like interface and ease of use, but also performance in terms of speed and memory footprint. Ideally we would like to have something like NSString, which is easy to use and has many convenient methods, but works fast because of an under-the-hood implementation that uses different C structures based on the type of string you use. Now the problem is that we have to design that under-the-hood part of our sequence objects.

Initially we chose the BioJava approach of singleton objects (yes, I was(/am) a great fan). Let me summarize the benefits:

- Objects! Powerful methods, easily accessible properties, etc., all the nice goodies from Cocoa.
- Way more powerful than a simple char.
- Singleton objects dramatically reduce the memory footprint: a sequence is simply a list of pointers to the singleton objects.

However, there are clear negatives as well, many discussed before:

- Objects! Bigger than a char, not by that much but still. Storing 200Mb of sequence or 4-8 times as much makes a difference! The singletons do make it dramatically different though, and I still consider this one of the smallest problems.
- Speed. Object messaging is the number one problem here, requiring all kinds of hacks and tricks to get decent performance. The main problem lies in the use of NSArray and the like to store the list of pointers to the symbols.
Although very convenient for editing, this kills performance, certainly when the most frequent operation on sequences is iteration over the array.

In conclusion, the singleton symbols are great! The problem lies in the NSArray way of storing the sequence of them!

Now, is there a better solution? Well, one obvious theme brought up many times was the old trick of converting the sequence object to a string, doing the stuff that needs to be done, and converting the result back to a sequence object. The benefits are easy to see: chars are smaller and faster to work with, and another plus: many algorithms are already available for strings. We also realized that this would be needed often, and thus would require a general sequence-to-string-and-back implementation. Why not? It's slow, even more slowdowns! True, the conversion time would often be negligible compared to the actual work, but it would still take time.

I always opposed all this quite strongly whenever I could. The idea was simple: if we go for a certain implementation we should eat our own dog food, and it should be so good that it can handle the problems described. Alignments should work natively with BCSequences, and so should reversing, etc. I realized that that was an illusion, and not practical. But now I realize even more that this tells us we were on the wrong track! Our BCSequences could not be used for this; they're not suited for most of the tasks they should perform! We need another implementation.

The credits have to go to Jeff, a graduate student new to BioCocoa who I hope will join the project one day. But from all of the above it should be obvious what to do: we should use strings (or char arrays, to be more precise). Now, to quote Koen: WTF, are we throwing away all the things we did in the past months? No, absolutely not. The idea is simple.
The native way of storing the sequence INSIDE a BCSequence object should not be an NSArray of pointers to symbols, but a char array (or an NSData object as Charles suggested, but let's skip the implementation for now and focus on the idea). The BCSequence object would become a wrapper object around the string as "data store". The benefits are easy to see:

- size is as compact as possible; one could even think of applying classical compression algorithms to make it even smaller.
- the string is always available to any implementation, so:
  - no conversion needed, the string is always there
  - speed, all implementations work with strings, no iterations over NS/CFArrays
  - we can use all existing and standard string-based algorithms, e.g. for alignments, but also for instance standard regular expression libraries for searching, matching, etc.

However, to the OUTSIDE world we ARE (or perhaps better, SEEM) arrays of singleton objects. If the sequence is asked for the symbol at position 18, for instance, we return the singleton object. If they want a subsequence, however, we again return a BCSequence, which internally has its own char array of course. If you think about the number of times you really want the symbol and not, for instance, a sequence, range or annotation, that's not many I think.

The only real downside, I think, is that programming the implementations using strings is somewhat more complex: more C, less Cocoa, more pointer fiddling, fewer enumerators. But since on many occasions we already started doing that to "hack" things faster, and already opted to do the conversions necessary to get to that point, I guess it's not much of a problem. In fact, we can now use many standard char implementations that are already available (and tested). Of course, if speed is not an issue we can still do it the old way, because there is still a way to get the pointer to the symbol for any position.

So far the theory, now part II: implementing the thing....
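
To make the idea above concrete, here is a minimal sketch in plain C (which also compiles as Objective-C). It is an illustration under assumptions, not BioCocoa code: `Sequence`, `Symbol` and the `sequence_*` functions are hypothetical stand-ins for BCSequence and BCSymbol. The char array is the only storage, shared symbol records are looked up on demand, and a subsequence is again a sequence with its own char array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BCSymbol singleton: one shared record per base. */
typedef struct {
    char code;          /* the character stored in the sequence's char array */
    const char *name;   /* illustrative property */
    char complement;    /* illustrative property */
} Symbol;

static const Symbol kSymbols[] = {
    {'A', "adenine",  'T'},
    {'C', "cytosine", 'G'},
    {'G', "guanine",  'C'},
    {'T', "thymine",  'A'},
};

/* Hypothetical stand-in for BCSequence: a thin wrapper around a char array. */
typedef struct {
    char *chars;     /* NUL-terminated backing store */
    size_t length;   /* number of symbols, excluding the NUL */
} Sequence;

/* Returns the shared symbol record for a character, or NULL if unknown. */
const Symbol *symbol_for_char(char c) {
    for (size_t i = 0; i < sizeof kSymbols / sizeof kSymbols[0]; i++)
        if (kSymbols[i].code == c) return &kSymbols[i];
    return NULL;
}

/* Copies `length` characters into a newly allocated sequence. */
Sequence *sequence_create(const char *chars, size_t length) {
    Sequence *seq = malloc(sizeof *seq);
    if (!seq) return NULL;
    seq->chars = malloc(length + 1);
    if (!seq->chars) { free(seq); return NULL; }
    memcpy(seq->chars, chars, length);
    seq->chars[length] = '\0';
    seq->length = length;
    return seq;
}

void sequence_free(Sequence *seq) {
    if (seq) { free(seq->chars); free(seq); }
}

/* To the outside world the sequence still looks like a list of symbols. */
const Symbol *sequence_symbol_at(const Sequence *seq, size_t index) {
    assert(seq != NULL);
    if (index >= seq->length) return NULL;
    return symbol_for_char(seq->chars[index]);
}

/* A subsequence is again a sequence with its own char array. */
Sequence *sequence_subsequence(const Sequence *seq, size_t start, size_t length) {
    assert(seq != NULL);
    if (start > seq->length || length > seq->length - start) return NULL;
    return sequence_create(seq->chars + start, length);
}

/* Read-only access to the string for string-based algorithms. */
const char *sequence_chars(const Sequence *seq) {
    assert(seq != NULL);
    return seq->chars;
}
```

Because the characters are always available, a standard routine such as `strstr(sequence_chars(seq), "TAC")` can search the sequence without any conversion step, while `sequence_symbol_at` still hands out the same shared symbol for every occurrence of a base.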
********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Microsoft is not the answer, Microsoft is the question, NO is the answer ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From biococoa at bioworxx.com Sun Jul 3 08:06:18 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sun, 3 Jul 2005 14:06:18 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: References: Message-ID: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> Am 03.07.2005 um 13:45 schrieb Koen van der Drift: > On Jul 3, 2005, at 6:12 AM, Philipp Seibel wrote: > > >>> I like the parser idea, particularly if it is already written by >>> you ;-) >>> I won't be of any help with C++, though! >>> >> this is no problem, because there is no much need of c++ except >> from expression templates. >> The parser isn't allready written by me sorry ;-), but i'm working >> on it. >> >> I'd like to know the different sequence formats i should >> implement, that i can finish the parsers for the sequence io first. >> > > Check out BCSequenceReader and BCReader (deprecated, but still in > the project for reference) for some formats, just note that these > methods still need a lot of work, annotations and features are > still not completely implemented. Note that there a re already > various parsers out there that could be of help, eg lucegene, > bioperl and more. > > What I don't understand yet is why the sequence parser framework is > at the bottom of your picture. This implies to me that it is the > basis for all the other frameworks? 
> Why not put BCFoundation with BCSymbol and BCSequence there, that
> seems much more logical to me.

In terms of importance you are right, but I wanted to show that you can't use BCFoundation without BCParser, because the BCFoundation-IO depends on the parsers. The same goes for all the other BCXXXFrameworks.

>
>> Very good summary of the discussion.
>> I think we should try to implement the string thing in different
>> ways and test the performance, to see which one is the best.
>>
>>
>
> Maybe you can already post some sample code here, so we can get an
> idea what you have in mind.

Just give me one more week. I started the project only a couple of days ago. Here is a short description:

The base class is BCParser, which can be instantiated with NSData, contentsOfURL, etc. (just like NSXMLParser). The parser has an event-driven architecture, which means that it calls delegate methods whenever it finds specific information in the parsed data (NSXMLParser ;-)). Errors are reported with the NSError class.

Here is a first, unfinished example of the BCGenBankParser delegate methods:

#import <Foundation/Foundation.h>

@class BCParser;

// Informal protocol: a delegate implements only the methods it needs.
@interface NSObject (BCGenBankParserDelegateMethods)

- (void)parserDidBeginSequence:(BCParser *)parser;

// General Information
- (void)parser:(BCParser *)parser foundDefinition:(NSString *)definition;
- (void)parser:(BCParser *)parser foundAccession:(NSString *)accession;
- (void)parser:(BCParser *)parser foundKeywords:(NSArray *)keywords;

// References
- (void)parser:(BCParser *)parser didBeginReferenceForRange:(NSRange)range;
- (void)parser:(BCParser *)parser foundReferenceAuthors:(NSArray *)authors;
- (void)parser:(BCParser *)parser foundReferenceTitle:(NSString *)title;
- (void)parser:(BCParser *)parser foundReferenceJournal:(NSString *)journal;
- (void)parserDidEndReference:(BCParser *)parser;

- (void)parser:(BCParser *)parser foundSequence:(NSString *)sequence;
- (void)parserDidEndSequence:(BCParser *)parser;

@end

That's it for the moment. Feel free to comment on the concept.
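
As an illustration of the event-driven idea described above, here is a small sketch in plain C (it also compiles as Objective-C). It is not the BCParser API: it swaps GenBank for the much simpler FASTA format, and `ParserDelegate`, `parse_fasta`, `Tally` and `tally_delegate` are hypothetical names invented for this example. The parser stores nothing itself; it only reports events through callbacks, so the receiver decides which data structure to build.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical delegate: a set of optional callbacks plus a context pointer. */
typedef struct {
    void *context;
    void (*did_begin_sequence)(void *context, const char *header, size_t length);
    void (*found_sequence)(void *context, const char *residues, size_t length);
    void (*did_end_sequence)(void *context);
} ParserDelegate;

/* Walks the buffer line by line and reports events to the delegate.
   The header and residue pointers are not NUL-terminated; use the lengths. */
void parse_fasta(const char *data, size_t size, const ParserDelegate *delegate) {
    assert(data != NULL && delegate != NULL);
    int in_record = 0;
    size_t pos = 0;
    while (pos < size) {
        const char *line = data + pos;
        const char *newline = memchr(line, '\n', size - pos);
        size_t len = newline ? (size_t)(newline - line) : size - pos;
        pos += len + (newline ? 1 : 0);
        if (len > 0 && line[len - 1] == '\r') len--;   /* tolerate CRLF */
        if (len == 0) continue;                        /* skip blank lines */
        if (line[0] == '>') {
            /* A new header closes the previous record, if any. */
            if (in_record && delegate->did_end_sequence)
                delegate->did_end_sequence(delegate->context);
            in_record = 1;
            if (delegate->did_begin_sequence)
                delegate->did_begin_sequence(delegate->context, line + 1, len - 1);
        } else if (in_record && delegate->found_sequence) {
            delegate->found_sequence(delegate->context, line, len);
        }
    }
    if (in_record && delegate->did_end_sequence)
        delegate->did_end_sequence(delegate->context);
}

/* Example delegate that only counts what it sees. */
typedef struct {
    int begun;
    int ended;
    size_t residues;
} Tally;

static void tally_begin(void *context, const char *header, size_t length) {
    (void)header; (void)length;
    ((Tally *)context)->begun++;
}

static void tally_found(void *context, const char *residues, size_t length) {
    (void)residues;
    ((Tally *)context)->residues += length;
}

static void tally_end(void *context) {
    ((Tally *)context)->ended++;
}

/* Builds a delegate whose callbacks update the given tally. */
ParserDelegate tally_delegate(Tally *tally) {
    ParserDelegate delegate = { tally, tally_begin, tally_found, tally_end };
    return delegate;
}
```

A different delegate could append the reported residues to its own sequence structure instead of counting them, which is the flexibility the delegate design is after.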
cheers,
Phil

> cheers,
>
> - Koen.
>
>

From biococoa at bioworxx.com Sun Jul 3 08:52:15 2005
From: biococoa at bioworxx.com (Philipp Seibel)
Date: Sun, 3 Jul 2005 14:52:15 +0200
Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II
In-Reply-To:
References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com>
Message-ID: <1F61C957-DA34-4BA0-AD4B-17A066FB251E@bioworxx.com>

Hey Alex,

nice to hear something from the Netherlands :-).

On 03.07.2005 at 13:46, Alexander Griekspoor wrote:

> - The class cluster design of sequence objects. Is it still
> necessary, is it correct, flexible? As said, phil and I had
> problems with it, and phil was not sure if it was implemented fully
> correct, but he'll chime in I guess

There is a structural flaw in our implementation: the user thinks he will get a BCSequence object when he calls an init or convenience method of BCSequence, but he gets a BCAbstractSequence. So we have to fix the inheritance model.

> . Purely from complexity towards novel users (and people like me
> who had to get back in again ;-), I would like to see it disappear
> if possible.

Thanks for mentioning the class cluster problem. Here is a short description of my understanding of a class cluster:

A class cluster like NSNumber is opaque to the user of the framework. The user just knows the NSNumber class, but the functionality and content of the object depend on the constructor and initialization used. The concrete subclasses (something like NSDoubleNumber, NSIntegerNumber, etc.) all have private headers, so the user doesn't know these classes exist.

---> In our implementation everything is visible to the user... (public headers for BCSequenceProtein etc.)

I think we should go for a simple superclass/subclass structure so as not to confuse the users of the framework.

cheers,
Phil

From kvddrift at earthlink.net Sun Jul 3 13:14:23 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sun, 3 Jul 2005 13:14:23 -0400
Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II
In-Reply-To: <1F61C957-DA34-4BA0-AD4B-17A066FB251E@bioworxx.com>
References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com> <1F61C957-DA34-4BA0-AD4B-17A066FB251E@bioworxx.com>
Message-ID: <53dfbd287df09d7d3433ef7dbfb55ed8@earthlink.net>

On Jul 3, 2005, at 8:52 AM, Philipp Seibel wrote:

> Hey Alex,
>
> nice to hear something from the Netherlands :-).
>
> On 03.07.2005 at 13:46, Alexander Griekspoor wrote:
>
>> - The class cluster design of sequence objects. Is it still
>> necessary, is it correct, flexible? As said, phil and I had problems
>> with it, and phil was not sure if it was implemented fully correct,
>> but he'll chime in I guess
>
> There is a structural flaw in our implementation: the user thinks
> he will get a BCSequence object when he calls an init or convenience
> method of BCSequence, but he gets a BCAbstractSequence. So we have to
> fix the inheritance model.
>
>> . Purely from complexity towards novel users (and people like me who
>> had to get back in again ;-), I would like to see it disappear if
>> possible.
>
> Thanks for mentioning the class cluster problem. Here is a short
> description of my understanding of a class cluster:
>
> A class cluster like NSNumber is opaque to the user of the
> framework. The user just knows the NSNumber class, but the
> functionality and content of the object depend on the constructor and
> initialization used. The concrete subclasses (something like
> NSDoubleNumber, NSIntegerNumber, etc.) all have private headers, so
> the user doesn't know these classes exist.
>
> ---> In our implementation everything is visible to the user...
> (public headers for BCSequenceProtein etc.)
>
> I think we should go for a simple superclass/subclass structure so as
> not to confuse the users of the framework.
> > It's even confusing to us too, there are still places that directly call BCSequenceDNA, BCSequenceProtein, etc. I've said it before, but I think there should only one sequence class, no matter what type of sequence type we are dealing with. How we implement the different sequence types is open for discussion. A class cluster is fine, but then all headers should be private as Phil mentiones above. I guess this is a good time to revive this discussion. Most of you know my personal preference (the BioJava approach), but other solutions can also be very usable. cheers, - Koen. From kvddrift at earthlink.net Sun Jul 3 13:22:43 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 13:22:43 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa part I In-Reply-To: References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com> Message-ID: On Jul 3, 2005, at 7:55 AM, Alexander Griekspoor wrote: > Hi guys, > > My apologies for not having jumped in earlier, certainly towards Koen > and John I'm sorry, I should have given a summary of the WWDC meeting > much earlier. I understand that all of this comes out of the blue, but > understand us well, this is all still open for debate. In fact I hope > that we will discuss things more elaborate on the list in order to > come up with the best implementation. I'll try to?summarise the topics > discussed at the WWDC and the thoughts behind them below. Thanks Alex, for compiling this summary. It's good to see lively and hopefully fruitful discussion again on the list. > The really only downside I think is the fact that programming the > implementations using strings is somewhat more complex, more c less > cocoa, more pointer fiddling, less enumerators. But since in many > occasions we already started that to "hack" things faster, and already > opted to do the conversions necessary to get at that point, I guess > it's not a problem so much. 
In fact, we can now use many standard char > implementations already available (and tested). Of course, if speed is > not an issue we can still do it the old way because there is still a > way to get the pointer to the symbol for any position. I really like the idea of using an internal char[] for storing the sequence internally, it going to help a lot with implementing a lot of algorithms. Since we are using ObjectiveC, this all comes with the package, so no need to feel you are hacking things. And probably no more need for a BCScanner :) cheers, - Koen. From kvddrift at earthlink.net Sun Jul 3 13:37:54 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 13:37:54 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> Message-ID: <620627f2298a3166e5073f263011895d@earthlink.net> On Jul 3, 2005, at 8:06 AM, Philipp Seibel wrote: > In terms of importance you are right, but i wanted to show that you > can't use BCFoundation without BCParser, because the BCFoundation-IO > depends on the parsers. The same with all other BCXXXFrameworks. > Then the design is wrong :) For me BCFoundation should be the foundation, and only depend on Cocoa's Foundation framework. Why not move the IO classes to one of the BCXXXFrameworks? cheers, - Koen. 
From biococoa at bioworxx.com Sun Jul 3 14:00:22 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sun, 3 Jul 2005 20:00:22 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <620627f2298a3166e5073f263011895d@earthlink.net> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> Message-ID: <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> Am 03.07.2005 um 19:37 schrieb Koen van der Drift: > > On Jul 3, 2005, at 8:06 AM, Philipp Seibel wrote: > > >> In terms of importance you are right, but i wanted to show that >> you can't use BCFoundation without BCParser, because the >> BCFoundation-IO depends on the parsers. The same with all other >> BCXXXFrameworks. >> >> > > Then the design is wrong :) For me BCFoundation should be the > foundation, and only depend on Cocoa's Foundation framework. Why > not move the IO classes to one of the BCXXXFrameworks? No problem for me, but this decision will lead to a many many frameworks: BCParser BCFoundation BCFoundationIO BCXXXFramework BCXXXFrameworkIO but thats ok for me. cheers, Phil -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kvddrift at earthlink.net Sun Jul 3 14:14:15 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 14:14:15 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> Message-ID: <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> On Jul 3, 2005, at 2:00 PM, Philipp Seibel wrote: > > Am 03.07.2005 um 19:37 schrieb Koen van der Drift: > >> >> On Jul 3, 2005, at 8:06 AM, Philipp Seibel wrote: >> >> >>> In terms of importance you are right, but i wanted to show that you >>> can't use BCFoundation without BCParser, because the BCFoundation-IO >>> depends on the parsers. The same with all other BCXXXFrameworks. >>> >>> >> >> Then the design is wrong :) For me BCFoundation should be the >> foundation, and only depend on Cocoa's Foundation framework. Why not >> move the IO classes to one of the BCXXXFrameworks? > No problem for me, but this decision will lead to a many many > frameworks: > > BCParser > > BCFoundation > BCFoundationIO > > BCXXXFramework > BCXXXFrameworkIO > > What's the difference between BCFoundationIO and BCXXXFrameworkIO? - Koen. From biococoa at bioworxx.com Sun Jul 3 14:23:30 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sun, 3 Jul 2005 20:23:30 +0200 Subject: Fwd: [Biococoa-dev] New Structure for BioCocoa References: <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> Message-ID: <3370571B-F4A9-45B7-9F34-0D73988A73E9@bioworxx.com> Again to the mailinglist :-) > >> What's the difference between BCFoundationIO and BCXXXFrameworkIO? >> > > The BCParser.framework also contains Parsers for HMMs, Phylogenetic > structures etc, the BCFoundationIO only uses the Parsers for basic > structures. 
> If we would put all our io in one framework, the framework will > depend on all frameworks (BCFoundation and all BCXXXFrameworks)---- > > not a really good design > > Phil -------------- next part -------------- An HTML attachment was scrubbed... URL: From biococoa at bioworxx.com Sun Jul 3 14:41:47 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sun, 3 Jul 2005 20:41:47 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <23098b19f75b3cded02936047eb24af7@earthlink.net> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> Message-ID: Am 03.07.2005 um 20:30 schrieb Koen van der Drift: > > On Jul 3, 2005, at 2:22 PM, Philipp Seibel wrote: > > >> The BCParser.framework also contains Parsers for HMMs, >> Phylogenetic structures etc, the BCFoundationIO only uses the >> Parsers for basic structures. >> If we would put all our io in one framework, the framework framework == BCFoundationIO (probably solves misunderstanding :-)) >> will depend on all frameworks (BCFoundation and all >> BCXXXFrameworks)----> not a really good design >> > > What I probably do not yet understand is why in the last case the > framework (I assume you mean BioCocoa) will depend on all > BCXXXFrameworks. The idea is like the cocoa framework. The cocoa framework header just consists of #import #import but you can of course only use one of it's components The BioCocoa framework header will contain #import #import #import #import #import < BCXXXIO/ BCXXXIO.h> and you can also use all Frameworks seperately. This allows you only choose the frameworks u really need in your application. 
Phil From kvddrift at earthlink.net Sun Jul 3 14:58:58 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 14:58:58 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> Message-ID: I understand the Cocoa framework structure. But I still don't get *why* the IO seems to be intertwined with the other code (or maybe I am confused by the name BCFoundationIO, BCSequenceIO could be better). To me IO code should just be another framework a user can add. Whatever is done with the outcome from the IO should be independent of the rest of the framework (execpt of course BCFoundation). - Koen. On Jul 3, 2005, at 2:41 PM, Philipp Seibel wrote: > > Am 03.07.2005 um 20:30 schrieb Koen van der Drift: > >> >> On Jul 3, 2005, at 2:22 PM, Philipp Seibel wrote: >> >> >>> The BCParser.framework also contains Parsers for HMMs, Phylogenetic >>> structures etc, the BCFoundationIO only uses the Parsers for basic >>> structures. >>> If we would put all our io in one framework, the framework > > framework == BCFoundationIO (probably solves misunderstanding :-)) > >>> will depend on all frameworks (BCFoundation and all >>> BCXXXFrameworks)----> not a really good design >>> >> >> What I probably do not yet understand is why in the last case the >> framework (I assume you mean BioCocoa) will depend on all >> BCXXXFrameworks. > > The idea is like the cocoa framework. 
The cocoa framework header just > consists of > > #import > #import > > but you can of course only use one of it's components > > The BioCocoa framework header will contain > > #import > > #import > #import > > #import > #import < BCXXXIO/ BCXXXIO.h> > > and you can also use all Frameworks seperately. This allows you only > choose the frameworks u really need in your application. > > Phil > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > From biococoa at bioworxx.com Sun Jul 3 15:12:20 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sun, 3 Jul 2005 21:12:20 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> Message-ID: <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> Am 03.07.2005 um 20:58 schrieb Koen van der Drift: > I understand the Cocoa framework structure. But I still don't get > *why* the IO seems to be intertwined with the other code (or maybe > I am confused by the name BCFoundationIO, BCSequenceIO could be > better). To me IO code should just be another framework a user can > add. Whatever is done with the outcome from the IO should be > independent of the rest of the framework (execpt of course > BCFoundation). Sorry, but i got lost :-). I just meant that we can't put all io of all Frameworks in one IOFramework, because this IOFramework will then depend on all Frameworks ( Foundation ,HMM, Phylogenetics, etc.) because it will use the structure classes of all of them. To solve this problem, we will need a seperate IOFramework for each of the Frameworks. 
Phil From kvddrift at earthlink.net Sun Jul 3 19:20:42 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 19:20:42 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> Message-ID: <722090db0a17da57d2a07a5550c40dfd@earthlink.net> On Jul 3, 2005, at 3:12 PM, Philipp Seibel wrote: > Sorry, but i got lost :-). I just meant that we can't put all io of > all Frameworks in one IOFramework, because this IOFramework will then > depend on all Frameworks ( Foundation ,HMM, Phylogenetics, etc.) > because it will use the structure classes of all of them. To solve > this problem, we will need a seperate IOFramework for each of the > Frameworks. > Aha, but now I understand :) I didn't realize that you were assuming different data structures for each data format. Maybe we should put some effort in creating a general internal data format that will also allow easy exchange between the various formats. cheers, - Koen. 
From kvddrift at earthlink.net Sun Jul 3 20:58:19 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 20:58:19 -0400 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <722090db0a17da57d2a07a5550c40dfd@earthlink.net> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> <722090db0a17da57d2a07a5550c40dfd@earthlink.net> Message-ID: <48b4608044432599a7819a3ae9d7f6b4@earthlink.net> > Aha, but now I understand :) I didn't realize that you were assuming > different data structures for each data format. Maybe we should put > some effort in creating a general internal data format that will also > allow easy exchange between the various formats. > Just as a follow up, have a look how the bioperl folks approach this: and . Not that I am implying that should be the way to do it, but as far as I can see, they just use one general IO class that takes care off all the IO and conversions. - Koen. From kvddrift at earthlink.net Sun Jul 3 21:58:16 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 3 Jul 2005 21:58:16 -0400 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com> Message-ID: <0bb7728ac479dfcd386a79e8a0258288@earthlink.net> On Jul 3, 2005, at 7:46 AM, Alexander Griekspoor wrote: > I hope you are still hanging in there. Funny, I received this before part I :) > The singleton system remains almost unaltered, I think it works really > nice and has the properties that we want to present to the outside > world right? Actually it is a combination of the singleton and flyweight patterns. 
But yes, I agree we should keep that in place.

>
> - What about symbolsets and the creation of new sequences?

Symbolsets are in my opinion very useful, not only as an indicator of what type of sequence one is dealing with, but also as a data filter.

> - The class cluster design of sequence objects. Is it still necessary,
> is it correct, flexible? As said, phil and I had problems with it, and
> phil was not sure if it was implemented fully correct, but he'll chime
> in I guess. Purely from complexity towards novel users (and people
> like me who had to get back in again ;-), I would like to see it
> disappear if possible.

Unless we hide the complete implementation and expose only the BCSequence header, it will stay as confusing as it has been until now. For instance, I am still switching between BCSequence and BCAbstractSequence to see where a method can be found.

>
> - Do we need subclasses for certain sequence types like proteins,
> dna, rna? I guess yes. Do we need a class cluster for this or does
> simple subclassing do?

You know my standpoint on this one :)

>
> - How to handle the main char array? An NSData object like charles
> suggested, or do-it-your-self management?

If we use NSData, then for each operation we need to get and set the data, so two extra steps, correct? Since we are using Objective-C we can use the C string directly, although that would indeed bring the complexity of malloc, string pointers, etc. It's good I didn't throw away my Simple C beginner's guide book!

> - How accessible is the c-array from the outside? Read only? It must
> be readable at least so for instance an alignment object can use the
> string for its work (à la nsstring's getCString method). I would not
> encourage direct changing of the char array from within another
> object, for instance to prevent out-of-sync problems with annotation
> positions. For that we have things like -insertSequence:atPosition:
> etc

Agree.

> Thanks for keeping BioCocoa going Koen, great job!
Again, I'm sorry to > not have jumped in earlier, and I hope you don't have the feeling to > be let outside of everything, all the above is not a definitive thing > at all. Looking forward to everyone's reaction! Not left outside, but a little bit shocked, I was actually :) But these new developments certainly seem to be a good step forwards. Hopefully this will be the final re-design! cheers, - Koen. ps any chance you can post/mail the wwdc presentation? From charles.parnot at gmail.com Mon Jul 4 01:27:52 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 3 Jul 2005 22:27:52 -0700 Subject: [Biococoa-dev] About BCSequence (again ;-) Message-ID: I mostly agree with all the discussions and ideas. Of course, there is one point I want to make clear again: the BCSequence thing! And I will jump right to it!!! I made a diagram to illustrate my points: -------------- next part -------------- A non-text attachment was scrubbed... Name: Picture 1.png Type: image/png Size: 22559 bytes Desc: not available URL: -------------- next part -------------- Sorry, I renamed BCSequenceAbstract into BCSequenceRoot, because I just thought of that name and it sounds better... First, there is no such thing as a class cluster in the current implementation. As Phil points out, all the subclasses are exposed. The BCSequence indeed uses a trick similar to class cluster, it is a placeholder class. The implementation is thus the confusing part. It is not very elaborate, though, and once you get past the point where the alloc-ed object can still change class at init, you have it all ;-) Second, the basic idea of having BCSequence along the other objects is to provide two different types of classes: typed and not typed. Historically, the group was divided about 50/50 between the two options. I was on the side of having a one-for-all class that can be blindly sent any message and do something. 
I know Koen is also in favor of a unique sequence object that include all the possible types. In the end, we decided to keep both typed sequences and the one-for-all BCSequence. With a diagram like above, it is fairly easy to show the use of the different objects, no? Finally, I just want to add that BCSequence does not require additional code and will probably not even need rewriting when we change the internals of the sequence classes, and should not require much in the future. In fact, I volunteer to maintain that code if we keep it! Not writing any code is something I can easily promise ;-) cheers all! charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 01:48:18 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 3 Jul 2005 22:48:18 -0700 Subject: [Biococoa-dev] (no subject) Message-ID: <54DDCD35-27DC-4729-BE33-8D65885F0907@gmail.com> > There is a structural failure in our implementation, the user > thinks he will get a BCSequence object, when he calls a init or > convenient method of BCSequence, but he gets a BCAbstractSequence. > So we have to fix the inheritance model. Hi Phil, Can you clarify this? I don't understand what you mean :-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From biococoa at bioworxx.com Mon Jul 4 04:01:47 2005
From: biococoa at bioworxx.com (Philipp Seibel)
Date: Mon, 4 Jul 2005 10:01:47 +0200
Subject: [Biococoa-dev] (no subject)
In-Reply-To: <54DDCD35-27DC-4729-BE33-8D65885F0907@gmail.com>
References: <54DDCD35-27DC-4729-BE33-8D65885F0907@gmail.com>
Message-ID: <89C9F041-BE6B-4D47-93AB-1930ECF285A8@bioworxx.com>

On 04.07.2005, at 07:48, Charles Parnot wrote:

>> There is a structural failure in our implementation, the user
>> thinks he will get a BCSequence object, when he calls a init or
>> convenient method of BCSequence, but he gets a BCAbstractSequence.
>> So we have to fix the inheritance model.
>
> Hi Phil,
> Can you clarify this? I don't understand what you mean :-)
>

Hey Charles,

Sure. We have to make one class out of BCSequence and BCAbstractSequence; I would suggest naming it BCSequence.

And now to the big "WHY??":

The problem is that a user writes in his code:

    BCSequence *seq = [[BCSequence alloc] initWithString:anySeqString];

seq will be, for example, of type BCSequenceRNA, but BCSequenceRNA doesn't inherit from BCSequence. Objective-C allows our current implementation, but it is a structural failure. We need to subclass BCSequenceNucleotide etc. from BCSequence and put the code of BCAbstractSequence in BCSequence.

It's hard to explain; the easiest thing is to look at the NSValue & NSNumber implementation of GNUStep. If you look at the implementation and then read my message again ;-) you will hopefully understand.

I hope that helps :-), although I'm pretty sure it is confusing.

Phil
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From mek at mekentosj.com Mon Jul 4 04:50:46 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 4 Jul 2005 10:50:46 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <722090db0a17da57d2a07a5550c40dfd@earthlink.net> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> <722090db0a17da57d2a07a5550c40dfd@earthlink.net> Message-ID: <307A5FA4-2147-472A-89D4-61A80BC7BFEB@mekentosj.com> About the IO discussion. If we want a separate IO framework(s), I would suggest to keep all IO in one framework, the examples you give are all sequence related, so would depend only on the core framework (BCFoundation) right. If say the HMM framework would need some kind of special structure, than we can always opt to either define the structure in the foundation, but put all other stuff in the HMM framework, or place the HMM specific IO in the HMM framework as it is probably consists of only a few methods anyway. But my main question is whether we really want to separate the IO framework as being independent. BCFoundation is so much related with IO that I don't really see a need to separate them. Most important, I don't think the IO will be so much code that it will be a large framework, or that it will make the foundation so bloated. It's two files right now! So why don't we make BCFoundation contain the IO like it does now. And the HMM framework contains its IO etc. IMHO a basic core that defines sequences and can IO them doesn't have to be separated in two, I see it as one core. Similar to the fact that NSString can also IO them without the need of an IO framework. 
Cheers, Alex On 4-jul-2005, at 1:20, Koen van der Drift wrote: > > On Jul 3, 2005, at 3:12 PM, Philipp Seibel wrote: > > >> Sorry, but i got lost :-). I just meant that we can't put all io >> of all Frameworks in one IOFramework, because this IOFramework >> will then depend on all Frameworks ( Foundation ,HMM, >> Phylogenetics, etc.) because it will use the structure classes of >> all of them. To solve this problem, we will need a seperate >> IOFramework for each of the Frameworks. >> >> > > Aha, but now I understand :) I didn't realize that you were > assuming different data structures for each data format. Maybe we > should put some effort in creating a general internal data format > that will also allow easy exchange between the various formats. > > cheers, > > - Koen. > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com LabAssistant - Get your life organized! http://www.mekentosj.com/labassistant ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Mon Jul 4 05:12:46 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 4 Jul 2005 11:12:46 +0200 Subject: [Biococoa-dev] About BCSequence (again ;-) In-Reply-To: References: Message-ID: <4E0ADBA1-FA84-4F3B-9D3D-F7C24150BC0E@mekentosj.com> See, that's what I mean, the diagram doesn't make sense to me already! What's the difference between bcsequence and bcsequencenucleotide. 
The implementation is even harder to get. Let's not throw everything away, but I would like to see how practically the one-for-all bcsequence would work. In addition, I have the feeling (but might be terribly wrong) that annotations are way simpler to implement if there's only one kind of sequence object. If I understand it right, a BCSequence object then becomes "typed" on the basis of the BCSequenceSet it has been assigned, right?

Koen or Charles, could you do a proposal on what the BC world would look like if we choose this model of the general BCSequence object? Could you perhaps speculate/describe:
1 how you create such objects
2 what role sequence sets play in this
3 how one would be able to get its "type"
4 how you would do a simple reverse and what would be returned (for DNA and Protein)
5 how you would do a translation from DNA to Protein
6 how annotations would fit in?

Pfeww, all our problems solved ;-) Just kidding, but if we can agree that these problems can be tackled nicely with a single BCSequence object in a way logical to ourselves and new users, IMHO I think we have our design...
Cheers,
Alex

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

iRNAi, do you?
http://www.mekentosj.com/irnai ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Mon Jul 4 05:13:50 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 4 Jul 2005 11:13:50 +0200 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: <0bb7728ac479dfcd386a79e8a0258288@earthlink.net> References: <9731795D-D751-4807-811F-28E2182A7B60@gmail.com> <0bb7728ac479dfcd386a79e8a0258288@earthlink.net> Message-ID: >> Thanks for keeping BioCocoa going Koen, great job! Again, I'm >> sorry to not have jumped in earlier, and I hope you don't have the >> feeling to be let outside of everything, all the above is not a >> definitive thing at all. Looking forward to everyone's reaction! >> > > Not left outside, but a little bit shocked, I was actually :) But > these new developments certainly seem to be a good step forwards. > Hopefully this will be the final re-design! Well, I hope it will be, but it would be foolish to say that it is ;-) At least we all agree it makes sense. Which brings me to one more discussion. Unfortunately, that definitely requires John to be there (John are you listening? ;-) We had the discussion a few times and I again think it's time to revive it once more: The typed vs untyped sequence. The reason for the hybrid class cluster / superclass structure we have now. I was always in favor of the latter, but I did see that this way it's to confusing to us and certainly to new users. I've said that I would like to see it disappear. So, if that means that I have to adjust my opinion, and also that I have to do more error checking in my code instead of the framework, then that is what I have to do. Again, I want to see John's opinion on this first, but in return for simplicity shall we then indeed implement things first using the one BCSequence type does it all method and see how well it works? 
What do you guys think? Alex > > ps any chance you can post/mail the wwdc presentation? Peter, can you arrange that? I only have the two halves, I can put them on our website if you send it to me... ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com The requirements said: Windows 2000 or better. So I got a Macintosh. ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.schols at bio.kuleuven.be Mon Jul 4 05:23:13 2005 From: peter.schols at bio.kuleuven.be (Peter Schols) Date: Mon, 4 Jul 2005 11:23:13 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <307A5FA4-2147-472A-89D4-61A80BC7BFEB@mekentosj.com> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> <722090db0a17da57d2a07a5550c40dfd@earthlink.net> <307A5FA4-2147-472A-89D4-61A80BC7BFEB@mekentosj.com> Message-ID: <37FE4498-9E8D-4711-AAE5-674679C5ADEE@bio.kuleuven.be> I fully agree with Alex. It seems most logical to me if IO would be part of BCFoundation. peter
From biococoa at bioworxx.com Mon Jul 4 05:36:06 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Mon, 4 Jul 2005 11:36:06 +0200 Subject: [Biococoa-dev] New Structure for BioCocoa In-Reply-To: <37FE4498-9E8D-4711-AAE5-674679C5ADEE@bio.kuleuven.be> References: <77E081F9-D1E6-442A-88DC-501E723510FE@bioworxx.com> <620627f2298a3166e5073f263011895d@earthlink.net> <5494581C-5B85-48C1-A75A-8D96F21BBA3F@bioworxx.com> <06bb29ed89ff0b3d754dff0598b45d48@earthlink.net> <4E3322DA-8FC0-4873-87FC-66F16F979EBE@bioworxx.com> <23098b19f75b3cded02936047eb24af7@earthlink.net> <79E869D6-B74D-4190-B31F-1711D88990A2@bioworxx.com> <722090db0a17da57d2a07a5550c40dfd@earthlink.net> <307A5FA4-2147-472A-89D4-61A80BC7BFEB@mekentosj.com> <37FE4498-9E8D-4711-AAE5-674679C5ADEE@bio.kuleuven.be> Message-ID: Ok this will lead us to the structure of the diagram i sent arround, right ? Or do you want to put the BCParser.framework also into the BCFoundation ?
I think it would be better to have the BCParser.framework separate, because one could use it to read the files into other structures than the bcfoundation structures. This will of course cause a dependency between the bcfoundation and bcparser frameworks.

.... the next one please ;-)

Phil

From jtimmer at bellatlantic.net Mon Jul 4 13:26:10 2005
From: jtimmer at bellatlantic.net (John Timmer)
Date: Mon, 04 Jul 2005 13:26:10 -0400
Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II
In-Reply-To:
Message-ID:

>>
>
> Well, I hope it will be, but it would be foolish to say that it is ;-) At
> least we all agree it makes sense. Which brings me to one more discussion.
> Unfortunately, that definitely requires John to be there (John are you
> listening? ;-) We had the discussion a few times and I again think it's time
> to revive it once more: The typed vs untyped sequence. The reason for the
> hybrid class cluster / superclass structure we have now. I was always in favor
> of the latter, but I did see that this way it's to confusing to us and
> certainly to new users. I've said that I would like to see it disappear. So,
> if that means that I have to adjust my opinion, and also that I have to do
> more error checking in my code instead of the framework, then that is what I
> have to do. Again, I want to see John's opinion on this first, but in return
> for simplicity shall we then indeed implement things first using the one
> BCSequence type does it all method and see how well it works? What do you guys
> think?

Wow, just got back from a couple of days relaxing away from the city, and I see things have been very busy. Anyway, as I never planned on taking advantage of the class cluster features, I have to admit I hadn't realized there were problems with it.
I too can see how it might be confusing to new users, but I don't see the subclass design as confusing in the same way, so I think the class cluster issue and the subclass issue are separate ones.

Given that we're doing a re-design, I think it would be a good time to restate what the issues are regarding the single vs. typed class design as they currently stand. So, Koen, could you state what your current thoughts are regarding what's wrong with the current subclass structure, and what advantages a single class would bring.

My current reasons for wanting typed subclasses (I can probably think of more, given the time):

Error issues:
- Allows users to catch errors at compile time, rather than having errors pop up while running.
- Corresponding reduction in the amount of code required to catch or prevent errors.
- We'd have to make many decisions regarding what are appropriate return values for nonsensical method calls.
(These three are specific to issues like asking a protein for its melting point, or asking RNA about its hydrophobicity - the unexpected return values can potentially crash a program.)

Object oriented design issues:
- Without typing, methods (both convenience and complete methods) wind up grouped with data they can't operate sensibly on.
- Subclassing allows specific types of data/methods to be grouped by the type of sequence they pertain to.
- Without type, the objects become very "stupid", performing little more than what an NSArray already does.
- For generic sequence operations, users can still use a generic class - the superclass.

In the end, we all agree that a sequence has to be typed on some level, so the user can tell what sequence type they have. I don't see why doing it at the class level, which allows much better organization and object oriented design, should be viewed as a problem.

JT
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From kvddrift at earthlink.net Mon Jul 4 14:11:40 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 4 Jul 2005 14:11:40 -0400
Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II
In-Reply-To:
References:
Message-ID:

On Jul 4, 2005, at 1:26 PM, John Timmer wrote:

> Wow, just got back from a couple of days relaxing away from the city,
> and I see things have been very busy. Anyway, as I never planned on
> taking advantage of the class cluster features

I think that is actually one of the problems with the design. The way that I thought it should be used is that we only should use BCSequence, and the internal code would figure out what kind of sequence we are dealing with. However you seem to be using the subclasses directly. I am not saying that one approach is better than the other, but if it is even confusing or unclear for the BioCocoa developers, I can imagine how confusing it could be for users of the framework!

> So, Koen, could you state what your current thoughts are regarding
> what's wrong with the current subclass structure, and what advantages
> a single class would bring.

For the record, I didn't initiate the current discussion about the design of the BCSequence classes. I think it was a conspiracy from the guys who went to wwdc :)

Anyway, my main gripe with the subclasses was that when we first started implementing the code, we were duplicating a lot of code over all subclasses. When using OOP, in such cases it is very beneficial to move all the similar code to a common superclass. And in fact most of the code was similar and it turned out that only in a few cases this is not possible, exactly for the cases that you describe. However, by using symbolsets (aka alphabets) I think we can have some internal typing that can avoid these problems. As I said before this is also the approach of our siblings bioperl, biopython and biojava.
For instance, biojava uses sequence specific classes that are used to perform operations, eg (in javacode): Sequence dna = DNATools.createDNASequence("atgctg", "dna_1"); or: IsoelectricPointCalc ic = new IsoelectricPointCalc(); pI = ic.getPI(protein, true, true); The code throws a exception when the wrong type of sequence is used. Indeed not an ideal situation, since as you stated correctly, it might crash the program. And to be honest, I have currently no solution to prevent that. cheers, - Koen. From jtimmer at bellatlantic.net Mon Jul 4 15:32:33 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Mon, 04 Jul 2005 15:32:33 -0400 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: Message-ID: > but if it is > even confusing or unclear for the BioCocoa developers, I can imagine > how confusing it could be for users of the framework! I wasn't saying that I was confused, just that (for the reasons my post detailed), I was going to use the subclasses directly no matter what, so I didn't pay careful attention. Of course, if I had paid attention, perhaps I would have been confused.... > For the record, I didn't initiate the current discussion about the > design of the BCSequence classes. I think it was a conspiracy from the > guys who went to wwdc :) I wasn't accusing! I just know that you seem to have felt the strongest about this, so I asked for your opinion specifically. > Anyway, my main gripe with the subclasses was that when we first > started implementing the code, we were duplicating a lot of code over > all subclasses. Yes, that situation was my fault - it took me forever to go back and clean up the nucleotide subclasses. I'd get so much more done in BioCocoa if I didn't have to keep doing research! Anyway, it's fixed now, so I hope you gripe less. > IsoelectricPointCalc ic = new IsoelectricPointCalc(); > pI = ic.getPI(protein, true, true); > > The code throws a exception when the wrong type of sequence is used. 
> Indeed not an ideal situation, since as you stated correctly, it might > crash the program. And to be honest, I have currently no solution to > prevent that. There's two other problems with that approach - one is that ObjC has two different types of exception mechanisms, and the better of them isn't available on 10.2, which is what we're targeting right now. The other is a general philosophical one: some people feel strongly that you should never code planning to throw an exception - they should be reserved for bad situations that are beyond your control (damaged bundle, network failure, etc.). We may, already or at some point in the future, wind up with one of those people developing with us, which would mean they object to that sort of method on principle. Which means more arguments. Anyway, does anyone know why the BioOther groups chose a single class for sequences? And how does that affect what functions are performed by sequences themselves? I'm just wondering if there was an actual discussion of design options along the lines of what we're having when those were set up. JT From kvddrift at earthlink.net Mon Jul 4 16:00:15 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 4 Jul 2005 16:00:15 -0400 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: References: Message-ID: <85a35917227d1be3aca7459a338b0457@earthlink.net> On Jul 4, 2005, at 3:32 PM, John Timmer wrote: > Anyway, does anyone know why the BioOther groups chose a single class > for > sequences? And how does that affect what functions are performed by > sequences themselves? I'm just wondering if there was an actual > discussion > of design options along the lines of what we're having when those were > set > up. > I actually wrote an email to one of the biojava folks earlier today. Their list archives didn't reveal much. Let's see what he replies :) - Koen. 
From charles.parnot at gmail.com Mon Jul 4 16:03:32 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 13:03:32 -0700 Subject: [Biococoa-dev] (no subject) In-Reply-To: <89C9F041-BE6B-4D47-93AB-1930ECF285A8@bioworxx.com> References: <54DDCD35-27DC-4729-BE33-8D65885F0907@gmail.com> <89C9F041-BE6B-4D47-93AB-1930ECF285A8@bioworxx.com> Message-ID: <55E06CA6-5CC5-4C58-95AA-0CE04165F489@gmail.com> On Jul 4, 2005, at 1:01 AM, Philipp Seibel wrote: > > On 04.07.2005, at 07:48, Charles Parnot wrote: > >>> There is a structural failure in our implementation, the user >>> thinks he will get a BCSequence object, when he calls an init or >>> convenience method of BCSequence, but he gets a >>> BCAbstractSequence. So we have to fix the inheritance model. >> >> Hi Phil, >> Can you clarify this? I don't understand what you mean :-) >> > > Hey Charles, > > Sure. We have to make one class out of BCSequence and > BCAbstractSequence, I would suggest naming it BCSequence. > And now to the big "WHY??": > > The problem is that a user writes in his code: > > BCSequence *seq = [[BCSequence alloc] initWithString:anySeqString]; > > seq will be for example of type BCSequenceRNA, but BCSequenceRNA > doesn't inherit from BCSequence. > Objective-C allows our current implementation, but it is a > structural failure. We need to subclass the BCSequenceNucleotide > etc. from BCSequence and put the code of BCAbstractSequence in > BCSequence. I don't see why BCSequenceRNA *has* to inherit from BCSequence. Like you say, the implementation works fine and is transparent to the user. You choose to use a BCSequence object that can respond to all messages and be any type of sequence. Or you choose to work with a typed class, e.g. BCSequenceRNA, that only responds to nucleotide-specific messages (at least from the compiler's point of view). > It's hard to explain, the easiest thing is to look at the NSValue & > NSNumber implementation of GNUStep.
> If you looked at the implementation and then read my message > again ;-) you will hopefully understand. I think there is a misunderstanding, then! The BCSequence is *not* a class cluster. NSValue and NSNumber are class clusters. I believe you will have more questions following that, but I will just wait for them instead of trying to address everything now ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 16:32:31 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 13:32:31 -0700 Subject: [Biococoa-dev] About BCSequence (again ;-) Message-ID: <8A063BFA-F4F8-4D86-830B-07D15172854E@gmail.com> On Jul 4, 2005, at 2:12 AM, Alexander Griekspoor wrote: > See, that's what I mean, the diagram doesn't make sense to me > already! What's the difference between bcsequence and > bcsequencenucleotide. > BCSequence can be of any type and can respond to any message, even type-specific messages like '-complement' or '-hydrophobicity'. BCSequenceNucleotide can only be DNA and RNA and will only respond to nucleotide-specific messages (from the compiler point of view). charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 16:41:26 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 13:41:26 -0700 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: References: Message-ID: <5E8A8154-4BF4-4985-8E4E-4AE8DF4E2D4A@gmail.com> On Jul 4, 2005, at 12:32 PM, John Timmer wrote: > > >> but if it is >> even confusing or unclear for the BioCocoa developers, I can imagine >> how confusing it could be for users of the framework! 
I thought it was clear for all of us, and this is what I tried to articulate in the header for the BCAbstractSequence. You have two options for BCSequence classes: * a bunch of typed sequence classes * a non-typed class BCSequence that can be of any type and respond to any message Once you choose one option, it is probably easier to stick to it. > I wasn't saying that I was confused, just that (for the reasons my > post > detailed), I was going to use the subclasses directly no matter > what, so I > didn't pay careful attention. Of course, if I had paid attention, > perhaps I > would have been confused.... This is exactly the right approach (and I am not kidding!). Pick one option and forget about the other option. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 16:42:48 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 13:42:48 -0700 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: References: Message-ID: On Jul 4, 2005, at 12:32 PM, John Timmer wrote: > There's two other problems with that approach - one is that ObjC > has two > different types of exception mechanisms, and the better of them isn't > available on 10.2, which is what we're targeting right now. The > other is a > general philosophical one: some people feel strongly that you > should never > code planning to throw an exception - they should be reserved for bad > situations that are beyond your control (damaged bundle, network > failure, > etc.). We may, already or at some point in the future, wind up > with one of > those people developing with us, which would mean they object to > that sort > of method on principle. Which means more arguments. I totally agree. NSException should be only for exceptional cases. 
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 16:57:45 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 13:57:45 -0700 Subject: [Biococoa-dev] About BCSequence (again ;-) In-Reply-To: <4E0ADBA1-FA84-4F3B-9D3D-F7C24150BC0E@mekentosj.com> References: <4E0ADBA1-FA84-4F3B-9D3D-F7C24150BC0E@mekentosj.com> Message-ID: <8A02BC1D-AF83-4053-8DFC-4EDD3151DF94@gmail.com> On Jul 4, 2005, at 2:12 AM, Alexander Griekspoor wrote: > If I understand it right, a BCSequence object then becomes "typed" > on the basis of the BCSequenceSet it has been assigned, right? Koen > or Charles, could you do a proposal on what the BC world would look > like if we choose this model of the general BCSequence object. > Could you perhaps speculate/describe: > > 1 how you create such objects > 2 what role sequence sets play in this > 3 how one would be able to get its "type" > 4 how you would do a simple reverse and what would be returned (for > DNA and Protein) > 5 how you would do a translation from DNA to Protein > 6 how annotations would fit in? Just as an exercise, I will briefly answer these questions: 1. The objects are created by the IO or by passing a string: - the type might be guessed - the type might be passed as argument, in which case the default BCSymbolSet is used (with all the ambiguous symbols) - a symbol set is passed as argument, which restricts the symbols that can be used 2. Symbol sets (I think that is what you meant?) are used to filter the sequence symbols. They are also used to translate from char to BCSymbol, as the meaning of a char depends on the symbol set, e.g. 'A' can be an alanine or an adenosine. The symbol set is the best object to ask. For performance, we might end up having a char/BCSymbol array as one of the ivars. 3. The type is set by an ivar.
The BCSymbolSet also has a type and has to be the same. This is the current implementation. 4. To get the reverse, you take all the chars and put them backwards. Ah! OK, I guess you meant 'complement'. If not DNA, return self. Otherwise, you get the array of symbols, and ask each symbol its complement, and then create the array of chars. 5. The current code is fine. I would however make BCCodon a subclass of BCSymbol for consistency. 6. Annotations are independent of sequence type. OK, we are done, let's call it version 1.0 ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 17:18:47 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 14:18:47 -0700 Subject: [Biococoa-dev] -test-test-test-...sorry Message-ID: Sorry, I changed my smtp server, not using the gmail smtp, to see if I can get my own messages this way... -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 17:29:39 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 14:29:39 -0700 Subject: [Biococoa-dev] Typed or non-typed... or both Message-ID: <77B52F86-1C2A-4776-BC11-137AE26B09C4@gmail.com> Here are my latest thoughts about the typed vs non-typed discussion. We have 3 options: * only typed sequence classes * only one untyped sequence class * both at the same time I thought we could go with option 3 for a while before we decide (that was the latest consensus), but if it is too confusing now and it stops further development, then we should just drop that option. 
Because several of you really want typed classes, and you are the ones most likely to use the framework, it would be suicidal to not keep the typed sequence classes and already frustrate the potential users of the framework (and even worse, the developers!). The conclusion is quite obvious to me ;-) charles NB: for the sake of discussion, I will of course continue to post a few examples of where I think a single BCSequence interface is useful NB2: my smtp server trick worked! heeiyaa! -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Mon Jul 4 17:31:26 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Mon, 04 Jul 2005 17:31:26 -0400 Subject: [Biococoa-dev] About BCSequence (again ;-) In-Reply-To: <8A02BC1D-AF83-4053-8DFC-4EDD3151DF94@gmail.com> Message-ID: > 5. The current code is fine. I would however make BCCodon a subclass > of BCSymbol for consistency. We thought about this in the past - codons are very tricky, because they have both an amino acid component and a sequence component. They don't fit very conveniently into the Symbol model. I think it might be easier to remove the BCCodonSequence class from the sequence inheritance - it's mostly there to prevent duplication of code. JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Mon Jul 4 17:35:05 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 4 Jul 2005 17:35:05 -0400 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: <5E8A8154-4BF4-4985-8E4E-4AE8DF4E2D4A@gmail.com> References: <5E8A8154-4BF4-4985-8E4E-4AE8DF4E2D4A@gmail.com> Message-ID: On Jul 4, 2005, at 4:41 PM, Charles Parnot wrote: > This is exactly the right approach (and I am not kidding!). Pick one > option and forget about the other option.
> But right now we are using both options within BioCocoa - which is what I referred to as the confusion. - Koen. From charles.parnot at gmail.com Mon Jul 4 17:39:16 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 14:39:16 -0700 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: References: <5E8A8154-4BF4-4985-8E4E-4AE8DF4E2D4A@gmail.com> Message-ID: <0A38858D-BFD8-4F7A-A93E-62AE8D2112C5@gmail.com> >> This is exactly the right approach (and I am not kidding!). Pick >> one option and forget about the other option. >> >> > > But right now we are using both options within BioCocoa - which is > what I referred to as the confusion. > > - Koen. OK, got it! Where? In the examples? charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 4 17:55:50 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 4 Jul 2005 14:55:50 -0700 Subject: [Biococoa-dev] Using an untyped class sequence Message-ID: <453BD6BD-763A-4089-A703-2301D702CE94@gmail.com> Here is one situation where an untyped sequence class can be simpler to use. You have a BCSequence displayed in a nice BCSequenceView. The user chooses 'Reverse Complement' in the toolbar of the sequence window (the delegate is thus first responder and is self below). You want to create a new sequence from it.

- (IBAction)reverseComplement:(id)sender
{
    BCSequence *initialSequence = [self sequence];
    [MyDocument newSequenceWindowWithSequence:[initialSequence reverseComplement]];
}

Now, I tried to think of how to implement that with typed sequences, though the overall architecture of the app might end up being quite different.
- (IBAction)reverseComplement:(id)sender
{
    BCSequence *initialSequence = [self selectedSequence];
    BCSequence *newSequence;
    if ( [initialSequence type] == BCSequenceDNA )
        newSequence = [(BCDNASequence *)initialSequence reverseComplement]; // cast to avoid compiler warnings
    else if ( [initialSequence type] == BCSequenceRNA )
        newSequence = [(BCRNASequence *)initialSequence reverseComplement]; // cast to avoid compiler warnings
    else
        return;
    [MyDocument newSequenceWindowWithSequence:newSequence];
}

I know it is a contrived example, but I am just thinking of a general sequence view program à la DNAStrider. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Mon Jul 4 18:02:09 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Mon, 04 Jul 2005 18:02:09 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <453BD6BD-763A-4089-A703-2301D702CE94@gmail.com> Message-ID: Ah, but good user interface would dictate that the menu item enabling method should check the resident sequence type and disable non-functional menu items, such as reverse complement, when a protein is loaded ;). > Here is one situation where an untyped sequence class can be simpler > to use. You have a BCSequence displayed in a nice BCSequenceView. The > user chooses 'Reverse Complement' in the toolbar of the sequence > window (the delegate is thus first responder and is self below). You > want to create a new sequence from it. > > - (IBAction)reverseComplement:(id)sender > { > BCSequence *initialSequence = [self sequence]; > [MyDocument newSequenceWindowWithSequence:[initialSequence > reverseComplement]]; > } > > > Now, I tried to think of how to implement that with typed sequences, > though the overall architecture of the app might end up being quite > different.
> > - (IBAction)reverseComplement:(id)sender > { > BCSequence *initialSequence = [self selectedSequence]; > BCSequence *newSequence; > if ( [initialSequence type] == BCSequenceDNA ) > newSequence = [(BCDNASequence *)initialSequence > reverseComplement]; //cast to avoid compiler warnings > else if ( [initialSequence type] == BCSequenceRNA ) > newSequence = [(BCRNASequence *)initialSequence > reverseComplement]; //cast to avoid compiler warnings > else > return; > [MyDocument newSequenceWindowWithSequence:newSequence]; > } > > I know it is a contrived example, but I am just thinking of a general > sequence view program à la DNAStrider. > > charles > > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Mon Jul 4 19:44:24 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 4 Jul 2005 19:44:24 -0400 Subject: Subject: [Biococoa-dev] New Structure for BioCocoa part II In-Reply-To: <0A38858D-BFD8-4F7A-A93E-62AE8D2112C5@gmail.com> References: <5E8A8154-4BF4-4985-8E4E-4AE8DF4E2D4A@gmail.com> <0A38858D-BFD8-4F7A-A93E-62AE8D2112C5@gmail.com> Message-ID: On Jul 4, 2005, at 5:39 PM, Charles Parnot wrote: > > OK, got it! Where? In the examples? > > In the examples I think I already replaced everything with BCSequence. But eg in BCCodon and BCGeneticCode there are still a lot of typed sequences. - Koen.
From kvddrift at earthlink.net Mon Jul 4 20:35:09 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 4 Jul 2005 20:35:09 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: On Jul 4, 2005, at 6:02 PM, John Timmer wrote: > Ah, but good user interface would dictate that the menu item enabling > method > should check the resident sequence type and disable non-functional menu > items, such as reverse complement, when a protein is loaded ;). > Which to me is the responsibility of the app-developer, not of BioCocoa. And even with a singly-typed sequence, we still have the symbolset to check for and adjust the GUI accordingly. - Koen. From mek at mekentosj.com Tue Jul 5 03:15:41 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Tue, 5 Jul 2005 09:15:41 +0200 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: <8DF78C02-4B04-4F94-BAC5-6E02B9738CB4@mekentosj.com> Nice discussion! But indeed the question which hasn't been answered by both pros and contras, can the symbolset provide the error checking and return value customisation to such a level that we don't need the typed sequences. Note that we could provide much of this in the BCSequence methods and I personally think it would be possible, which would give us one unified interface to BCSequence. What do you think? Charles did I now get it right, that you actually are thinking of only the typed versions? What's your viewpoint on this? Cheers, alex On 5-jul-2005, at 2:35, Koen van der Drift wrote: > > On Jul 4, 2005, at 6:02 PM, John Timmer wrote: > > >> Ah, but good user interface would dictate that the menu item >> enabling method >> should check the resident sequence type and disable non-functional >> menu >> items, such as reverse complement, when a protein is loaded ;). >> >> > > Which to me is the responsibility of the app-developer, not of > BioCocoa. 
And even with a singly-typed sequence, we still have the > symbolset to check for and adjust the GUI accordingly. > > - Koen. > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Windows vs Mac 65 million years ago, there were more dinosaurs than humans. Where are the dinosaurs now? ********************************************************* From kvddrift at earthlink.net Tue Jul 5 06:51:46 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 06:51:46 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <8DF78C02-4B04-4F94-BAC5-6E02B9738CB4@mekentosj.com> References: <8DF78C02-4B04-4F94-BAC5-6E02B9738CB4@mekentosj.com> Message-ID: <99755142b8a302eb08053f526bcc589e@earthlink.net> On Jul 5, 2005, at 3:15 AM, Alexander Griekspoor wrote: > Nice discussion! But indeed the question which hasn't been answered by > both pros and contras, can the symbolset provide the error checking > and return value customisation to such a level that we don't need the > typed sequences. For the first one: yes. Once you have a sequence, you know based on the symbolset what type of sequence you are dealing with. Maybe you don't even need an additional type ivar. By using tools classes specific for sequence types you should be able to avoid the wrong operation on a sequence. I am not sure what you mean by 'return value customization'. cheers, - Koen.
From jtimmer at bellatlantic.net Tue Jul 5 10:33:29 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Tue, 05 Jul 2005 10:33:29 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <8DF78C02-4B04-4F94-BAC5-6E02B9738CB4@mekentosj.com> Message-ID: >> Which to me is the responsibility of the app-developer, not of BioCocoa. And >> even with a singly-typed sequence, we still have the symbolset to check for >> and adjust the GUI accordingly. Oh, I agree - App developer issue - that's why I tried to make clear it's a joke. > Nice discussion! But indeed the question which hasn't been answered by both > pros and contras, can the symbolset provide the error checking and return > value customisation to such a level that we don't need the typed sequences. > Note that we could provide much of this in the BCSequence methods and I > personally think it would be possible, which would give us one unified > interface to BCSequence. What do you think? Charles did I now get it right, > that you actually are thinking of only the typed versions? What's your > viewpoint on this? There's no question that we're going to have a sequence type maintained somewhere - the question seems to be where. I have no doubt that, although the code would be structured differently, fairly equivalent functionality could be provided by either typed classes or by asking a sequence object what type of sequence it contained. So, I'm not arguing about either of those aspects. In my earlier email, though, I provided a list of the advantages of typed sequence classes and some disadvantages of the alternative. Koen had objected to typed sequence classes in the past because we had some code duplications, but with that problem now fixed, I'm not sure whether there are any disadvantages to typed classes. I haven't seen an equivalent list of advantages to typed classes.
The only thing that's clear is that other BioX projects have used a single sequence class, but we don't currently know their reasoning. So, can anyone sell me on the advantages of a single class? JT _______________________________________________ This mind intentionally left blank From charles.parnot at gmail.com Tue Jul 5 12:32:38 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 5 Jul 2005 09:32:38 -0700 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> > > Nice discussion! But indeed the question which hasn't been > answered by both > > pros and contras, can the symbolset provide the error checking > and return > > value customisation to such a level that we don't need the typed > sequences. > > Note that we could provide much of this in the BCSequence methods > and I > > personally think it would be possible, which would give us one > unified > > interface to BCSequence. What do you think? Charles did I now get > it right, > > that you actually are thinking of only the typed versions? What's > your > > viewpoint on this? > > There's no question that we're going to have a sequence type > maintained somewhere - the question seems to be where. I have no > doubt that, although the code would be structured differently, > fairly equivalent functionality could be provided by either typed > classes or by asking a sequence object what type of sequence it > contained. So, I'm not arguing about either of those aspects. > > In my earlier email, though, I provided a list of the advantages of > typed sequence classes and some disadvantages of the alternative. > Koen had objected to typed sequence classes in the past because we > had some code duplications, but with that problem now fixed, I'm > not sure whether there are any disadvantages to typed classes.
I > haven't seen an equivalent list of advantages to typed classes. > The only thing that's clear is that other BioX projects have used a > single sequence class, but we don't currently know their reasoning. > > So, can anyone sell me on the advantages of a single class? > > JT This is exactly right, John. We are mixing up three different aspects of the typed/untyped issue: * the public interface: this is what the user and the compiler see * the implementation: even with just one public interface, there can still be several private subclasses (that would be the case with a real class cluster) * the runtime: even with several classes, a lot of the code is run in the superclass anyway I do believe that in terms of implementation, the subclass structure is better than one class. Of course, it is still important to put as much code as possible in the superclass and avoid code duplication, which was a very legitimate concern of Koen (and which I tried to address as much as possible when I cleaned the 'init' methods). John is right: this concern has been addressed. If you look at the code now, the subclasses are very light, and only contain specific code. The subclasses are not completely empty (except BCSequenceProtein at this point), and it is a good thing that the specific code goes in a specific class: it separates code in a very natural way. We should stay vigilant and keep it that way. (As an aside: The BCSymbolSet mechanism actually helped in the process of generalizing the code in the superclass. I actually liked the whole concept very much, it is quite powerful.) In terms of the public interface, the other main benefit of subclasses is to have compile-time checking. Of course, this could be achieved even if all the code was in just one class, and one would create dummy classes for the sole purpose of compilation checking and typing.
My bottom line is the following: * it seems we should keep the subclass structure no matter what; and because Koen's concern has been addressed, he might agree with that... Koen? ;-) * several of you want compilation checking, which is a legitimate desire, and because these people (John, alex, peter,..?) are REAL users of the framework, it would be stupid to not provide it, and I certainly would not want to argue against it; if the user wants it, it should be there! * except me, it seems everybody is confused by the placeholder class BCSequence; the idea was to try to have both options (BCSequence or the bunch of typed sequences), and decide at some point to dump one of the two or maybe keep both; it seems the consensus is now to dump one of the two, and based on the above, it seems logical to dump BCSequence; this is OK, there is very little code in there anyway, it was not very much work; just please keep it around a little while, I will archive it somewhere on my hard-drive (I don't want to rely only on the CVS server!!). And if we ever want to switch back to a single public class, it would not be very much work; the existing class structure would be easily amenable to a class cluster. So, ready to move on :-) ? charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Tue Jul 5 13:43:14 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Tue, 05 Jul 2005 13:43:14 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> Message-ID: > So, ready to move on :-) ? If we are, it would seem to me that the next big step would be to change the internal representation of the sequences to a char array, and then write the bridging code to generate object representations on the fly. 
I don't trust my malloc capabilities enough to volunteer for the former, but I'll happily help on the latter. One question about design of this: should we think about a caching policy? Maybe have a boolean ivar that allows a sequence to retain its symbol array at an app developer's discretion (defaults to NO, but can be manually set to YES in cases where an app developer knows he'll need repeated access to individual symbols). It would let app developers make performance vs. memory decisions, instead of leaving them at the framework level. JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Tue Jul 5 18:27:00 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 18:27:00 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: <25d5fdccec3a0746230155318b29606f@earthlink.net> On Jul 5, 2005, at 10:33 AM, John Timmer wrote: > In my earlier email, though, I provided a list of the advantages of > typed sequence classes and some disadvantages of the alternative. > Koen had objected to typed sequence classes in the past because we > had some code duplications, but with that problem now fixed, I'm not > sure whether there are any disadvantages to typed classes. I haven't > seen an equivalent list of advantages to typed classes. The only > thing that's clear is that other BioX projects have used a single > sequence class, but we don't currently know their reasoning. > > So, can anyone sell me on the advantages of a single class? I got a reply from the BioJava guy. He was not directly answering the question, but I will quote his reply here below: It's a question of encapsulation vs generality. I prefer the biojava approach of having Sequence as general as possible and providing other classes (like tools classes) to do the alphabet specific analysis.
Encapsulation has the advantage of having an RNA object know everything it needs to know about itself built in. However... Encapsulation means you would need one type for every type of Alphabet someone might use. BioJava takes the nice approach of letting you use any Alphabet (you can even make an Integer Sequence). Symbols are also very generic in BioJava which is generally good because you can make a Distribution over DNA or RNA or Protein with the same Distribution class. In the early days I argued for encapsulation. I'm now very much in the generic interface design school. Two other tips that will make life easier... Use interbase coordinates Strand is best handled at the level of location not at the level of feature. So I guess the separation of sequence and tools is a key argument. Of course you can also have this with subclasses :) Also the part about one subclass for each Alphabet (our symbolsets) is a valid one. We now have only a few, but soon will have mutable/immutable, strict/non-strict, etc. This will quickly increase the number of subclasses. I have sent him some follow up questions, let's see what answer I get. cheers, - Koen. From charles.parnot at gmail.com Tue Jul 5 18:34:56 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 5 Jul 2005 15:34:56 -0700 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol Message-ID: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> > One question about design of this: should we think about a caching > policy? > Maybe have a boolean ivar that allows a sequence to retain its > symbol array > at an app developer's discretion (defaults to NO, but can be > manually set to > YES in cases where an app developer knows he'll need repeated > access to > individual symbols). It would let app developers make performance vs. > memory decisions, instead of leaving them at the framework level. 
> > JT > I thought of that too initially, but more recently I thought: * if the sequence is immutable, the developer could cache it outside of the framework, no need to do it in the framework * if the sequence is mutable, caching means a lot of additional code to keep the array in sync for all the methods where the sequence changes; which probably means we should not do it now, but leave it for the future *if* the need arises. Or the developer could be smart and know better than use NSArray when performance is needed ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 5 18:35:44 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 18:35:44 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> Message-ID: <7ad6585fce0fa0704bb033399582ac45@earthlink.net> On Jul 5, 2005, at 12:32 PM, Charles Parnot wrote: > My bottom line is the following: > * it seems we should keep the subclass structure no matter what; and > because Koen's concern has been addressed, he might agree with that... > Koen? ;-) My main reason for liking the single class so much is that it is easy to maintain and allows a high degree of modularity. Indeed the code duplication has been addressed in great extent and I am sure that everyone can agree that this makes maintenance of the sequence classes much easier. However, when adding stuff into the superclass, we always need to be aware of the fact that the code needs to be adapted for one or more subclasses. What I also like a lot is that most functionality is kept is small 'Tools' classes. We already made a start with that and it seems to work nice (at least for me ;-) Now there are (at least) two ways to call the tools classes: 1. 
We put sequence-specific wrappers in the subclasses (more or less our current approach) 2. We create a general tools class for DNA, RNA, Protein, that contains wrappers to various tools classes, again sequence-specific (biojava approach): BCSequenceTool -> MW, ... BCProteinTool -> pI, digest, ... BCDNATool -> translate, ... BCRNATool -> transcribe, ... So for instance, [BCSequenceTool translation] does not exist, so will never compile. In these cases, call BCDNATool to take care of the operation. Another example, BCProteinTool should only contain wrappers that call the specific tool, e.g.: pI = [BCProteinTool isoElectricPointForSequence: mySeq] Now if mySeq is not a protein, then display a console message, plus return a reasonable value. If we document this well, there shouldn't be a problem. These things can always happen, also when using typed sequences. I think it is fine if we leave some responsibility with the user of the framework. They don't want their program to misbehave so will be sure to catch these types of errors. > * except me, it seems everybody is confused by the placeholder class > BCSequence; the idea was to try to have both options (BCSequence or > the bunch of typed sequences), and decide at some point to dump one of > the two or maybe keep both; it seems the consensus is now to dump one > of the two, and based on the above, it seems logical to dump > BCSequence; this is OK, there is very little code in there anyway, it > was not very much work; just please keep it around a little while, I > will archive it somewhere on my hard-drive (I don't want to rely only > on the CVS server!!). And if we ever want to switch back to a single > public class, it would not be very much work; the existing class > structure would be easily amenable to a class cluster. So what will we be using instead then? - Koen.
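One way to read the wrong-type case above (a tool handed a sequence it cannot work on) is as an error the caller has to look at, rather than a console message plus a stand-in value. The following is only a minimal sketch in plain C, assuming a sequence carries a type tag; every name in it (bc_sequence, bc_seq_type, bc_status, bc_dna_complement) is hypothetical and not part of BioCocoa. Complementing DNA is used as the example operation because the rule is simple.

```c
#include <stddef.h>

/* Hypothetical names throughout; none of this is BioCocoa API. */
typedef enum { BC_SEQ_DNA, BC_SEQ_RNA, BC_SEQ_PROTEIN } bc_seq_type;

typedef enum {
    BC_OK = 0,
    BC_ERR_WRONG_TYPE,   /* tool was handed the wrong kind of sequence */
    BC_ERR_BAD_SYMBOL,   /* symbol not handled by this sketch */
    BC_ERR_NO_SPACE      /* output buffer too small */
} bc_status;

typedef struct {
    bc_seq_type type;     /* what kind of sequence this is */
    const char *symbols;  /* raw symbols, not NUL-terminated */
    size_t      length;   /* number of symbols */
} bc_sequence;

/* "DNA tool" wrapper: writes the complement of seq into out.
 * It refuses anything that is not DNA instead of returning a made-up value. */
bc_status bc_dna_complement(const bc_sequence *seq, char *out, size_t out_size)
{
    if (seq->type != BC_SEQ_DNA)
        return BC_ERR_WRONG_TYPE;
    if (out_size < seq->length)
        return BC_ERR_NO_SPACE;

    for (size_t i = 0; i < seq->length; i++) {
        switch (seq->symbols[i]) {
        case 'A': out[i] = 'T'; break;
        case 'T': out[i] = 'A'; break;
        case 'C': out[i] = 'G'; break;
        case 'G': out[i] = 'C'; break;
        default:  return BC_ERR_BAD_SYMBOL;  /* ambiguity codes are left out here */
        }
    }
    return BC_OK;
}
```

A caller checks the returned status before touching the output buffer. That still leaves responsibility with the user of the framework, but a mismatch is harder to overlook than a log line.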
From charles.parnot at gmail.com Tue Jul 5 18:36:19 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 5 Jul 2005 15:36:19 -0700 Subject: Fwd: [Biococoa-dev] NSMutableData vs malloc References: Message-ID: On Jul 5, 2005, at 10:43 AM, John Timmer wrote: > > > >> So, ready to move on :-) ? >> >> > > If we are, it would seem to me that the next big step would be to > change the > internal representation of the sequences to a char array, and then > write the > bridging code to generate object representations on the fly. I > don't trust > my malloc capabilities enough to volunteer for the former, but I'll > happily > help on the latter. > This is why I was leaning towards NSData. Actually, the real benefit is with NSMutableData. Mallocing an array of char, as NSData would, is trivial, but dealing with resizing is more difficult. NSMutableData does it for us and deals with memory issues. In addition, because it is a class cluster (sorry, the c word!), it might be (or will be) optimized for different sizes of data. It also makes it easy to read and write files and to send data over the network... Finally, it is Core Data compatible. The bottom line is: if we malloc our own char[], we might end up creating an NSData object anyway. Getting the pointer to the array of bytes is trivial: [myData bytes] or [myData mutableBytes]. charles NB: sorry, John, I also sent you by mistake a half-baked version of that email!
-- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 5 18:54:03 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 18:54:03 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: On Jul 5, 2005, at 6:36 PM, Charles Parnot wrote: >> If we are, it would seem to me that the next big step would be to >> change the >> internal representation of the sequences to a char array, and then >> write the >> bridging code to generate object representations on the fly. I don't >> trust >> my malloc capabilities enough to volunteer for the former, but I'll >> happily >> help on the latter. >> > > This is why I was leaning towards NSData. Actually, the real benefit > is with NSMutableData. Mallocing an array of char like NSData would is > trivial, but dealing with resizing is more difficult. NSMutableData > does it for us and deals with memory issues. In addition, because it > is a class cluster (sorry, the c word!), it might be (or will be) > optimized for different sizes of data. Also, it is easy to write/read > files, but also send data over the network... Finally, it is Core Data > compatible. The bottom line is: if we malloc our own char[], we might > end up creating NSData object anyway. Getting the pointer to the array > of bytes is trivial: [myData bytes] or [myData mutableBytes]. > But I guess for the data operations we will still need the nitty gritty C calls, including friendly stringpointers. - Koen. From charles.parnot at gmail.com Tue Jul 5 18:57:45 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 5 Jul 2005 15:57:45 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <63C44698-DD25-4F87-9CF9-AFF5A2C24841@gmail.com> >> >> This is why I was leaning towards NSData. Actually, the real >> benefit is with NSMutableData. 
Mallocing an array of char like >> NSData would is trivial, but dealing with resizing is more >> difficult. NSMutableData does it for us and deals with memory >> issues. In addition, because it is a class cluster (sorry, the c >> word!), it might be (or will be) optimized for different sizes of >> data. Also, it is easy to write/read files, but also send data >> over the network... Finally, it is Core Data compatible. The >> bottom line is: if we malloc our own char[], we might end up >> creating NSData object anyway. Getting the pointer to the array of >> bytes is trivial: [myData bytes] or [myData mutableBytes]. >> >> > > But I guess for the data operations we will still need the nitty > gritty C calls, including friendly stringpointers. > > - Koen. Yes, but this is very friendly, right? E.g. stringSequence[190] This is (almost) as friendly as perl ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 5 19:08:00 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 19:08:00 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <63C44698-DD25-4F87-9CF9-AFF5A2C24841@gmail.com> References: <63C44698-DD25-4F87-9CF9-AFF5A2C24841@gmail.com> Message-ID: On Jul 5, 2005, at 6:57 PM, Charles Parnot wrote: > e.g. stringSequence[190] > > This is (almost) as friendly as perl ;-) > If we will only iterate over the strings, then it is probably not a problem, but what about inserting, removing, copying? Should we just use functions like strcpy, strcmp, etc? - Koen.
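To make the insert/remove question concrete, here is roughly what working on a raw char array involves when the buffer is managed by hand. This is a sketch in plain C with hypothetical names (seq_buffer and the seq_buffer_* functions), not BioCocoa code. It treats the sequence as bytes with an explicit length, so it relies on realloc, memmove and memcpy rather than strcpy or strcmp, which expect NUL-terminated strings. Presumably this is the bookkeeping an NSMutableData-backed sequence would not have to repeat.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable symbol buffer, managed by hand. */
typedef struct {
    char  *bytes;     /* symbols, NOT NUL-terminated */
    size_t length;    /* symbols in use */
    size_t capacity;  /* bytes allocated */
} seq_buffer;

/* Make room for at least `needed` bytes; returns 0 on success, -1 on failure.
 * Capacity doubles so that repeated inserts do not reallocate every time. */
static int seq_buffer_reserve(seq_buffer *b, size_t needed)
{
    if (needed <= b->capacity)
        return 0;
    size_t cap = b->capacity ? b->capacity : 16;
    while (cap < needed)
        cap *= 2;
    char *p = realloc(b->bytes, cap);
    if (p == NULL)
        return -1;              /* the old block is still valid */
    b->bytes = p;
    b->capacity = cap;
    return 0;
}

/* Insert n symbols from src at position pos (pos == length appends). */
int seq_buffer_insert(seq_buffer *b, size_t pos, const char *src, size_t n)
{
    if (pos > b->length)
        return -1;
    if (n == 0)
        return 0;
    if (seq_buffer_reserve(b, b->length + n) != 0)
        return -1;
    /* Shift the tail right, then copy the new symbols into the gap. */
    memmove(b->bytes + pos + n, b->bytes + pos, b->length - pos);
    memcpy(b->bytes + pos, src, n);
    b->length += n;
    return 0;
}

/* Remove n symbols starting at pos. */
int seq_buffer_remove(seq_buffer *b, size_t pos, size_t n)
{
    if (pos > b->length || n > b->length - pos)
        return -1;
    if (n == 0)
        return 0;
    /* Shift the tail left over the removed range. */
    memmove(b->bytes + pos, b->bytes + pos + n, b->length - pos - n);
    b->length -= n;
    return 0;
}

/* Release the storage and reset the buffer to empty. */
void seq_buffer_free(seq_buffer *b)
{
    free(b->bytes);
    b->bytes = NULL;
    b->length = 0;
    b->capacity = 0;
}
```

Starting from a zero-initialized seq_buffer, inserting "ACGT" at 0 and then "NN" at 2 gives the six symbols ACNNGT. Copying a range out is a plain memcpy with an explicit length.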
From kvddrift at earthlink.net Tue Jul 5 20:10:21 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 5 Jul 2005 20:10:21 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: <49cdcb07213675c43b2a8155e75310b5@earthlink.net> On Jul 5, 2005, at 1:43 PM, John Timmer wrote: > and then write the > bridging code to generate object representations on the fly. I think we already have that: + (id) symbolForChar: (unichar)aSymbol cheers, - Koen. From charles.parnot at gmail.com Wed Jul 6 01:05:58 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 5 Jul 2005 22:05:58 -0700 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <49cdcb07213675c43b2a8155e75310b5@earthlink.net> References: <49cdcb07213675c43b2a8155e75310b5@earthlink.net> Message-ID: On Jul 5, 2005, at 5:10 PM, Koen van der Drift wrote: > > On Jul 5, 2005, at 1:43 PM, John Timmer wrote: > > > >> and then write the >> bridging code to generate object representations on the fly. >> >> > > I think we already have that: > > + (id) symbolForChar: (unichar)aSymbol > > cheers, > > - Koen. > Exactly! In fact, even better is this method from BCSymbolSet: - (BCSymbol *)symbolForChar:(unichar)aChar This is another nice thing with the symbolSet. It bridges chars and BCSymbol. The method returns nil if the symbol is not in the set (like 'member:' for NSSet, or 'objectForKey:' for NSDictionary). The BCSequence can use the BCSymbolSet to filter the right chars. 
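The nil-on-miss lookup described above can double as a filter for raw chars. Below is a minimal C sketch of that idea, assuming a table-driven set; bc_symbol, bc_symbol_set and the three functions are hypothetical stand-ins for BCSymbol and BCSymbolSet, and plain char replaces unichar to keep the example short.

```c
#include <stddef.h>

/* Hypothetical stand-ins for BCSymbol / BCSymbolSet. */
typedef struct {
    char        code;  /* one-letter code */
    const char *name;  /* human-readable name */
} bc_symbol;

typedef struct {
    const bc_symbol *table[256];  /* indexed by unsigned char; NULL = not in set */
} bc_symbol_set;

static const bc_symbol DNA_SYMBOLS[] = {
    { 'A', "adenine" }, { 'C', "cytosine" }, { 'G', "guanine" }, { 'T', "thymine" },
};

/* Fill the set with the four unambiguous DNA bases. */
void bc_symbol_set_init_dna(bc_symbol_set *set)
{
    for (size_t i = 0; i < 256; i++)
        set->table[i] = NULL;
    for (size_t i = 0; i < sizeof DNA_SYMBOLS / sizeof DNA_SYMBOLS[0]; i++)
        set->table[(unsigned char)DNA_SYMBOLS[i].code] = &DNA_SYMBOLS[i];
}

/* Bridge from char to symbol: returns NULL if c is not in the set. */
const bc_symbol *bc_symbol_for_char(const bc_symbol_set *set, char c)
{
    return set->table[(unsigned char)c];
}

/* Copy only the chars of src that belong to the set into dst, which must
 * have room for n chars. Returns how many chars were kept. */
size_t bc_filter_chars(const bc_symbol_set *set, const char *src, size_t n, char *dst)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (bc_symbol_for_char(set, src[i]) != NULL)
            dst[kept++] = src[i];
    }
    return kept;
}
```

The same lookup serves both uses mentioned in the thread: fetching the symbol object for a single position on demand, and letting a sequence reject or drop chars that are not in its symbol set.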
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From mek at mekentosj.com Wed Jul 6 03:46:54 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 09:46:54 +0200 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> References: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> Message-ID: <0F871ACA-84F8-4764-B10D-DC6DB04ABEEA@mekentosj.com> I'm wondering to what extent we really need to cache the array of symbols; in fact, when does one really need that array at all? Only when you ask for a single position do you get a symbol; in all other instances you don't want an array of symbols, right, you want a BCSequence. So a subsequence -> a BCSequence is returned (again containing the char[]), the reverse of a BCSequence -> a BCSequence is returned etc... Want to display a sequence in a view -> the string representation of the BCSequence object is given. When would you like an array of symbols? Alex On 6-jul-2005, at 0:34, Charles Parnot wrote: >> One question about design of this: should we think about a >> caching policy? >> Maybe have a boolean ivar that allows a sequence to retain its >> symbol array >> at an app developer's discretion (defaults to NO, but can be >> manually set to >> YES in cases where an app developer knows he'll need repeated >> access to >> individual symbols). It would let app developers make performance >> vs. >> memory decisions, instead of leaving them at the framework level.
>> >> JT >> >> > > I thought of that too initially, but more recently I thought: > * if the sequence is immutable, the developer could cache it > outside of the framework, no need to do it in the framework > * if the sequence is mutable, caching means a lot of additional > code to keep the array in sync for all the methods where the > sequence changes; which probably means we should not do it now, but > leave it for the future *if* the need arises. > > Or the developer could be smart and know better than use NSArray > when performance is needed ;-) > > charles > > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Microsoft is not the answer, Microsoft is the question, NO is the answer ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mek at mekentosj.com Wed Jul 6 03:53:30 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 09:53:30 +0200 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <7ad6585fce0fa0704bb033399582ac45@earthlink.net> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> Message-ID: <82865D10-83A1-4658-9681-801D1E9D0CA4@mekentosj.com> I agree that in some cases tools are the way to go, but one of the things I dislike most of the biojava project (and I remember from previous discussions that I was not alone) is in fact that you can't do a thing without needing a bunch of helper, factory, and/or tool objects. Please, let us stay far from that. It's really un-cocoa like, imagine NSString and you would need a "exporter tool" to save it to file, a "converter tool" to return it in a different encoding etc etc. Help!! > BCSequenceTool -> MW, ... > BCProteinTool -> pI, digest, ... > BCDNATool -> translate, ... > BCRNATool -> transcribe, ... Please not, the names of these tools alone are terrible!!!! I agree that for some relatively difficult things like a translation, digests or alignments, we need a tool, but even then I would like to have convenience methods that do the job of creating and handling the tool object in the background. For the rest, a simple MW should be one call and not need a tool! And IF we need a tool, make it specialized: BCDigester or what else instead of BCSequenceTool. Alex On 6-jul-2005, at 0:35, Koen van der Drift wrote: > > On Jul 5, 2005, at 12:32 PM, Charles Parnot wrote: > > >> My bottom line is the following: >> * it seems we should keep the subclass structure no matter what; >> and because Koen's concern has been addressed, he might agree with >> that... Koen? ;-) >> > > My main reason for liking the single class so much is that it is > easy to maintain and allows a high degree of modularity. 
Indeed the > code duplication has been addressed in great extent and I am sure > that everyone can agree that this makes maintenance of the sequence > classes much easier. However, when adding stuff into the > superclass, we always need to be aware of the fact that the code > needs to be adapted for one or more subclasses. > > What I also like a lot is that most functionality is kept is small > 'Tools' classes. We already made a start with that and it seems to > work nice (at least for me ;-) > > Now there are (at least) two ways to call the tools classes: > 1. We put sequence-specific wrappers in the subclasses (more or > less our current approach) > 2. We create a general tools class for DNA, RNA, Protein, that > contains wrappers to various tools classes, again sequence-specific > (biojava approach): > > BCSequenceTool -> MW, ... > BCProteinTool -> pI, digest, ... > BCDNATool -> translate, ... > BCRNATool -> transcribe, ... > > So for instance, [BCSequenceTools translation] does not exist, so > will never compile. In these cases, call BCDNATools to take care of > the operation. Another example, BCProteinTool should only contain > wrappers that calls the specific tool., so eg: > > pI = [BCProteinTool isoElectricPoint forSequence: mySeq] > > Now if mySeq is not a protein, then display a console message, plus > return a reasonable value. If we document this well, there > shouldn't be a problem. These things can always happen, also when > using typed sequences. I think it is fine if we leave some > responsibility with the user of the framework. They don't want > their program to misbehave so will be sure to catch these type of > errors. 
> > >> * except me, it seems everybody is confused by the placeholder >> class BCSequence; the idea was to try to have both options >> (BCSequence or the bunch of typed sequences), and decide at some >> point to dump one of the two or maybe keep both; it seems the >> consensus is now to dump one of the two, and based on the above, >> it seems logical to dump BCSequence; this is OK, there is very >> little code in there anyway, it was not very much work; just >> please keep it around a little while, I will archive it somewhere >> on my hard-drive (I don't want to rely only on the CVS server!!). >> And if we ever want to switch back to a single public class, it >> would not be very much work; the existing class structure would be >> easily amenable to a class cluster. >> > > So what will we be using instead then? > > - Koen. > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com LabAssistant - Get your life organized! http://www.mekentosj.com/labassistant ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From biococoa at bioworxx.com Wed Jul 6 03:54:46 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Wed, 6 Jul 2005 09:54:46 +0200 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <7ad6585fce0fa0704bb033399582ac45@earthlink.net> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> Message-ID: <445F2A2B-C868-4884-9A29-E3A7873108F0@bioworxx.com> Hey, > What I also like a lot is that most functionality is kept is small > 'Tools' classes. We already made a start with that and it seems to > work nice (at least for me ;-) > > Now there are (at least) two ways to call the tools classes: > 1. We put sequence-specific wrappers in the subclasses (more or > less our current approach) > 2. We create a general tools class for DNA, RNA, Protein, that > contains wrappers to various tools classes, again sequence-specific > (biojava approach): > > BCSequenceTool -> MW, ... > BCProteinTool -> pI, digest, ... > BCDNATool -> translate, ... > BCRNATool -> transcribe, ... > This is for sure a possible clean solution. > So for instance, [BCSequenceTools translation] does not exist, so > will never compile. In these cases, call BCDNATools to take care of > the operation. Another example, BCProteinTool should only contain > wrappers that calls the specific tool., so eg: > > pI = [BCProteinTool isoElectricPoint forSequence: mySeq] > > Now if mySeq is not a protein, then display a console message An exception would be much better, to allow the program or the programmer to react. > , plus return a reasonable value. Is there a reasonable value? > If we document this well, there shouldn't be a problem. I don't like legitimizing hacks just by documenting them.
> cheers, Phil From mek at mekentosj.com Wed Jul 6 03:54:00 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 09:54:00 +0200 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <0E56076D-7359-43DB-B612-6178BD80F980@mekentosj.com> Yes, I like this approach using NSMutableData, sounds nice! Alex On 6-jul-2005, at 0:36, Charles Parnot wrote: > > > On Jul 5, 2005, at 10:43 AM, John Timmer wrote: > > > >> >> >> >> >>> So, ready to move on :-) ? >>> >>> >>> >> >> If we are, it would seem to me that the next big step would be to >> change the >> internal representation of the sequences to a char array, and then >> write the >> bridging code to generate object representations on the fly. I >> don't trust >> my malloc capabilities enough to volunteer for the former, but >> I'll happily >> help on the latter. >> >> > > This is why I was leaning towards NSData. Actually, the real > benefit is with NSMutableData. Mallocing an array of char like > NSData would is trivial, but dealing with resizing is more > difficult. NSMutableData does it for us and deals with memory > issues. In addition, because it is a class cluster (sorry, the c > word!), it might be (or will be) optimized for different sizes of > data. Also, it is easy to write/read files, but also send data over > the network... Finally, it is Core Data compatible. The bottom line > is: if we malloc our own char[], we might end up creating NSData > object anyway. Getting the pointer to the array of bytes is > trivial: [myData bytes] or [myData mutableBytes]. > > charles > > NB: sorry, John, I also sent you by mistake a half-baked version > of that email! 
> > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com iRNAi, do you? http://www.mekentosj.com/irnai ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Wed Jul 6 03:56:50 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 09:56:50 +0200 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <192F445D-89F4-4A90-8091-68FC22F012F3@mekentosj.com> > But I guess for the data operations we will still need the nitty > gritty C calls, including friendly stringpointers. Yep, but that's where we can use existing algorithms and implementations a lot... And of course, nothing holds you from getting the NSArray of symbols and do your things on that, the old way.... One of the things I forgot to say I liked about Charles NSData approach is that it also provides a great public accessor to the raw data of the BCSequence object, for instance for an alignment tool to work with the char[]s of the sequences to be aligned... Alex > > - Koen. 
> > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 E-mail: a.griekspoor at nki.nl AIM: mekentosj at mac.com Web: http://www.mekentosj.com EnzymeX - To cut or not to cut http://www.mekentosj.com/enzymex ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Wed Jul 6 03:59:27 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 09:59:27 +0200 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: <63C44698-DD25-4F87-9CF9-AFF5A2C24841@gmail.com> Message-ID: <655CF977-3157-443B-995F-09A6C1546814@mekentosj.com> On 6-jul-2005, at 1:08, Koen van der Drift wrote: > > On Jul 5, 2005, at 6:57 PM, Charles Parnot wrote: > > >> e.g. stringSequence[190] >> >> This is (almost) as friendly as perl ;-) >> >> > > If we only will iterate the strings, then it is probably not a > problem, but what about inserting, removing, copying? Should we > just use methods like strcpy, strcmp, etc? That is exactly where Charles sees NSMutableData to come in, or not charles? Don't think in terms of strings, but in terms of char arrays... 
Alex ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Windows is a 32-bit patch to a 16-bit shell for an 8-bit operating system, written for a 4-bit processor by a 2- bit company without 1 bit of sense. ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From biococoa at bioworxx.com Wed Jul 6 04:17:13 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Wed, 6 Jul 2005 10:17:13 +0200 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> References: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> Message-ID: <392A1DBC-6DDA-4656-B366-76C18D6BEFF3@bioworxx.com> Hey Charles, Am 06.07.2005 um 00:34 schrieb Charles Parnot: >> One question about design of this: should we think about a >> caching policy? >> Maybe have a boolean ivar that allows a sequence to retain its >> symbol array >> at an app developer's discretion (defaults to NO, but can be >> manually set to >> YES in cases where an app developer knows he'll need repeated >> access to >> individual symbols). It would let app developers make performance >> vs. >> memory decisions, instead of leaving them at the framework level. 
>> >> JT >> >> > > I thought of that too initially, but more recently I thought: > * if the sequence is immutable, the developer could cache it > outside of the framework, no need to do it in the framework > * if the sequence is mutable, caching means a lot of additional > code to keep the array in sync for all the methods where the > sequence changes; which probably means we should not do it now, but > leave it for the future *if* the need arises. Do you mean, you don't want to implement a mutable version of a sequence class for now? I really need it, so we should build both, mutable and immutable sequences. Phil From kvddrift at earthlink.net Wed Jul 6 04:34:10 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 04:34:10 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <82865D10-83A1-4658-9681-801D1E9D0CA4@mekentosj.com> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> <82865D10-83A1-4658-9681-801D1E9D0CA4@mekentosj.com> Message-ID: On Jul 6, 2005, at 3:53 AM, Alexander Griekspoor wrote: > I agree that in some cases tools are the way to go, but one of the > things I dislike most of the biojava project?(and I remember from > previous discussions that I was not alone) is in fact that you can't > do a thing without needing a bunch of helper, factory, and/or tool > objects. Please, let us stay far from that. It's really un-cocoa like, > imagine NSString and you would need a "exporter tool" to save it to > file, a "converter tool" to return it in a different encoding etc etc. > Help!! > >> BCSequenceTool -> MW, ... >> BCProteinTool -> pI, digest, ... >> BCDNATool -> translate, ... >> BCRNATool -> transcribe, ... > Please not, the names of these tools alone are terrible!!!! 
I agree > that for some relatively difficult things like a translation, digests > or alignments, we need a tool, but even then I would like to have > convenience methods that do the job of creating and handling the tool > object in the background. For the rest, a simple MW should be one call > and not need a tool! And IF we need a tool, make it specialized: > BCDigester or what else instead of BCSequenceTool. These names were only put in as an example, not a proposal :) Anyway, using tools classes makes the framework very modular and maintainable, in fact we are already using them. So I am not sure what your objection is against tools classes? Do you want to put all code in the subclasses? cheers, - Koen. From mek at mekentosj.com Wed Jul 6 04:41:18 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 6 Jul 2005 10:41:18 +0200 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> <82865D10-83A1-4658-9681-801D1E9D0CA4@mekentosj.com> Message-ID: <80D24326-D791-45FF-A575-B06872951D79@mekentosj.com> I don't have any objection against tool classes, but I have a lot of problems with the way things got out of hand in BioJava; there you can't do ANYTHING without tools. So names aside, I don't get what pI and digest do in one protein tool object. I plead for doing very simple things like MW and pI inside BCSequence wherever possible, and for using specialized tools like a BCTranslatorTool and BCDigesterTool instead of one Proteintool and one DNAtool that bundle a lot of unrelated stuff...
Alex On 6-jul-2005, at 10:34, Koen van der Drift wrote: > > On Jul 6, 2005, at 3:53 AM, Alexander Griekspoor wrote: > > >> I agree that in some cases tools are the way to go, but one of the >> things I dislike most of the biojava project (and I remember from >> previous discussions that I was not alone) is in fact that you >> can't do a thing without needing a bunch of helper, factory, and/ >> or tool objects. Please, let us stay far from that. It's really un- >> cocoa like, imagine NSString and you would need a "exporter tool" >> to save it to file, a "converter tool" to return it in a different >> encoding etc etc. Help!! >> >> >>> BCSequenceTool -> MW, ... >>> BCProteinTool -> pI, digest, ... >>> BCDNATool -> translate, ... >>> BCRNATool -> transcribe, ... >>> >> Please not, the names of these tools alone are terrible!!!! I >> agree that for some relatively difficult things like a >> translation, digests or alignments, we need a tool, but even then >> I would like to have convenience methods that do the job of >> creating and handling the tool object in the background. For the >> rest, a simple MW should be one call and not need a tool! And IF >> we need a tool, make it specialized: BCDigester or what else >> instead of BCSequenceTool. >> > > These names were only put in as an example, not a proposal :) > > Anyway, using tools classes makes the framework very modular and > maintainable, in fact we are already using them. So I am not sure > what your objection is against tools classes? Do you want to put > all code in the subclasses? > > cheers, > > - Koen. 
> > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Windows is a 32-bit patch to a 16-bit shell for an 8-bit operating system, written for a 4-bit processor by a 2-bit company without 1 bit of sense. ********************************************************* From kvddrift at earthlink.net Wed Jul 6 05:51:44 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 05:51:44 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <80D24326-D791-45FF-A575-B06872951D79@mekentosj.com> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> <82865D10-83A1-4658-9681-801D1E9D0CA4@mekentosj.com> <80D24326-D791-45FF-A575-B06872951D79@mekentosj.com> Message-ID: <527d6707ab325f13413736f9dad25bfc@earthlink.net> On Jul 6, 2005, at 4:41 AM, Alexander Griekspoor wrote: > I don't have any objection against tool classes, but I have a lot of > problems with the way things got out of hand in BioJava, there you > can't do ANYTHING without tools. So names aside, I don't get what pI > and digest do in 1 protein tool object. I plead to do very simple things > like MW and pI inside BCSequence wherever possible, and use > specialized tools like a BCTranslatorTool and BCDigesterTool instead > of one Proteintool and one DNAtool that bundle a lot of non-related > stuff... I see now what you mean. Yes, I also think the BioJava class structure is way too complicated, and we should avoid that absolutely.
Their way of using one general tool class at first seems logical, but you are right, it just adds some extra steps. However, I still like the idea of separating structure and functionality classes; we can always use convenience methods to call the tools, even for 'simple' operations such as MW and pI calculations. cheers, - Koen. From jtimmer at bellatlantic.net Wed Jul 6 08:25:51 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 08:25:51 -0400 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <0F871ACA-84F8-4764-B10D-DC6DB04ABEEA@mekentosj.com> Message-ID: > I'm wondering to what extent we really need to cache the array of symbols, in > fact when does one really need that array at all? Only if you ask a single > position you get the symbol, in all other instances you don't want an array of > symbols right, you want a BCSequence. So a subsequence -> a bcsequence is > returned (again containing the char[]), the reverse of a BCSequence -> a > bcsequence is returned etc... Want to display a sequence in a view -> the > string representation of the bcsequence object is given. When would you like an > array of symbols? > Alex > The symbols are the things that actually convey information. Say you want to generate all the information and statistics about a sequence (complement, reverse complement, MW, melting point, etc.) for display. Without a cached version, the sequence array would have to be recreated from data several times; doing so would probably be the single largest time cost before the info could be displayed. JT _______________________________________________ This mind intentionally left blank
From jtimmer at bellatlantic.net Wed Jul 6 08:33:32 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 08:33:32 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <527d6707ab325f13413736f9dad25bfc@earthlink.net> Message-ID: >> I don't have any objection against tool classes, but I have a lot of >> problems with the way things got out of hand in BioJava, there you >> can't do ANYTHING without tools. So names aside, I don't get what pI >> and digest do in 1 protein tool object. I plead to do very simple things >> like MW and pI inside BCSequence wherever possible, and use >> specialized tools like a BCTranslatorTool and BCDigesterTool instead >> of one Proteintool and one DNAtool that bundle a lot of non-related >> stuff... > > I see now what you mean. Yes, I also think the BioJava class structure > is way too complicated, and we should avoid that absolutely. Their way > of using one general tool class at first seems logical, but you are > right, it just adds some extra steps. > > However, I still like the idea of separating structure and > functionality classes, we can always use convenience methods to call > the tools. Even for 'simple' operations such as MW and pI calculations. I largely agree with Alex here, but I wonder where to reasonably draw the line. Something like complementation, which can be implemented in about 3 lines of code, shouldn't need the overhead of object creation and configuration of a tool - do it in BCNucleotide. Translation, which is a lot more annoying and can produce many different results based on the configuration (range, reading frame, etc.), clearly does. How do we choose when there's something in between? Set an arbitrary limit on the number of lines of code or possible configurations?
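To give a feel for how small the "about 3 lines" case really is, here is a minimal sketch in plain C (which also compiles inside Objective-C sources) of complementing a DNA sequence stored as an array of chars. The names bc_complement_base and bc_complement_in_place are hypothetical and not part of BioCocoa, and the sketch assumes only the unambiguous upper-case bases A, C, G and T need handling; every other symbol is passed through unchanged.

```c
#include <stddef.h>

/* Hypothetical helper, not BioCocoa API: complement of one unambiguous
   DNA base. Any other symbol (ambiguity codes, gaps) is returned as-is. */
static unsigned char bc_complement_base(unsigned char base)
{
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return base;
    }
}

/* Complement every symbol of the sequence in place (no reversal). */
static void bc_complement_in_place(unsigned char *symbols, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        symbols[i] = bc_complement_base(symbols[i]);
    }
}
```

There is nothing to configure here, which is the contrast with translation: no range, no reading frame, no genetic code to choose.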
JT _______________________________________________ This mind intentionally left blank From charles.parnot at gmail.com Wed Jul 6 13:39:19 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 10:39:19 -0700 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <392A1DBC-6DDA-4656-B366-76C18D6BEFF3@bioworxx.com> References: <159D58C0-8783-49F9-B1C7-0C8E1081EFFB@gmail.com> <392A1DBC-6DDA-4656-B366-76C18D6BEFF3@bioworxx.com> Message-ID: <3B8D5805-EE75-49D4-A8C2-BD61CB851648@gmail.com> >> I thought of that too initially, but more recently I thought: >> * if the sequence is immutable, the developer could cache it >> outside of the framework, no need to do it in the framework >> * if the sequence is mutable, caching means a lot of additional >> code to keep the array in sync for all the methods where the >> sequence changes; which probably means we should not do it now, >> but leave it for the future *if* the need arises. >> > > Do you mean, you don't want to implement a mutable version of a > sequence class for now? I really need it, so we should build both, > mutable and immutable sequences. > > Phil No!! (actually, we don't even have immutable sequences at this point). I mean the following. In which situations would a developer using the framework need to repeatedly access the NSArray of BCSymbols and not want to cache it herself? That would mean the sequence keeps changing and she keeps accessing the array. Then, to be efficient, the updating of the NSArray has to be done in parallel with all the changes in the sequence, which means additional code a little bit everywhere, in fact in every place where the sequence is modified. I don't think it is worth the trouble at this point. Even then, like Alex pointed out, I don't really see any situations where the above AND performance would be needed.
In all other situations (immutable sequences or mutable sequences that the user of the framework will not modify while using the NSArray), the caching is easy but not very useful, as the user will probably do it herself without even thinking about it: NSArray *myArray = [sequence symbolArray]; .... use the array ... I am not even sure we could call that 'caching' ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Wed Jul 6 13:44:14 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 10:44:14 -0700 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: References: Message-ID: On Jul 6, 2005, at 5:25 AM, John Timmer wrote: >> I'm wondering to what extent we really need to cache the array of >> symbols, in fact when does one really need that array at all? Only >> if you ask a single position you get the symbol, in all other >> instances you don't want an array of symbols right, you want a >> BCSequence. So a subsequence -> a bcsequence is returned (again >> containing the char[]), the reverse of a BCSequence -> a >> bcsequence is returned etc... Want to display a sequence in a view >> -> the stringrepresentation of the bcsequence object is given. >> When would you like an array of symbols? >> Alex >> > The symbols are the things that actually convey information. Say > you want to generate all the information and statistics about a > sequence (complement, reverse complement, MW, melting point, etc.) > for display. Without a cached version, the sequence array would > have to be recreated from data several times; doing so would > probably be the single largest time cost before the info could be > displayed. > > JT Then, we should only cache it in one place: the accessor. Though for mutable sequences, we also need to remember to set the cached array to nil after modifications.
Something like this:

- (NSArray *)symbolArray
{
    if ( cachedArray == nil ) {
        cachedArray = ... //generate the array (and retain it)
    }
    return cachedArray;
}

- (void)emptyCache
{
    [cachedArray release];
    cachedArray = nil;
}

- (void)appendSequence:(BCSequence *)otherSequence
{
    //code to append the sequence
    ...
    //empty the cache: the array is not valid anymore!!
    [self emptyCache];
}

what do you think? charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Wed Jul 6 15:15:19 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 15:15:19 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <63C44698-DD25-4F87-9CF9-AFF5A2C24841@gmail.com> Message-ID: >>> >>> This is why I was leaning towards NSData. Actually, the real >>> benefit is with NSMutableData. Mallocing an array of char like >>> NSData would is trivial, but dealing with resizing is more >>> difficult. NSMutableData does it for us and deals with memory >>> issues. In addition, because it is a class cluster (sorry, the c >>> word!), it might be (or will be) optimized for different sizes of >>> data. Also, it is easy to write/read files, but also send data >>> over the network... Finally, it is Core Data compatible. The >>> bottom line is: if we malloc our own char[], we might end up >>> creating NSData object anyway. Getting the pointer to the array of >>> bytes is trivial: [myData bytes] or [myData mutableBytes]. >>> >>> >> >> But I guess for the data operations we will still need the nitty >> gritty C calls, including friendly stringpointers. >> I'm all for NSData, since I know how to use that ;). Next question - we're already using unichar's, which are 2 bytes - is that what we intend to stuff in the data, or should we think of recasting to unsigned chars?
The second would probably greatly increase the efficiency of the code and make a lot more C code available to us, but we'd need to revamp a bunch of the classes. JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Wed Jul 6 17:54:12 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 17:54:12 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> On Jul 6, 2005, at 3:15 PM, John Timmer wrote: > I'm all for NSData, since I know how to use that ;). Next question - > we're > already using unichar's, which are 2 bytes - is that what we intend to > stuff > in the data, or should we think of recasting to unsigned chars? The > second > would probably greatly increase the efficiency of the code and make a > lot > more C code available to us, but we'd need to revamp a bunch of the > classes. > My preference would be to use the (unsigned) chars. - Koen. From kvddrift at earthlink.net Wed Jul 6 17:56:09 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 17:56:09 -0400 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: References: Message-ID: <6d4cda544f3900b99d6df428c74388e6@earthlink.net> On Jul 6, 2005, at 1:44 PM, Charles Parnot wrote: > Then, we should only cache it in one place: the accessor. Though for > mutable sequences, we also need to remember to set the cached array to > nil after modifications. Something like this: > > - (NSArray *)symbolArray > { > if ( cachedArray == nil ) { > cachedArray = ... //generate the array (and retain it) > } > return cachedArray; > } > > - (void)emptyCache > { > [cachedArray release]; > cachedArray = nil; > } > > - (void)appendSequence:(BCSequence *)otherSequence > { > //code to append the sequence > ... > //empty the cache: the array is not valid anymore!! > [self emptyCache]; > } > > > what do you think? 
> Looks like a good start to me :) Maybe we can write the code in such a way that the BCSequence can be a delegate of another object (eg a view). - Koen. From kvddrift at earthlink.net Wed Jul 6 18:12:55 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 18:12:55 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <7ad6585fce0fa0704bb033399582ac45@earthlink.net> References: <83DFA0BD-B7A2-4B5A-B51F-818F977BEEB0@gmail.com> <7ad6585fce0fa0704bb033399582ac45@earthlink.net> Message-ID: <732dd77e972de6c0511e32167a3e2a60@earthlink.net> On Jul 5, 2005, at 6:35 PM, Koen van der Drift wrote: > My main reason for liking the single class so much is that it is easy > to maintain and allows a high degree of modularity. Just to follow up on that. I was digging through some of the mailing list archives of the biopython project and came upon some interesting messages about the same issue. Especially this one: http://portal.open-bio.org/pipermail/biopython/1999-October/000091.html One point that I hadn't thought of yet is that if you use typed sequences, you could easily keep adding subclasses, eg mutable/immutable, strict/nonstrict, ambiguous/unambiguous, and you quickly increase the number of subclasses and therefore maintenance. With only one sequence class and symbolsets for all different sequence types, this is a non-issue. In fact, you can even allow for user-defined sequence types, where the user supplies a symbolset. This could contain for instance some specialized, modified residues. Also, it would be very easy to change the sequence type for a BCSequence object. This could for instance happen in a sequence editor, where the user decides that the input is actually DNA, and not a protein. Anyway, if the majority wants subclasses, then that's fine with me. However, let's stick with the current subclasses, and create variations through the symbolsets. cheers, - Koen.
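The single-class-plus-symbolset idea can be illustrated at the char-array level with a minimal sketch in plain C. Everything named here is hypothetical (bc_sequence, bc_sequence_fits, bc_sequence_set_symbol_set, and the two strict symbol sets); it is not the BCSymbolSet API, and it assumes a symbol set can be represented simply as a string of allowed characters. The point is only that retyping a sequence, or accepting a user-supplied symbol set, becomes a validation step rather than a new subclass.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical symbol sets: each is just the list of allowed characters.
   A user-defined set (e.g. with modified residues) is another such string. */
static const char *const BC_DNA_STRICT = "ACGT";
static const char *const BC_PROTEIN_STRICT = "ACDEFGHIKLMNPQRSTVWY";

/* One untyped sequence: the symbols plus the symbol set that types it. */
typedef struct {
    const unsigned char *symbols;
    size_t length;
    const char *symbol_set;
} bc_sequence;

/* True if every symbol of the sequence belongs to the given symbol set. */
static bool bc_sequence_fits(const bc_sequence *seq, const char *symbol_set)
{
    size_t set_size = strlen(symbol_set);
    for (size_t i = 0; i < seq->length; i++) {
        /* memchr only searches the set itself, never its terminating NUL */
        if (memchr(symbol_set, seq->symbols[i], set_size) == NULL) {
            return false;
        }
    }
    return true;
}

/* Retype the sequence, but only if all of its symbols are valid in the
   new set; otherwise leave it untouched and report failure. */
static bool bc_sequence_set_symbol_set(bc_sequence *seq, const char *symbol_set)
{
    if (!bc_sequence_fits(seq, symbol_set)) {
        return false;
    }
    seq->symbol_set = symbol_set;
    return true;
}
```

In the sequence-editor scenario, input first read as protein ("GATTACA" is valid in both sets) could be switched to DNA with one call to bc_sequence_set_symbol_set, while a sequence such as "MKV" would be refused.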
From kvddrift at earthlink.net Wed Jul 6 18:15:06 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 6 Jul 2005 18:15:06 -0400 Subject: [Biococoa-dev] BCSequenceRecord Message-ID: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> Hi, One of the things I also read about on the biopython mailinglist is that we could keep the BCSequenceXXX classes very light, and create a new class BCSequenceRecord, that is used to store all info from a datafile. In that case, the BCSequence is just one of the key/value pairs. cheers, - Koen. From charles.parnot at gmail.com Wed Jul 6 18:30:05 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 15:30:05 -0700 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> Message-ID: <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> On Jul 6, 2005, at 3:15 PM, Koen van der Drift wrote: > Hi, > > One of the things I also read about on the biopython mailinglist is > that we could keep the BCSequenceXXX classes very light, and create > a new class BCSequenceRecord, that is used to store all info from a > datafile. In that case, the BCSequence is just one of the key/value > pairs. > > cheers, > > - Koen. Sorry, Koen, but I have no idea what you are talking about here! charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Wed Jul 6 18:53:59 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 18:53:59 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: <732dd77e972de6c0511e32167a3e2a60@earthlink.net> Message-ID: > Anyway, if the majority wants to subclasses, then that's fine with me. > However, lets stick with the current subclasses, and create variations > through the symbolsets. Absolutely agreed. 
The whole point of the subclasses from my perspective was a guarantee of certain behavior and grouping of appropriate methods. All nucleotide sequences, no matter what their symbol set (strict, etc.), will produce a valid complement when the complement method is called. No protein sequence will, no matter what symbol set it's composed of (so we don't give it a complement method). So, from my perspective, there's no reason different symbol sets of the same general type need different classes. The only place where this gets tricky is if we want both mutable and non-mutable sequences. The ideal thing would be to have our existing subclass files act to subclass both a mutable and non-mutable generic sequence superclass, but I don't think that can be done. Is there anything about ObjC that would allow the equivalent functionality? JT _______________________________________________ This mind intentionally left blank From jtimmer at bellatlantic.net Wed Jul 6 18:55:43 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 18:55:43 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> Message-ID: > > On Jul 6, 2005, at 3:15 PM, John Timmer wrote: > >> I'm all for NSData, since I know how to use that ;). Next question - >> we're >> already using unichar's, which are 2 bytes - is that what we intend to >> stuff >> in the data, or should we think of recasting to unsigned chars? The >> second >> would probably greatly increase the efficiency of the code and make a >> lot >> more C code available to us, but we'd need to revamp a bunch of the >> classes. >> > > My preference would be to use the (unsigned) chars. Okay, that's three of us in agreement now. Is it time to do a project wide search for unichar and change it to unsigned char? If so, I'll volunteer. 
JT _______________________________________________ This mind intentionally left blank From charles.parnot at gmail.com Wed Jul 6 19:13:30 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 16:13:30 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <81FBF164-EBB7-4476-9693-F15DCBB4ECF4@gmail.com> Yes, char, no unichar. I am not sure how unichar got there in the first place. Probably from NSString. I hereby accept you being a volunteer... charles On Jul 6, 2005, at 3:55 PM, John Timmer wrote: >> >> On Jul 6, 2005, at 3:15 PM, John Timmer wrote: >> >> >>> I'm all for NSData, since I know how to use that ;). Next >>> question - >>> we're >>> already using unichar's, which are 2 bytes - is that what we >>> intend to >>> stuff >>> in the data, or should we think of recasting to unsigned chars? The >>> second >>> would probably greatly increase the efficiency of the code and >>> make a >>> lot >>> more C code available to us, but we'd need to revamp a bunch of the >>> classes. >>> >>> >> >> My preference would be to use the (unsigned) chars. >> > > Okay, that's three of us in agreement now. Is it time to do a > project wide > search for unichar and change it to unsigned char? If so, I'll > volunteer. > > JT > > > _______________________________________________ > This mind intentionally left blank > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Wed Jul 6 19:31:49 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 19:31:49 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <81FBF164-EBB7-4476-9693-F15DCBB4ECF4@gmail.com> Message-ID: > Yes, char, no unichar. 
I am not sure how unichar got there in the > first place. Probably from NSString. > > I hereby accept you being a volunteer... Fair enough, since I'm to blame for putting unichar in there in the first place. I put them there assuming that we'd mostly be converting from string->sequence, and Apple implies that NSStrings should be viewed as arrays of unichar's. It all made sense at the time.... JT _______________________________________________ This mind intentionally left blank From charles.parnot at gmail.com Wed Jul 6 19:45:33 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 16:45:33 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <0CD5F005-1213-40F2-9C9E-150C13FC42C8@gmail.com> Is there any way we could typedef it and keep it general, in case we want to change it later... Maybe that's not worth it? charles On Jul 6, 2005, at 4:31 PM, John Timmer wrote: >> Yes, char, no unichar. I am not sure how unichar got there in the >> first place. Probably from NSString. >> >> I hereby accept you being a volunteer... >> > > Fair enough, since I'm to blame for putting unichar in there in the > first > place. I put them there assuming that we'd mostly be converting from > string->sequence, and Apple implies that NSStrings should be viewed as > arrays of unichar's. It all made sense at the time.... > > JT > > _______________________________________________ > This mind intentionally left blank > > > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Wed Jul 6 19:57:27 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 06 Jul 2005 19:57:27 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <0CD5F005-1213-40F2-9C9E-150C13FC42C8@gmail.com> Message-ID: > Is there any way we could typedef it and keep it general, in case we > want to change it later... 
Maybe that's not worth it? > Seems dangerous - once we start working over the sequence using C functions as we apparently intend to do, changing the contents could be a nightmare. JT _______________________________________________ This mind intentionally left blank From charles.parnot at gmail.com Thu Jul 7 02:28:17 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 6 Jul 2005 23:28:17 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <0AB7789D-CBC9-4FC1-B980-18E3C0EB2B54@gmail.com> On Jul 6, 2005, at 4:57 PM, John Timmer wrote: >> Is there any way we could typedef it and keep it general, in case we >> want to change it later... Maybe that's not worth it? >> >> > > Seems dangerous - once we start working over the sequence using C > functions > as we apparently intend to do, changing the contents could be a > nightmare. > > JT Yes, that would also be silly in terms of the public interface. We want to return array of chars and not array of yet-another-typedef. char is good. There are 256 different ones, even enough to cover codons ;-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Thu Jul 7 20:26:03 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Thu, 7 Jul 2005 20:26:03 -0400 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> Message-ID: On Jul 6, 2005, at 6:30 PM, Charles Parnot wrote: >> One of the things I also read about on the biopython mailinglist is >> that we could keep the BCSequenceXXX classes very light, and create a >> new class BCSequenceRecord, that is used to store all info from a >> datafile. In that case, the BCSequence is just one of the key/value >> pairs. 
>> >> cheers, >> >> - Koen. > > Sorry, Koen, but I have no idea what you are talking about here! > What I am suggesting is that we keep the BCSequence class just for managing the sequence, such as creating, inserting, etc, and thus keep it very lightweight. To store any annotations and features we create a new class BCSequenceRecord. This is what will be created in the IO classes, and it will have a BCSequence object in one of its key-value pairs. So we remove NSMutableDictionary *annotations from BCAbstractSequence and move it to BCSequenceRecord. The advantage is that when you are just dealing with sequences, you want it to be as small as possible, and not carry the additional baggage of all the annotations. Does that make more sense? - Koen. From kvddrift at earthlink.net Thu Jul 7 20:29:47 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Thu, 7 Jul 2005 20:29:47 -0400 Subject: [Biococoa-dev] Using an untyped class sequence In-Reply-To: References: Message-ID: On Jul 6, 2005, at 6:53 PM, John Timmer wrote: > >> Anyway, if the majority wants subclasses, then that's fine with me. >> However, let's stick with the current subclasses, and create variations >> through the symbolsets. > Absolutely agreed. Great :) Now how do we structure the current BCSequenceXXX classes? Do we keep the BCSequence/BCAbstractSequence pseudo class-cluster, or are we changing it to a superclass and three regular subclasses, with as much code as possible in the superclass? I would vote for the latter, mainly because the former seems to be confusing in use. > > The only place where this gets tricky is if we want both mutable and > non-mutable sequences. The ideal thing would be to have our existing > subclass files act to subclass both a mutable and non-mutable generic > sequence superclass, but I don't think that can be done. Is there > anything > about ObjC that would allow the equivalent functionality? You mean multiple inheritance? No, I don't think that is possible in ObjC.
cheers, - Koen. From spam at bioinformatics.org Fri Jul 8 22:52:51 2005 From: spam at bioinformatics.org (Spam Eater at BiO) Date: Fri, 08 Jul 2005 22:52:51 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> References: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> Message-ID: <42CF3C03.2090904@bioinformatics.org> Caught in filters. ---------------------- On 6-jul-2005, at 23:54, Koen van der Drift wrote: > On Jul 6, 2005, at 3:15 PM, John Timmer wrote: > >> I'm all for NSData, since I know how to use that ;). Next question - >> we're >> already using unichar's, which are 2 bytes - is that what we intend >> to stuff >> in the data, or should we think of recasting to unsigned chars? The >> second >> would probably greatly increase the efficiency of the code and make a lot >> more C code available to us, but we'd need to revamp a bunch of the >> classes. >> >> > > My preference would be to use the (unsigned) chars. Yes, I couldn't agree more with Koen. One of the nice things about our new approach would be the ability to use common implementations and algorithms on the char array (encapsulated via NSData), and to need less (4x less) RAM/storage than if we used pointers. If we now started to use 2-byte unichars, first we would double the amount of space needed, and more importantly we would still have to make (a lot of?) adjustments to make it work with existing implementations. And given the fact that we don't have alphabets larger than 128 characters anyway (I presume), why would we want to use unichars specifically?
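The storage side of this argument is simple arithmetic, sketched below in plain C. The typedef bc_unichar is a stand-in, assumed here to match the 2-byte unichar mentioned in the thread, and the two function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Foundation's unichar, assumed to be a 16-bit unsigned
   integer (the "2 bytes" mentioned in the thread). */
typedef uint16_t bc_unichar;

/* Bytes needed to store symbol_count symbols as unsigned chars. */
static size_t bc_bytes_as_chars(size_t symbol_count)
{
    return symbol_count * sizeof(unsigned char);
}

/* Bytes needed to store the same symbols as 2-byte unichars. */
static size_t bc_bytes_as_unichars(size_t symbol_count)
{
    return symbol_count * sizeof(bc_unichar);
}
```

For a sequence of one million symbols this gives 1,000,000 bytes as unsigned chars against 2,000,000 bytes as unichars, on platforms with 8-bit bytes.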
Alex ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Microsoft is not the answer, Microsoft is the question, NO is the answer ********************************************************* From spam at bioinformatics.org Fri Jul 8 22:53:32 2005 From: spam at bioinformatics.org (Spam Eater at BiO) Date: Fri, 08 Jul 2005 22:53:32 -0400 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> Message-ID: <42CF3C2C.30800@bioinformatics.org> Caught in filters. --------------------- Hmm, although certainly nice to have a link between a view and ultimately the sequence, this is the task of a controller class, not the model... The model shouldn't know/care of the view. It might be nice, if we add controllers to the framework that do this though.. Alex On 6-jul-2005, at 23:56, Koen van der Drift wrote: > > On Jul 6, 2005, at 1:44 PM, Charles Parnot wrote: > > >> Then, we should only cache it in one place: the accessor. Though for >> mutable sequences, we also need to remember to set the cached array >> to nil after modifications. Something like this: >> >> - (NSArray *)symbolArray >> { >> if ( cachedArray == nil ) { >> cachedArray = ... //generate the array (and retain it) >> } >> return cachedArray; >> } >> >> - (void)emptyCache >> { >> [cachedArray release]; >> cachedArray = nil; >> } >> >> - (void)appendSequence:(BCSequence *)otherSequence >> { >> //code to append the sequence >> ... >> //empty the cache: the array is not valid anymore!! >> [self emptyCache]; >> } >> >> >> what do you think? 
>> >> > > > Looks like a good start to me :) Maybe we can write the code in such > a way that the BCSequence can be a delegate of another object (eg a view). > > > - Koen. > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com 4Peaks - For Peaks, Four Peaks. 2004 Winner of the Apple Design Awards Best Mac OS X Student Product http://www.mekentosj.com/4peaks ********************************************************* From charles.parnot at gmail.com Sat Jul 9 00:19:44 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Fri, 8 Jul 2005 21:19:44 -0700 Subject: [Biococoa-dev] Caching the NSArray of BCSymbol In-Reply-To: <42CF3C2C.30800@bioinformatics.org> References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> Message-ID: <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> > On 6-jul-2005, at 23:56, Koen van der Drift wrote: > >> >> Looks like a good start to me :) Maybe we can write the code in >> such a way that the BCSequence can
be a delegate of another object
>> (eg a view).
>>
>> - Koen.
>
> Hmm, although certainly nice to have a link between a view and
> ultimately the sequence, this is the task of a controller class,
> not the model... The model shouldn't know/care of the view. It
> might be nice, if we add controllers to the framework that do this
> though..
> Alex

I am not sure what you both mean. Just to add to the confusion: BCSequence can have a delegate and still be a model, no?

charles

--
Xgrid-at-Stanford
Help science move fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford

Charles Parnot
charles.parnot at gmail.com

From charles.parnot at gmail.com Sat Jul 9 00:25:35 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Fri, 8 Jul 2005 21:25:35 -0700
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com>
Message-ID:

> What I am suggesting is that we keep the BCSequence class just for
> managing the sequence, such as creating, inserting, etc, and thus
> keep it very lightweight. To store any annotations and features we
> create a new class BCSequenceRecord. This is what will be created
> in the IO classes, and it will have a BCSequence object in one of
> its key-value pairs. So we remove NSMutableDictionary *annotations
> from BCAbstractSequence and move it to BCSequenceRecord.
>
> The advantage is that when you are just dealing with sequences, you
> want it to be as small as possible, and don't have the additional
> luggage of all the annotations.
>
> Does that make more sense?
>
> - Koen.

You are suggesting to add yet another hierarchy of classes, 6 total ??!?? ;-)

Anyway, we already had that discussion and the bottom line was: the annotations would be just one ivar = 4 bytes when nil, slightly more when it is an empty dictionary, which is anyway very small compared to the sequence array of char.

What we should provide are 'annotation-free' methods when spitting out subsequences, alignments, etc., e.g. with an optional last argument:

- (BCSequence *)subsequenceWithRange:(NSRange)aRange keepAnnotations:(BOOL)flag

charles

From kvddrift at earthlink.net Sat Jul 9 00:38:04 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 9 Jul 2005 00:38:04 -0400
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com>
Message-ID: <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>

On Jul 9, 2005, at 12:25 AM, Charles Parnot wrote:

> You are suggesting to add yet another hierarchy of classes, 6 total
> ??!?? ;-)

Nooooooo !!!!!! :D

> Anyway, we already had that discussion and the bottom line was: the
> annotations would be just one ivar = 4 bytes when nil, slightly more
> when empty dictionary, which is anyway very small compared to the
> sequence array of char.

I am suggesting to keep the annotations outside the BCSequence class. BCSequence is only a sequence of symbols, nothing more, nothing less. BCSequenceRecord (only one class, no subclasses) is basically a dictionary with all the key-value pairs from an IO file, and it has-a BCSequence in one of the key-value pairs.

- Koen.
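The `keepAnnotations:` variant Charles proposes above could look roughly like the sketch below. This is only an illustration under stated assumptions: `SketchSequence`, its `symbols` string and its initializer are hypothetical stand-ins, not the real BCSequence API, and the code uses manual retain/release as elsewhere in this thread. It shows that a nil annotations ivar costs nothing beyond the pointer itself, and that the caller decides whether metadata travels with the subsequence.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSequence; not the real BioCocoa class.
@interface SketchSequence : NSObject
{
    NSString *symbols;          // the raw sequence, e.g. @"ATGC"
    NSDictionary *annotations;  // stays nil when there is no metadata
}
- (id)initWithSymbols:(NSString *)aString annotations:(NSDictionary *)aDict;
- (NSString *)symbols;
- (NSDictionary *)annotations;
- (SketchSequence *)subsequenceWithRange:(NSRange)aRange keepAnnotations:(BOOL)flag;
@end

@implementation SketchSequence

- (id)initWithSymbols:(NSString *)aString annotations:(NSDictionary *)aDict
{
    self = [super init];
    if (self != nil) {
        symbols = [aString copy];
        annotations = [aDict copy];   // messaging nil returns nil, so nil stays nil
    }
    return self;
}

- (void)dealloc
{
    [symbols release];
    [annotations release];
    [super dealloc];
}

- (NSString *)symbols { return symbols; }
- (NSDictionary *)annotations { return annotations; }

- (SketchSequence *)subsequenceWithRange:(NSRange)aRange keepAnnotations:(BOOL)flag
{
    // substringWithRange: raises NSRangeException if aRange is out of bounds.
    NSString *sub = [symbols substringWithRange:aRange];
    // Skip all annotation work when the caller opts out or there is none.
    NSDictionary *kept = (flag && annotations != nil) ? annotations : nil;
    return [[[SketchSequence alloc] initWithSymbols:sub annotations:kept] autorelease];
}

@end

int main(void)
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    NSDictionary *meta = [NSDictionary dictionaryWithObject:@"Homo sapiens"
                                                     forKey:@"species"];
    SketchSequence *seq = [[[SketchSequence alloc] initWithSymbols:@"ATGCATGC"
                                                       annotations:meta] autorelease];

    // Characters 2..5 of the sequence, i.e. "GCAT", with and without metadata.
    SketchSequence *bare = [seq subsequenceWithRange:NSMakeRange(2, 4) keepAnnotations:NO];
    SketchSequence *full = [seq subsequenceWithRange:NSMakeRange(2, 4) keepAnnotations:YES];

    NSLog(@"bare: %@ annotations: %@", [bare symbols], [bare annotations]);
    NSLog(@"full: %@ annotations: %@", [full symbols], [full annotations]);

    [pool release];
    return 0;
}
```

The sketch copies sequence-wide annotations unchanged. Position-dependent features would additionally need their ranges clipped and shifted to the subsequence, which is left out here.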
From kvddrift at earthlink.net Sat Jul 9 00:39:27 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 9 Jul 2005 00:39:27 -0400
Subject: [Biococoa-dev] Caching the NSArray of BCSymbol
In-Reply-To: <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com>
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com>
Message-ID:

On Jul 9, 2005, at 12:19 AM, Charles Parnot wrote:

>> Hmm, although certainly nice to have a link between a view and
>> ultimately the sequence, this is the task of a controller class, not
>> the model... The model shouldn't know/care of the view. It might be
>> nice, if we add controllers to the framework that do this though..
>> Alex
>
> I am not sure what you both mean. Just to add to the confusion:
> BCSequence can have a delegate and still be a model, no?

Why not let BCSequence post notifications when something has changed? This way it is not connected to a view, but any object (including a controller or a view) can respond to it.

- Koen.

From charles.parnot at gmail.com Sat Jul 9 02:28:15 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Fri, 8 Jul 2005 23:28:15 -0700
Subject: [Biococoa-dev] Caching the NSArray of BCSymbol
In-Reply-To:
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com>
Message-ID: <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com>

On Jul 8, 2005, at 9:39 PM, Koen van der Drift wrote:

> On Jul 9, 2005, at 12:19 AM, Charles Parnot wrote:
>
>>> Hmm, although certainly nice to have a link between a view and
>>> ultimately the sequence, this is the task of a controller class,
>>> not the model... The model shouldn't know/care of the view. It
>>> might be nice, if we add controllers to the framework that do
>>> this though..
>>> Alex
>>
>> I am not sure what you both mean. Just to add to the confusion:
>> BCSequence can have a delegate and still be a model, no?
>
> Why not let BCSequence post notifications when something has
> changed? This way it is not connected to a view, but any object
> (including a controller or a view) can respond to it.
>
> - Koen.

Yes, the standard in Cocoa seems to be to have both a delegate and accompanying notifications, which the delegate always gets without subscribing.

charles

From charles.parnot at gmail.com Sat Jul 9 02:40:54 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Fri, 8 Jul 2005 23:40:54 -0700
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To: <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
Message-ID:

On Jul 8, 2005, at 9:38 PM, Koen van der Drift wrote:

> On Jul 9, 2005, at 12:25 AM, Charles Parnot wrote:
>
>> You are suggesting to add yet another hierarchy of classes, 6
>> total ??!?? ;-)
>
> Nooooooo !!!!!! :D
>
>> Anyway, we already had that discussion and the bottom line was:
>> the annotations would be just one ivar = 4 bytes when nil,
>> slightly more when empty dictionary, which is anyway very small
>> compared to the sequence array of char.
>
> I am suggesting to keep the annotations outside the BCSequence
> class. BCSequence is only a sequence of symbols nothing more,
> nothing less. BCSequenceRecord (only one class, no subclasses) is
> basically a dictionary with all the key-value pairs from a IO file
> and has-a BCSequence. in one of the key-value pairs.
>
> - Koen.

OK, just one class.

But, still, your initial argument seems to be that BCSequence should remain a light object, hence no annotations.
However, an additional ivar for the annotations will not make it heavier if it is set to nil. Having more methods in the BCSequence object (to deal with annotations) will also not make a particular instance heavier.

Now, many methods will check for the annotations when manipulating the sequence (subsequences, insertions, deletions,...). But again, there should always be an 'if annotations==nil' to skip the annotation manipulations and keep the method speedy and focused on the sequence string.

So I would have to first be convinced that the annotations would be bad to have in BCSequence, before adding yet another class. Of course, I have to admit Apple decided to have two classes, NSString and NSAttributedString, in a somewhat similar case... though note that NSAttributedString is not a subclass of NSString: it inherits from NSObject and wraps the characters (available through its -string method) together with their attributes.

Sorry, I am just thinking aloud without too much logic and structure. Just throwing ideas and brainstorming...

charles

From kvddrift at earthlink.net Sat Jul 9 08:02:46 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 9 Jul 2005 08:02:46 -0400
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
Message-ID:

On Jul 9, 2005, at 2:40 AM, Charles Parnot wrote:

> But, still, your initial argument seems to be that BCSequence should
> remain a light object, hence no annotations. However, an additional
> ivar for the annotations will not make it heavier if it is set to nil.

What I want to avoid is that, when a large datafile is read (eg pdb or swissprot), all the metadata is stuffed into the annotations, making the sequence object heavy. However, sequence-specific information such as C=C bonds, PTM, probably should be in the sequence object. Or would they then be called features?

cheers,

- Koen.

From mek at mekentosj.com Sat Jul 9 13:58:23 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 19:58:23 +0200
Subject: [Biococoa-dev] NSMutableData vs malloc
In-Reply-To: <42CF3C03.2090904@bioinformatics.org>
References: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> <42CF3C03.2090904@bioinformatics.org>
Message-ID: <5B29B50B-7434-4E95-B613-EA8CDC768F3B@mekentosj.com>

I'm certainly getting emails sent from Spam Eater at BiO, what is this? And more important, did you guys get my messages in the end?
Alex

On 9-jul-2005, at 4:52, Spam Eater at BiO wrote:

> Caught in filters.

From mek at mekentosj.com Sat Jul 9 14:00:05 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 20:00:05 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com>
Message-ID: <0345BA6F-E4A3-4D2C-BF26-7C01D39BF805@mekentosj.com>

Yep, that's the way: just indicate whether you would like a bare sequence (faster) or a completely annotated bcsequence to be returned, either from methods like the one Charles mentions, or from tools like a translator tool that returns a sequence...
Cheers,
Alex

From mek at mekentosj.com Sat Jul 9 14:36:15 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 20:36:15 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com>
Message-ID:

I'm not sure; the question is how heavy the annotations will be, I think not too much. The reason I'm kind of reluctant is the idea, for instance, that NSAttributedString would be separated into an NSString and an NSStringAttributes object; somehow this doesn't make sense to me. It will also be more problematic to keep things in sync after editing, and would certainly require all kinds of notification and delegate "hacks" to make it work. In the end I don't think we even need lightweight sequences so much.
After all, we're passing around pointers to objects, so imagine a bcalignment: it will get the pointers to the sequences, it will use the ivar to the raw data to get access to the char arrays, and do its stuff. Whether or not the bcsequence object contains the annotations doesn't make a millisecond or a kB of RAM of difference!! This is one of the nicest things about the char array setup, in fact.

I don't like the idea of a separate record for the annotations too much. The question in the end comes down to whether we see the sequence as the center of our universe, also containing annotations, or whether we see the metadata as the most important part, with one of its attributes being the sequence data in the form of a bcsequence object. I don't get the overall picture. If you really want to do the separation, it would make even more sense to me to make BCSequence the metadata/annotations object and have a separate BCSequenceData object...
Alex

From mek at mekentosj.com Sat Jul 9 14:08:37 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 20:08:37 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To:
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
Message-ID: <70B585AD-C729-47D2-BB7F-75C5CC3DB1E1@mekentosj.com>

The feature vs annotation discussion is another one to tackle. I think an annotation is independent of position, while a feature is tied to a position. Author, creation date, species etc would be annotations; phosphorylation, alpha-helix, etc would be features. But there are many things in the gray area.

Again, I'm with Charles: it doesn't make sense to separate annotations/features from sequences on the grounds that we would end up with more lightweight sequences, because we don't. In the end we have to keep both around, and whether that is in one or two objects doesn't make a difference in RAM (in fact, it's probably more because of overlap).
In addition, it makes syncing much more difficult. If you are not interested in the annotations, don't touch them; we use pointers anyway, so passing around sequences with or without annotations doesn't make a difference. Finally, the particular problem you mention (large datafiles) has nothing to do with this problem: in your case simply ask the importer to not handle the metadata and only import the sequence (an option we definitely have to build into the seqIO classes). Plus, BCSequence should have methods to ask for a (sub)version of the sequence with or WITHOUT annotations. Charles gave an example already...
Alex

From mek at mekentosj.com Sat Jul 9 14:36:16 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 20:36:16 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To: <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <89ff7a2153e533c9ac047bc50c5c71e4@earthlink.net>
Message-ID:

>> Anyway, we already had that discussion and the bottom line was:
>> the annotations would be just one ivar = 4 bytes when nil,
>> slightly more when empty dictionary, which is anyway very small
>> compared to the sequence array of char.
>
> I am suggesting to keep the annotations outside the BCSequence
> class. BCSequence is only a sequence of symbols nothing more,
> nothing less. BCSequenceRecord (only one class, no subclasses) is
> basically a dictionary with all the key-value pairs from a IO file
> and has-a BCSequence. in one of the key-value pairs.

Well, I certainly hope BCSequence is more than a sequence of symbols, otherwise just use NSArray....
Alex

From mek at mekentosj.com Sat Jul 9 14:36:32 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sat, 9 Jul 2005 20:36:32 +0200
Subject: [Biococoa-dev] Caching the NSArray of BCSymbol
In-Reply-To: <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com>
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com>
Message-ID:

Wait!

Ok, let me begin with Charles' question about what I'm talking about. The answer is yes, the sequence can have a delegate and still be a model. BUT, let me try to explain what I meant with my remark and how we can do it in a better way.

First, the model-view-controller (MVC) design pattern. BCSequence is our model class, it contains the data. The idea behind this pattern is that the model never has any idea about the view its data is presented in, and is thus also not responsible for updates, data changes etc. This is all controlled by the controller, which insulates both from each other and keeps both in sync. So if for instance the user edits the sequence in the GUI, this should be picked up by the controller, which makes sure the edit is forwarded to the model object. Thus, it would not make sense to have the model for instance being a delegate of the view, hence bypassing the controller. Also, why should the controller be the delegate of the model? The ONLY one changing the model is the controller, so if it does it should know already, right!

Now, I admit there are situations where this is not completely true. Imagine an alignment controller A which aligns several sequences and communicates this via an alignment view. Now the same sequences could potentially be altered by a different controller B which presents the user with a standard sequence editing window. Now of course we would like controller A to redo the alignment and update the view.
According to the MVC pattern it would be controller B's task to initiate the update of A (via delegation or notification), so officially no problem and still no need for the model to contain notifications/delegations. Still, sometimes it would be handy to be able to monitor the changes made to a model object if multiple controllers potentially edit the same model object.

Now what: notifications or delegation? I don't like the idea of sending "public" notifications into the air, and remember, if we create or edit sequences a lot (which you do with model objects), this could create a lot of overhead and could clog the notification system. Delegation is more direct, but then again, you can only have one delegate, a severe limit for situations as described above!! In addition, in both cases, to make it work the code (most notably the accessors) will have to be polluted with delegate invocations (plus a check whether there actually is a delegate, plus a check whether the delegate responds to that call) or notification dispatches. Aaaaaahhh!!!

Luckily there's something better, called key-value observing (KVO), allowing exactly what you guys are looking for, available in 10.3 or higher (now, did we all agree already that 10.3 is our lower limit?). It's part of the famous "bindings" system, and allows controllers to register themselves as observers for certain properties of the model object and get notifications when these change so they can do their thing. Sounds pretty good huh! How does it work? We just have to be key-value coding (KVC) compliant, which we are already, hooray! NSObject provides automatic support! If we decide that at some point we need more control, there are additional methods that can be called to trigger observers, like willChangeValueForKey: and didChangeValueForKey:. I strongly suggest to not go for delegates, not for notifications, but for KVO, which means we don't have to do anything, nice huh!
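A minimal sketch of the KVO registration described above, assuming a hypothetical model class `SketchModel` with a single `sequenceString` property and a hypothetical observer `SketchController` (neither is part of BioCocoa). It relies only on NSObject's automatic KVO support for KVC-compliant setters, and uses manual retain/release as elsewhere in this thread.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical model class standing in for BCSequence.
@interface SketchModel : NSObject
{
    NSString *sequenceString;
}
- (NSString *)sequenceString;
- (void)setSequenceString:(NSString *)aString;
@end

@implementation SketchModel

- (NSString *)sequenceString { return sequenceString; }

- (void)setSequenceString:(NSString *)aString
{
    // A plain KVC-compliant setter: no delegate calls, no notifications posted.
    // Automatic KVO notifies registered observers around this call.
    if (aString != sequenceString) {
        [sequenceString release];
        sequenceString = [aString copy];
    }
}

- (void)dealloc
{
    [sequenceString release];
    [super dealloc];
}

@end

// Hypothetical controller that wants to hear about model changes.
@interface SketchController : NSObject
@end

@implementation SketchController

- (void)observeValueForKeyPath:(NSString *)keyPath
                      ofObject:(id)object
                        change:(NSDictionary *)change
                       context:(void *)context
{
    // Every observed key arrives in this one method, so we test the key path.
    if ([keyPath isEqualToString:@"sequenceString"]) {
        NSLog(@"sequence changed to %@", [change objectForKey:NSKeyValueChangeNewKey]);
    }
}

@end

int main(void)
{
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];
    SketchModel *model = [[[SketchModel alloc] init] autorelease];
    SketchController *controller = [[[SketchController alloc] init] autorelease];

    [model addObserver:controller
            forKeyPath:@"sequenceString"
               options:NSKeyValueObservingOptionNew
               context:NULL];

    [model setSequenceString:@"ATGC"];   // triggers the observer callback

    // Observers must be removed before the observed object goes away.
    [model removeObserver:controller forKeyPath:@"sequenceString"];

    [pool release];
    return 0;
}
```

Note that automatic notification only covers changes made through the accessor (or through KVC). Code that writes the ivar directly would have to bracket the change with willChangeValueForKey: and didChangeValueForKey: itself.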
Here's more info on KVO:
http://developer.apple.com/documentation/Cocoa/Conceptual/KeyValueObserving/KeyValueObserving.html

Cheers,
Alex

From charles.parnot at gmail.com Sat Jul 9 15:11:59 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Sat, 9 Jul 2005 12:11:59 -0700
Subject: [Biococoa-dev] KVO vs. delegate
In-Reply-To:
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com>
Message-ID: <7F51A234-79F5-4877-B791-2B1C1567A94C@gmail.com>

I agree with you, Alex, that there won't be too many cases where the BCSequence needs to notify about what it does.

In the case of model objects, delegates and notifications are good for callbacks. For instance, IO and networking, when you want to be notified by the model object (e.g. a file handle, a server, ...) when something happens, asynchronously. In the case of BCSequence, there is nothing like that... at least at present. Maybe in the future, we will have tasks performed asynchronously, for instance alignments or lengthy calculations.
In these cases, delegates and notifications are good. I am not even sure it would be BCSequence dealing with this, probably BCTools. So I am saying KVO should be enough for BCSequence.

Now, just as a general discussion, I want to emphasize the limitations of KVO, actually more the PITA aspect of it. I am fighting a lot with the Xgrid APIs (the fight is almost over, and I think I am winning). The reason is there are no delegates (except in 2 classes) and no notifications. You are supposed to do everything with KVO. And you have all sorts of asynchronous stuff happening: every action done on a job, a server, a grid, results,... is sent to the server, and you have to 'observe' a different key and a different object for each. The problem is: there is only one callback method, which is:

- (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object change:(NSDictionary *)change context:(void *)context

So all the events end up calling that method, and you have to go through a bunch of 'if's to test all the possibilities. Also, the method tends to be called very often, depending on how the code was written and on what the bindings and UI are doing, which you can't control. For instance, I see the array of jobs of a grid go from empty array, then nil, then empty array, then half full, then half empty, then all the jobs are here. To avoid all the 'if's, you could have a different class for each event, each with its own code, but that is not feasible when you have 20 different things happening, because then you need 20 classes.

Conversely, with the delegate design, each event is dispatched to a different method, and the code is much easier to write, debug and maintain. The code for the Xgrid classes that have a delegate was a breeze to write and test.

This is all to show you where the line between a delegate/notification design and a KVO-only design is, and what the user of the framework will potentially face. Of course, the BIG advantage of KVO is bindings.
And this is really very cool. I am using it a lot now, together with Core Data, and it is really fun. BTW, John, we will have to talk about Core Data at some point ;-) charles On Jul 9, 2005, at 11:36 AM, Alexander Griekspoor wrote: > Wait! > > Ok, let me begin with Charles question what I'm talking about. The > answer is yes, the sequence can have a delegate and still be a > model yes. > BUT, let me try to convince what I meant with my remark and how we > can do it in a better way. > First, the model-view-controller (MVC) design pattern. BCSequence > is our model class, it contains the data. The idea behind this > pattern is that the model never has any idea about the view its > data is presented in, and is thus also not responsible for updates, > data changes etc. This is all controlled by the controller which > insulated both from each other and keeps both in sync. So if for > instance the users edits the sequence in the gui, this should be > picked up by the controller which makes sure the edit is forwarded > to the model object. Thus, it would not make sense to have the > model for instance being a delegate of the view, hence bypassing > the controller. Also, why should the controller be the delegate of > the model? The ONLY one changing the model is the controller, so if > it does it should know already right! > Now, I admit there are situations that this is not completely true. > Imagine a alignment controller A which aligns several sequences and > communicates this via an alignment view. Now the same sequences > could potentially be altered by a different controller B which > presents the user with a standard sequence editing window. Now of > course we would like controller A to redo the alignment and update > the view. According to the MVC pattern it would be controller B's > task to initiate the update of A (via delegation or notification), > so officially no problem and still no need for the model to contain > notifications/delegations. 
Still, sometimes it would be handy to be > able to monitor the changes made to a model object if multiple > controllers potentially edit the same model object. > Now what, notifications, delegations. I don't like the idea of > sending "public" notifications into the air, and remember if we > create or edit sequences a lot (which you do with model objects), > this could create a lot of overhead, and could clog the > notification system). Now delegates is more direct, but then again, > you could only have one delegate, a severe limit for situations as > described above!! In addition, in both case to make it work the > code (most notable the accessors) will have to be polluted with > delegation invocations (plus a check whether there actually is a > delegate, plus a check whether the delegate responds to that call) > or notification dispatches. Aaaaaahhh!!! > Luckily there's something better, called Key-value-observing (KVO), > allowing exactly what you guys are looking for, available in 10.3 > or higher (now, did we all agree already that 10.3 is our lower > limit). It's part of the famous "binding" system, and allows > controllers to register themselves as observers for certain > properties of the model object and get notifications when these > changes so they can do their things. Sounds pretty good huh! How > does it work? We just have to be key-value-coding (KVC) compliant, > which we are already, hooray! NSObject provides automatic support! > If we decide that at somepoints we need more control, there are > additional methods that can be called to trigger observers like > didChangeValueForKey. I strongly suggest to not go for delegates, > not for notifications, but for KVO, which means we don't have to do > anything, nice huh! 
Here's more info on KVO: > http://developer.apple.com/documentation/Cocoa/Conceptual/ > KeyValueObserving/KeyValueObserving.html > > Cheers, > Alex > > > > > > On 9-jul-2005, at 8:28, Charles Parnot wrote: > >> On Jul 8, 2005, at 9:39 PM, Koen van der Drift wrote: >> >> >>> >>> On Jul 9, 2005, at 12:19 AM, Charles Parnot wrote: >>> >>> >>> >>>>> Hmm, although certainly nice to have a link between a view and >>>>> ultimately the sequence, this is the task of a controller >>>>> class, not the model... The model shouldn't know/care of the >>>>> view. It might be nice, if we add controllers to the framework >>>>> that do this though.. >>>>> Alex >>>>> >>>>> >>>> >>>> I am not sure what you both mean. Just to add to the confusion: >>>> BCSequence can have a delegate and still be a model, no? >>>> >>>> >>>> >>> >>> Why not let BCSequence post notifications when something has >>> changed? This way it is ot conected to a view, but and any object >>> (including a controller or a view) can respond to it. >>> >>> >>> - Koen. >>> >> >> Yes, the standard in Cocoa seems to have both a delegate and >> accompanying notifications, which the delegate always get without >> subscribing. 
>> >> charles >> >> -- >> Xgrid-at-Stanford >> Help science move fast forward: >> http://cmgm.stanford.edu/~cparnot/xgrid-stanford >> >> Charles Parnot >> charles.parnot at gmail.com >> >> >> >> _______________________________________________ >> Biococoa-dev mailing list >> Biococoa-dev at bioinformatics.org >> https://bioinformatics.org/mailman/listinfo/biococoa-dev >> >> > > ********************************************************* > ** Alexander Griekspoor ** > ********************************************************* > The Netherlands Cancer Institute > Department of Tumorbiology (H4) > Plesmanlaan 121, 1066 CX, Amsterdam > Tel: + 31 20 - 512 2023 > Fax: + 31 20 - 512 2029 > AIM: mekentosj at mac.com > E-mail: a.griekspoor at nki.nl > Web: http://www.mekentosj.com > > Microsoft is not the answer, > Microsoft is the question, > NO is the answer > > ********************************************************* > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sat Jul 9 15:16:37 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sat, 9 Jul 2005 15:16:37 -0400 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> Message-ID: <086b453787e54ee033705ff6b92ce9ce@earthlink.net> On Jul 9, 2005, at 2:36 PM, Alexander Griekspoor wrote: > I'm not sure, the question is how heavy the annotations will be, I > think not too much. The question why I'm kind of reluctant is the idea > for instance that NSAttributedString would be separated into an > NSString and an NSStringAttributes object, somehow this doesn't make > sense to make. It will also be more problematic to keep things in sync > after editing, and would certainly require all kinds of notification > and delegates "hacks" to make it work. 
In the end I don't think we > even need light weight sequences so much. After all, we're passing > around pointers to objects, so imagine a bcalignment, it will get the > pointers to the sequences, it will use the ivar to the raw data to get > access to the char arrays, and do it stuff. Whether or not the > bcsequence object contains the annotations or not doesn't make a > millisecond or kb or ram difference!! This is one of the nicest things > of the char array setup in fact. I don't like the idea of a separate > record for the annotations too much. The question in the end comes to > whether we see the sequence as the center of our universe, also > containing annotations, or whether we see the metadata as the most > important part, with one of its attributes being the sequence data in > the form of a bcsequence object. I don't get the overall picture. If > you really want to do the separation, it would make even more sense to > me to make BCSequence the metadata/annotations object and have a > separate BCSequenceData object... > Alex

Hi Alex,

I think you convinced me :)

Let's start implementing the char array and BCSequence structure first. Are we merging BCSequence and BCAbstractSequence so that we will just have a regular subclass-superclass structure? For the ivars I suggest making them immutable:

    const char *sequence
    NSArray *symbolArray

And what will be the role of the BCParser class? It's still not very clear to me how this will fit into the BCFoundation picture.

cheers,

- Koen.

From mek at mekentosj.com Sat Jul 9 20:06:45 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 10 Jul 2005 02:06:45 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To: <086b453787e54ee033705ff6b92ce9ce@earthlink.net>
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net>
Message-ID: <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com>

You think?!
Just kidding ;-)

About the implementation: even if we use single-byte const chars, I still like Charles's idea of using NSMutableData as the internal data store and providing an accessor to it. The NSData -bytes method, cast to const char *, gives you direct access to the C array. Indeed, the ivars are best kept read-only, as all editing should go through the methods of the class, which also do the syncing of features/annotations. I would have absolutely no problem (certainly for performance reasons) with just having the accessor be

- (NSData *)sequenceData;

while internally it really is an NSMutableData. John tends to be very picky about this one, but I think it is fair that the user should respect that things only work for what we say we hand him, not for what he really gets. I know that Apple does this as well: their methods sometimes tell you that you will get an NSArray, while in fact it is an NSMutableArray. But don't blame them if you start editing the array and things behave weirdly.

Again, the main reason for going for NSMutableData instead of directly using a char array is that we don't have to do the memory allocation/management ourselves. For instance, from the NSMutableData documentation:

- (void)replaceBytesInRange:(NSRange)range withBytes:(const void *)replacementBytes length:(unsigned)replacementLength

Replaces the range within the contents of the receiver with replacementBytes. If the length of range is not equal to replacementLength, the receiver is resized to accommodate the new bytes. Any bytes past range in the receiver are shifted to accommodate the new bytes.

This is just fantastic compared with having to do all the malloc and pointer shifting etc. ourselves!

Cheers,
Alex

PS. Any clues on the spam stuff? I think none of my 5 messages have been posted to the list yet, while you, Koen, for instance got the direct one immediately, so it seems.
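The resizing behaviour described in that documentation excerpt can be sketched as follows. This is only an illustration, assuming a current Foundation toolchain with ARC; BCSketchReplace and BCSketchDemo are hypothetical helper names and not part of BioCocoa.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Hypothetical helper (not BioCocoa API): replaces `range` in `store` with the
// bytes of a C string and returns the resulting contents as an NSString.
static NSString *BCSketchReplace(NSMutableData *store, NSRange range, const char *replacement) {
    // NSMutableData grows or shrinks as needed; no malloc or memmove on our side.
    [store replaceBytesInRange:range
                     withBytes:replacement
                        length:strlen(replacement)];

    // Read-only view of the underlying C array, as suggested in the message.
    const char *bytes = (const char *)[store bytes];
    return [[NSString alloc] initWithBytes:bytes
                                    length:[store length]
                                  encoding:NSASCIIStringEncoding];
}

// Hypothetical demo: one byte per symbol, replacing the 2 bytes "CC" at
// offset 3 with the 4 bytes "GGTT".
static NSString *BCSketchDemo(void) {
    NSMutableData *store = [NSMutableData dataWithBytes:"ATGCCGTA" length:8];
    return BCSketchReplace(store, NSMakeRange(3, 2), "GGTT");
}
```

Calling BCSketchDemo() should yield "ATGGGTTGTA": the store grows from 8 to 10 bytes and the trailing "GTA" is shifted automatically.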
On 9-jul-2005, at 21:16, Koen van der Drift wrote: > On Jul 9, 2005, at 2:36 PM, Alexander Griekspoor wrote: > > >> I'm not sure, the question is how heavy the annotations will be, I >> think not too much. The question why I'm kind of reluctant is the >> idea for instance that NSAttributedString would be separated into >> an NSString and an NSStringAttributes object, somehow this doesn't >> make sense to make. It will also be more problematic to keep >> things in sync after editing, and would certainly require all >> kinds of notification and delegates "hacks" to make it work. In >> the end I don't think we even need light weight sequences so much. >> After all, we're passing around pointers to objects, so imagine a >> bcalignment, it will get the pointers to the sequences, it will >> use the ivar to the raw data to get access to the char arrays, and >> do it stuff. Whether or not the bcsequence object contains the >> annotations or not doesn't make a millisecond or kb or ram >> difference!! This is one of the nicest things of the char array >> setup in fact. I don't like the idea of a separate record for the >> annotations too much. The question in the end comes to whether we >> see the sequence as the center of our universe, also containing >> annotations, or whether we see the metadata as the most important >> part, with one of its attributes being the sequence data in the >> form of a bcsequence object. I don't get the overall picture. If >> you really want to do the separation, it would make even more >> sense to me to make BCSequence the metadata/annotations object and >> have a separate BCSequenceData object... >> Alex >> > > Hi Alex, > > I think you convinced me :) > > Let's start implementing the char array and BCSequence structure > first. Are we merging BCSeqeunce and BCAbstractSequence so that we > just will have a regular subclass - superclass structure? 
For the > ivars I suggest to make them immutable: > > const char *sequence > NSArray *symbolArray > > > And what will be the role of the BCParser class? It's still not > very clear to me how this will fit into the BCFOundation picture. > > cheers, > > - Koen. > > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 E-mail: a.griekspoor at nki.nl AIM: mekentosj at mac.com Web: http://www.mekentosj.com EnzymeX - To cut or not to cut http://www.mekentosj.com/enzymex ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From kvddrift at earthlink.net Sat Jul 9 20:16:20 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sat, 9 Jul 2005 20:16:20 -0400 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> Message-ID: <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net> On Jul 9, 2005, at 8:06 PM, Alexander Griekspoor wrote: > About the implementation, even if we use single byte const chars, I > still like the idea a lot of Charles to use NSMutableData as the > internal datastore and provide an accessor to it. The NSData -bytes > method casted to const char will give you direct access to the c > array. Indeed the ivars can best be held read only as all editing > should occur through class methods that also do the syncing of > features/annotations. 
I would have absolutely no problem (certainly > for performance reasons) to just have the accessor be - (NSData > *)sequenceData; while internally it really is an NSMutableData. John > uses to be very picky about this one, but it is fair I think that the > user should respect that things only work for what we tell that we > hand him, not what he really gets. I know that also Apple does this, > their methods sometimes tell you you will get an NSArray, while in > fact it is a NSMutableArray. But don't blame them if you start editing > the array and things behave weird. > > Again the main reason for going for NSMutableData instead of directly > using a char array is that we don't have to do the memory > allocation/management. For instance from the NSMutableData documents: > > > - (void)replaceBytesInRange:(NSRange)range withBytes:(const void > *)replacementBytes length:(unsigned)replacementLength > Replaces the range within the contents of the receiver with > replacementBytes. If the length of range is not equal to > replacementLength, the receiver is resized to accommodate the new > bytes. Any bytes past range in the receiver are shifted to accommodate > the new bytes. > > This is just fantastic instead of having to do all malloc and pointer > shifting etc ourselves!

I like the NSData approach. So if we go for NSMutableData, we will have mutable sequences, correct? Are we also going to add immutable sequences?

> Ps. Any clues on the spam stuff, I think none of my 5 messages have > yet been posted to the list while you Koen for instance got the direct > one immediately so it seems.

I saw 2 spam messages, and 3 messages from you, Alex, a few hours ago.

- Koen.
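The trade-off behind handing out an NSMutableData typed as NSData can be illustrated with a small sketch. It assumes a current Foundation toolchain with ARC, and BCSketchAliasSeesEdit is a hypothetical name used only for this example.

```objc
#import <Foundation/Foundation.h>

// Hypothetical check (not BioCocoa API): compares an NSData-typed alias of a
// mutable store with an immutable copy, after the store has been edited.
static BOOL BCSketchAliasSeesEdit(void) {
    NSMutableData *store = [NSMutableData dataWithBytes:"ATGC" length:4];

    // What an accessor declared as returning NSData would hand out without copying.
    NSData *handedOut = store;

    // An immutable snapshot: -copy on NSMutableData returns an immutable NSData.
    NSData *snapshot = [store copy];

    // A later edit made through the owning object.
    [store appendBytes:"AA" length:2];

    // The alias follows the edit (6 bytes); the snapshot does not (4 bytes).
    return [handedOut length] == 6 && [snapshot length] == 4;
}
```

The alias costs nothing but changes underneath the caller; the copy is stable but is paid for on every access. Which one fits better depends on how often callers hold on to the returned data.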
From kvddrift at earthlink.net Sat Jul 9 21:12:18 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 9 Jul 2005 21:12:18 -0400
Subject: [Biococoa-dev] NSMutableData vs malloc
In-Reply-To: <5B29B50B-7434-4E95-B613-EA8CDC768F3B@mekentosj.com>
References: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> <42CF3C03.2090904@bioinformatics.org> <5B29B50B-7434-4E95-B613-EA8CDC768F3B@mekentosj.com>
Message-ID: <6f2e5ef370c1aa7f92a47d6ec3d13f97@earthlink.net>

Maybe you can see them in the archives: http://bioinformatics.org/pipermail/biococoa-dev/2005-July/date.html

Just received 5 messages from you.

- Koen.

On Jul 9, 2005, at 1:58 PM, Alexander Griekspoor wrote: > I get certainly emails send from Spam Eater at BiO, what is this? And > more important, did you guys got my messages in the end? > Alex > > On 9-jul-2005, at 4:52, Spam Eater at BiO wrote: > >> Caught in filters. >> > > ************************************************************** > ** Alexander Griekspoor ** > ************************************************************** > The Netherlands Cancer Institute > Department of Tumorbiology (H4) > Plesmanlaan 121, 1066 CX, Amsterdam > Tel: + 31 20 - 512 2023 > Fax: + 31 20 - 512 2029 > AIM: mekentosj at mac.com > E-mail: a.griekspoor at nki.nl >
Web: http://www.mekentosj.com > > MacOS X: The power of UNIX with the simplicity of the Mac > > *************************************************************** > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev

From kvddrift at earthlink.net Sat Jul 9 21:15:40 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sat, 9 Jul 2005 21:15:40 -0400
Subject: [Biococoa-dev] Caching the NSArray of BCSymbol
In-Reply-To:
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com>
Message-ID: <17c5e2862bfba9543f27b4a0aa1a8e5b@earthlink.net>

On Jul 9, 2005, at 2:36 PM, Alexander Griekspoor wrote: > I strongly suggest to not go for delegates, not for notifications, > but for KVO, which means we don't have to do anything, nice huh!

I especially like the second part :)

So we're going for 10.3 as a minimum requirement? I think this was brought up earlier this week too, but I forgot which thread.

- Koen.

From mek at mekentosj.com Sun Jul 10 04:16:48 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 10 Jul 2005 10:16:48 +0200
Subject: [Biococoa-dev] BCSequenceRecord
In-Reply-To: <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net>
References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net>
Message-ID: <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com>

On 10-jul-2005, at 2:16, Koen van der Drift wrote: > > I like the NSData approach. So if we go for NSMutableData, we will > have mutable sequences, correct? Are we aso going to add immutable > sequences?
Here I am not sure; John had clear opinions about this in the past, as far as I can recall. The "problem" is in the subclass design. Ideally, a mutable sequence object would either be a subclass of the immutable one that adds the editing possibilities, or we would do something like the NSString/NSMutableString class-cluster approach (about which I don't know much, but I believe Charles or Phil can tell us more). Anyway, I would have no problem making the public method return an NSData while the private property is an NSMutableData, also for the immutable sequence. But if there is a way to have the immutable BCSequence hold an NSData property and the mutable version hold an NSMutableData, that would obviously be more elegant...

Cheers,
Alex

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
E-mail: a.griekspoor at nki.nl
AIM: mekentosj at mac.com
Web: http://www.mekentosj.com

EnzymeX - To cut or not to cut
http://www.mekentosj.com/enzymex
*********************************************************
-------------- next part --------------
An HTML attachment was scrubbed...

From mek at mekentosj.com Sun Jul 10 04:18:15 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 10 Jul 2005 10:18:15 +0200
Subject: [Biococoa-dev] NSMutableData vs malloc
In-Reply-To: <6f2e5ef370c1aa7f92a47d6ec3d13f97@earthlink.net>
References: <31285ebbf9fe8bc43e5696d4f5c57da7@earthlink.net> <42CF3C03.2090904@bioinformatics.org> <5B29B50B-7434-4E95-B613-EA8CDC768F3B@mekentosj.com> <6f2e5ef370c1aa7f92a47d6ec3d13f97@earthlink.net>
Message-ID: <5C88331B-8C86-45FE-98AA-145C7FB62B1C@mekentosj.com>

Well, it seems to have been a one-time thing, no clue why. Thanks!
Alex On 10-jul-2005, at 3:12, Koen van der Drift wrote: > Maybe you can see them in the archives : http://bioinformatics.org/ > pipermail/biococoa-dev/2005-July/date.html > > Just received 5 messages from you. > > > - Koen. > > > On Jul 9, 2005, at 1:58 PM, Alexander Griekspoor wrote: > > >> I get certainly emails send from Spam Eater at BiO, what is this? >> And more important, did you guys got my messages in the end? >> Alex >> >> On 9-jul-2005, at 4:52, Spam Eater at BiO wrote: >> >> >>> Caught in filters. >>> >>> >> >> ************************************************************** >> ** Alexander Griekspoor ** >> ************************************************************** >> The Netherlands Cancer Institute >> Department of Tumorbiology (H4) >> Plesmanlaan 121, 1066 CX, Amsterdam >> Tel: + 31 20 - 512 2023 >> Fax: + 31 20 - 512 2029 >> AIM: mekentosj at mac.com >> E-mail: a.griekspoor at nki.nl >> Web: http://www.mekentosj.com >> >> MacOS X: The power of UNIX with the simplicity of the Mac >> >> *************************************************************** >> >> _______________________________________________ >> Biococoa-dev mailing list >> Biococoa-dev at bioinformatics.org >> https://bioinformatics.org/mailman/listinfo/biococoa-dev >> > > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com 4Peaks - For Peaks, Four Peaks. 2004 Winner of the Apple Design Awards Best Mac OS X Student Product http://www.mekentosj.com/4peaks ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
From mek at mekentosj.com Sun Jul 10 04:19:56 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 10 Jul 2005 10:19:56 +0200
Subject: [Biococoa-dev] Caching the NSArray of BCSymbol
In-Reply-To: <17c5e2862bfba9543f27b4a0aa1a8e5b@earthlink.net>
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com> <17c5e2862bfba9543f27b4a0aa1a8e5b@earthlink.net>
Message-ID:

I would say yes, it makes sense to me. But since KVO comes for free automatically, WE are still 10.2 compliant! Only programmers who want to register for changes need to build a 10.3 or higher app...

Alex

On 10-jul-2005, at 3:15, Koen van der Drift wrote: > > On Jul 9, 2005, at 2:36 PM, Alexander Griekspoor wrote: > > >> I strongly suggest to not go for delegates, not for >> notifications, but for KVO, which means we don't have to do >> anything, nice huh! >> > > I especially like the second part :) > > So were going for 10.3 as a minimum requirement? I think this was > brought up earlier this week too, but I forgot which thread. > > - Koen. > > >

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

The requirements said: Windows 2000 or better. So I got a Macintosh.
*********************************************************
-------------- next part --------------
An HTML attachment was scrubbed...

From mek at mekentosj.com Sun Jul 10 04:23:31 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Sun, 10 Jul 2005 10:23:31 +0200
Subject: [Biococoa-dev] KVO vs.
delegate
In-Reply-To: <7F51A234-79F5-4877-B791-2B1C1567A94C@gmail.com>
References: <67C60B96-2C12-4FAE-85E0-76AD74E125B3@mekentosj.com> <42CF3C2C.30800@bioinformatics.org> <6DE8481F-C3D6-4A57-8106-7D51A5637A89@gmail.com> <012FA9DF-58B8-4A88-A084-B3B0867C5A58@gmail.com> <7F51A234-79F5-4877-B791-2B1C1567A94C@gmail.com>
Message-ID:

Yes, it's definitely a pity that they couldn't come up with a kind of standard delegate syntax you could use to catch the KVO notifications, the way KVC also picks up general things like setColor or isExtensible. It would have been nice if the corresponding -colorChanged or -extensibleChanged methods were called upon a change in values. Alas.... I don't like the if if if if if's either..

Alex

On 9-jul-2005, at 21:11, Charles Parnot wrote: > I agree with you Alex, that there won't be too many cases where the > BCSequence needs to notify about what it does. In the case of model > objects, delegates and notifications are good for callbacks. For > instance, IO and networking, when you want to be notified by the > model object (e.g. a file handle, a server, ...) when something > happens, asynchronously. In the case of BCSequence, there is > nothing like that... at least at present. Maybe in the future, we > will have tasks performed asynchronously, for instance alignements > or lengthy calculations. In these cases, delegates and > notifications are good. I am not even sure that would be BCSequence > dealing with this, probably BCTools. > > So I am saying KVO should be enough for BCSequence. > > Now, just as a general discussion, I just want to emphasize the > limitations of KVO, actually more the PITA aspect of it. I am > fighting a lot with the Xgrid APIs (the fight is almost over, and I > think I am winning). The reason is there is no delegate (except in > 2 classes) or notifications. You are supposed to do everything with > KVO.
And you have all sort of asynchronous stuff happening: every > action done on a job, a server, a grid, results,... is sent to the > server, and you have to 'observe' a different key and a different > object for all. The problem is: there is only one callback method, > which is: > - (void)observeValueForKeyPath:(NSString *)keyPath ofObject:(id) > object change:(NSDictionary *)change context:(void *)context > > So all the event end up calling that method, and you have to go > with a bunch of 'if' to test all the possibilities. Also, the > method tends to be called very often depending how the code was > written, and depending on what the bindings and UI is doing, which > you can't control. For instance, I see the array of jobs of a grid > go from empty array, then nil, then empty array, then half full, > then half empty, then all the jobs are here. > > To avoid all the 'if's, you could have a different class for each > event, each with their own code, but that is not feasible when you > have 20 different things happening, because then you need 20 > classes. Conversely, with the delegate design, each event is > dispatched to a different method, and the code is much easier to > write, debug and maintain. The code for the Xgrid classes that have > a delegate was a breeze to write and test. > > This is all to show you where the line between a delegate/ > notification design and a KVO-only design is, and what the user of > the framework will potentially face. > > Of course, the BIG advantage of KVO is bindings. And this is really > very cool. I am using it a lot now, together with Core Data, and it > is really fun. BTW, John, we will have to talk about Core Data at > some point ;-) > > charles > > > On Jul 9, 2005, at 11:36 AM, Alexander Griekspoor wrote: > > >> Wait! >> >> Ok, let me begin with Charles question what I'm talking about. The >> answer is yes, the sequence can have a delegate and still be a >> model yes. 
>> BUT, let me try to convince what I meant with my remark and how we >> can do it in a better way. >> First, the model-view-controller (MVC) design pattern. BCSequence >> is our model class, it contains the data. The idea behind this >> pattern is that the model never has any idea about the view its >> data is presented in, and is thus also not responsible for >> updates, data changes etc. This is all controlled by the >> controller which insulated both from each other and keeps both in >> sync. So if for instance the users edits the sequence in the gui, >> this should be picked up by the controller which makes sure the >> edit is forwarded to the model object. Thus, it would not make >> sense to have the model for instance being a delegate of the view, >> hence bypassing the controller. Also, why should the controller be >> the delegate of the model? The ONLY one changing the model is the >> controller, so if it does it should know already right! >> Now, I admit there are situations that this is not completely >> true. Imagine a alignment controller A which aligns several >> sequences and communicates this via an alignment view. Now the >> same sequences could potentially be altered by a different >> controller B which presents the user with a standard sequence >> editing window. Now of course we would like controller A to redo >> the alignment and update the view. According to the MVC pattern it >> would be controller B's task to initiate the update of A (via >> delegation or notification), so officially no problem and still no >> need for the model to contain notifications/delegations. Still, >> sometimes it would be handy to be able to monitor the changes made >> to a model object if multiple controllers potentially edit the >> same model object. >> Now what, notifications, delegations. 
I don't like the idea of >> sending "public" notifications into the air, and remember if we >> create or edit sequences a lot (which you do with model objects), >> this could create a lot of overhead, and could clog the >> notification system). Now delegates is more direct, but then >> again, you could only have one delegate, a severe limit for >> situations as described above!! In addition, in both case to make >> it work the code (most notable the accessors) will have to be >> polluted with delegation invocations (plus a check whether there >> actually is a delegate, plus a check whether the delegate responds >> to that call) or notification dispatches. Aaaaaahhh!!! >> Luckily there's something better, called Key-value-observing >> (KVO), allowing exactly what you guys are looking for, available >> in 10.3 or higher (now, did we all agree already that 10.3 is our >> lower limit). It's part of the famous "binding" system, and allows >> controllers to register themselves as observers for certain >> properties of the model object and get notifications when these >> changes so they can do their things. Sounds pretty good huh! How >> does it work? We just have to be key-value-coding (KVC) compliant, >> which we are already, hooray! NSObject provides automatic support! >> If we decide that at somepoints we need more control, there are >> additional methods that can be called to trigger observers like >> didChangeValueForKey. I strongly suggest to not go for delegates, >> not for notifications, but for KVO, which means we don't have to >> do anything, nice huh! 
Here's more info on KVO: >> http://developer.apple.com/documentation/Cocoa/Conceptual/ >> KeyValueObserving/KeyValueObserving.html >> >> Cheers, >> Alex >> >> >> >> >> >> On 9-jul-2005, at 8:28, Charles Parnot wrote: >> >> >>> On Jul 8, 2005, at 9:39 PM, Koen van der Drift wrote: >>> >>> >>> >>>> >>>> On Jul 9, 2005, at 12:19 AM, Charles Parnot wrote: >>>> >>>> >>>> >>>> >>>>>> Hmm, although certainly nice to have a link between a view and >>>>>> ultimately the sequence, this is the task of a controller >>>>>> class, not the model... The model shouldn't know/care of the >>>>>> view. It might be nice, if we add controllers to the framework >>>>>> that do this though.. >>>>>> Alex >>>>>> >>>>>> >>>>>> >>>>> >>>>> I am not sure what you both mean. Just to add to the confusion: >>>>> BCSequence can have a delegate and still be a model, no? >>>>> >>>>> >>>>> >>>>> >>>> >>>> Why not let BCSequence post notifications when something has >>>> changed? This way it is ot conected to a view, but and any >>>> object (including a controller or a view) can respond to it. >>>> >>>> >>>> - Koen. >>>> >>>> >>> >>> Yes, the standard in Cocoa seems to have both a delegate and >>> accompanying notifications, which the delegate always get without >>> subscribing. 
>>> >>> charles >>> >>> -- >>> Xgrid-at-Stanford >>> Help science move fast forward: >>> http://cmgm.stanford.edu/~cparnot/xgrid-stanford >>> >>> Charles Parnot >>> charles.parnot at gmail.com >>> >>> >>> >>> _______________________________________________ >>> Biococoa-dev mailing list >>> Biococoa-dev at bioinformatics.org >>> https://bioinformatics.org/mailman/listinfo/biococoa-dev >>> >>> >>> >> >> ********************************************************* >> ** Alexander Griekspoor ** >> ********************************************************* >> The Netherlands Cancer Institute >> Department of Tumorbiology (H4) >> Plesmanlaan 121, 1066 CX, Amsterdam >> Tel: + 31 20 - 512 2023 >> Fax: + 31 20 - 512 2029 >> AIM: mekentosj at mac.com >> E-mail: a.griekspoor at nki.nl >> Web: http://www.mekentosj.com >> >> Microsoft is not the answer, >> Microsoft is the question, >> NO is the answer >> >> ********************************************************* >> >> > > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Windows is a 32-bit patch to a 16-bit shell for an 8-bit operating system, written for a 4-bit processor by a 2- bit company without 1 bit of sense. ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kvddrift at earthlink.net Sun Jul 10 09:57:23 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 09:57:23 -0400 Subject: [Biococoa-dev] Moving on Message-ID: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> Hi, We have discussed quite a few issues last week, and I think we should start thinking about putting this into the project. I will try to summarize some points. 1. Now that we more or less agree on the fact that we will use typed sequences, I want to bring up again the question whether we keep the current structure of sequence classes, or we go back to the regular superclass-subclass structure. If I remember correctly, the main reason to create the pseudo class-cluster was to make it possible that we only had to call BCSequence, instead of all the subclasses. However, as mentioned before, this appeared to be confusing since in some cases the subclasses are still being used. So we either stick to the current structure, and make really sure that we only use BCSequence, or we go back to the old structure. Charles put a lot of effort into creating the BCAbstractSequence/BCSequence classes, so I don't want to throw that out immediately. However, since we agree on using typed sequences, it seems more logical (and less confusing) to use the subclasses. 2. The main sequence holder will be a char array that is private and can only be accessed through an NSData/NSMutableData wrapper. At this point it is still not clear how we will implement the immutable/mutable sequences, but I suggest we just start with the immutable version, and take it from there. 3. I proposed to add a general object, BCStructureObject, that will allow us to add different types of classes, such as atom and residue. If there are no objections I can go ahead and add this as a superclass for BCSymbol and BCAbstractSequence. Let me know what you guys think, and if I left anything out. cheers, - Koen.
From biococoa at bioworxx.com Sat Jul 9 08:20:55 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Sat, 9 Jul 2005 14:20:55 +0200 Subject: [Biococoa-dev] Sequence Structure Message-ID: Hey, I looked around for a solution to our sequence structure problem and found a very interesting design pattern called flyweight. It is similar to what we did with our BCSymbol structure. Here is my suggestion: -------------- next part -------------- A non-text attachment was scrubbed... Name: BCSequenceStructure.jpg Type: image/jpeg Size: 113166 bytes Desc: not available URL: -------------- next part -------------- I think it's a very flexible structure. The handling of sequences should be done by tool classes. That is my suggestion so far, please comment :-) Phil For more info about the flyweight pattern, see: http://www.dofactory.com/Patterns/PatternFlyweight.aspx http://www.codeproject.com/gen/design/testvalidators.asp .... (google: flyweight design pattern ;-)) From kvddrift at earthlink.net Sun Jul 10 16:17:32 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 16:17:32 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: Message-ID: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> On Jul 9, 2005, at 8:20 AM, Philipp Seibel wrote: > Hey, > > I looked around for a solution to our sequence structure problem and > found a very interesting design pattern called flyweight. > It is similar to what we did with our BCSymbol structure. Here is my > suggestion: > > > > I think it's a very flexible structure. The handling of sequences > should be done by tool classes. > > That is my suggestion so far, please comment :-) > Hi Phil, interesting picture. What I don't see now are the BCSequence subclasses and the BCAbstractSequence class. Are those omitted for simplicity or really not present? Also, I suggest we deprecate the use of BCSequenceType. IMO it is redundant because we already are using symbolsets and typed sequences.
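For readers unfamiliar with the flyweight pattern Phil mentions, the following is a minimal, generic sketch in Objective-C (assuming ARC and Foundation). It is not Phil's design, since the attached diagram was scrubbed from the archive. The class `SketchSymbol` and the method `symbolForChar:` are hypothetical names used only for illustration. The idea shown: a sequence stores plain chars, and each char maps to one shared symbol object that carries the heavier per-symbol information.

```objc
#import <Foundation/Foundation.h>

// Hypothetical flyweight class; not part of BioCocoa.
@interface SketchSymbol : NSObject {
    unsigned char symbolChar;   // the one-byte code stored in sequences
    NSString *name;             // shared, per-symbol information
}
+ (SketchSymbol *)symbolForChar:(unsigned char)aChar;
- (unsigned char)symbolChar;
- (NSString *)name;
@end

@implementation SketchSymbol

- (id)initWithChar:(unsigned char)aChar name:(NSString *)aName
{
    self = [super init];
    if (self) {
        symbolChar = aChar;
        name = [aName copy];
    }
    return self;
}

// Factory method: returns the single shared instance for a given char,
// creating it on first request.
+ (SketchSymbol *)symbolForChar:(unsigned char)aChar
{
    static NSMutableDictionary *sharedSymbols = nil;
    @synchronized (self) {
        if (sharedSymbols == nil) {
            sharedSymbols = [[NSMutableDictionary alloc] init];
        }
        NSNumber *key = [NSNumber numberWithUnsignedChar:aChar];
        SketchSymbol *symbol = [sharedSymbols objectForKey:key];
        if (symbol == nil) {
            // Placeholder name; a real table would hold curated data.
            NSString *symbolName = [NSString stringWithFormat:@"symbol %c", aChar];
            symbol = [[SketchSymbol alloc] initWithChar:aChar name:symbolName];
            [sharedSymbols setObject:symbol forKey:key];
        }
        return symbol;
    }
}

- (unsigned char)symbolChar { return symbolChar; }
- (NSString *)name { return name; }

@end
```

With this arrangement a sequence of a million bases needs a million bytes plus a handful of shared symbol objects, rather than a million object pointers.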
I don't understand the BCSymbolAnnotation part of the picture. Is this in addition to, or a replacement for, some of the current classes? cheers, - Koen. PS: what program did you use to make the picture? From jtimmer at bellatlantic.net Sun Jul 10 18:28:57 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Sun, 10 Jul 2005 18:28:57 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <0AB7789D-CBC9-4FC1-B980-18E3C0EB2B54@gmail.com> Message-ID: So, if I'm going to start putting things together in terms of unsigned chars without breaking a lot of stuff, I'm going to have to do a lot at once, including at least creating the NSData ivar. I'll try to get started tonight, and just wanted to clear my plan with everyone: (1) Change unichar methods in BCSymbol classes to use unsigned chars. (2) Change SymbolSet methods to handle unsigned chars. (3) Add an NSMutableData ivar to BCSequence to hold the char array. (4) Modify BCSequence initialization methods to create the char array. (5) Add new methods to BCSequence to start working with the unsigned chars. Since we're going to keep the accessor methods to the Symbol Arrays, most of the tools should cope nicely, but we might look them over briefly at this point to try to see if there are obvious optimizations to be made. Am I missing anything? Would people want me to commit with breakage in order to let them fix the mess I'm making?
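The storage part of this plan could be sketched roughly as follows. This is only an illustration, assuming ARC and Foundation; the class `SketchSequence` and all of its method names are hypothetical and are not the BioCocoa API. It shows an NSMutableData ivar owning the unsigned char array, with one byte per symbol.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class for illustration only; not the BioCocoa API.
@interface SketchSequence : NSObject {
    NSMutableData *sequenceData;   // owns the unsigned char array
}
- (id)initWithString:(NSString *)aString;
- (NSUInteger)length;
- (unsigned char)symbolAtIndex:(NSUInteger)index;
- (void)appendSymbol:(unsigned char)symbol;
- (NSString *)sequenceString;
@end

@implementation SketchSequence

- (id)initWithString:(NSString *)aString
{
    self = [super init];
    if (self) {
        // Lossy ASCII conversion keeps exactly one byte per character.
        NSData *bytes = [aString dataUsingEncoding:NSASCIIStringEncoding
                              allowLossyConversion:YES];
        if (bytes == nil) {
            bytes = [NSData data];
        }
        sequenceData = [[NSMutableData alloc] initWithData:bytes];
    }
    return self;
}

- (NSUInteger)length
{
    return [sequenceData length];
}

- (unsigned char)symbolAtIndex:(NSUInteger)index
{
    NSParameterAssert(index < [sequenceData length]);
    const unsigned char *chars = (const unsigned char *)[sequenceData bytes];
    return chars[index];
}

- (void)appendSymbol:(unsigned char)symbol
{
    // NSMutableData grows the underlying buffer as needed.
    [sequenceData appendBytes:&symbol length:1];
}

- (NSString *)sequenceString
{
    // Returns nil if the buffer contains bytes outside the ASCII range.
    return [[NSString alloc] initWithData:sequenceData
                                 encoding:NSASCIIStringEncoding];
}

@end
```

Tools that need raw speed can read the buffer through `-bytes` directly, while the existing string-based tests can keep going through `sequenceString`.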
JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Sun Jul 10 18:44:03 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 18:44:03 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: On Jul 10, 2005, at 6:28 PM, John Timmer wrote: > Change unichar methods in BCSymbol classes to use unsigned char's > Change SymbolSet methods to handle unsigned char's > Add an NSMutableData ivar to BCSequence hold the char array > Modify BCSequence initialization methods to create the char array > Add new methods to BCSequence to start working with the unsigned chars > I think it should go into BCAbstractSequence, correct? Also why not use const char and NSData, since BCSequence is immutable? Other than that, go for it and we'll clean up the mess :) cheers, - Koen. From kvddrift at earthlink.net Sun Jul 10 19:25:51 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 19:25:51 -0400 Subject: [Biococoa-dev] biopython's project structure Message-ID: Hi, Just for reference, check out sections 11 and 13 on how the main biopython's classes are designed: (Now you know where I got the idea for separating BCSequence and BCSequenceRecord :) cheers, - Koen. From charles.parnot at gmail.com Sun Jul 10 19:40:19 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 16:40:19 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <66D98A24-20FB-43A0-A24A-1688FE85D153@gmail.com> On Jul 10, 2005, at 3:28 PM, John Timmer wrote: > So, if I'm going to start putting things together in terms of unsigned > char's without breaking a lot of stuff, I'm going to have to do a > lot at > once, including at least creating the NSData ivar. 
I'll try to get > started > tonight, and just wanted to clear my plan with everyone: > > Change unichar methods in BCSymbol classes to use unsigned chars > Change SymbolSet methods to handle unsigned chars > Add an NSMutableData ivar to BCSequence to hold the char array > Modify BCSequence initialization methods to create the char array > Add new methods to BCSequence to start working with the unsigned chars You don't have to modify the code in BCSequence, as it will be dropped. I suppose you meant BCAbstractSequence ;-) Just FYI: when I changed the code to simplify the init methods, I had to make a choice for a designated initializer. Of course, at the time, I chose the init with the symbol array... I suppose this could still be the designated initializer for the time being, and you could change code there. This is the ONLY place where code needs to be changed to support the char array in the init methods, as all the other init methods depend on it (and in fact, there are just a few lines to change). Then all the tests for the init should still work, if you did a good job ;-) Let me know if the comments I put in the init methods helped. At some point (soon), we should then change the designated initializer. And maybe have that be a new initializer '...withData:', and remove the symbol array initializer. All the tests should still work and should not require changes, as they are all based on NSString. Then adding the other methods won't break anything as nothing exists yet... Thanks, John!
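The designated-initializer arrangement described here could look roughly like the sketch below. It assumes ARC and Foundation, and it jumps ahead to the '...withData:' variant mentioned at the end of the message. The class `SketchTypedSequence` and its initializer names are hypothetical; the actual BioCocoa initializers are not shown in this thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class; illustrates the designated-initializer pattern only.
@interface SketchTypedSequence : NSObject {
    NSData *sequenceData;
}
- (id)initWithData:(NSData *)aData;        // designated initializer
- (id)initWithString:(NSString *)aString;  // convenience initializer
- (NSData *)sequenceData;
@end

@implementation SketchTypedSequence

// All storage setup lives here, so a change of storage format
// touches only this one method.
- (id)initWithData:(NSData *)aData
{
    self = [super init];
    if (self) {
        // -copy yields an immutable snapshot even if aData is mutable.
        sequenceData = (aData != nil) ? [aData copy] : [[NSData alloc] init];
    }
    return self;
}

// Every other initializer funnels through the designated one.
- (id)init
{
    return [self initWithData:[NSData data]];
}

- (id)initWithString:(NSString *)aString
{
    NSData *bytes = [aString dataUsingEncoding:NSASCIIStringEncoding
                          allowLossyConversion:YES];
    return [self initWithData:bytes];
}

- (NSData *)sequenceData
{
    return sequenceData;
}

@end
```

Because the string-based initializer only converts and forwards, tests written against NSString input keep working when the designated initializer changes.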
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Sun Jul 10 19:41:46 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 16:41:46 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: On Jul 10, 2005, at 3:44 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 6:28 PM, John Timmer wrote: > > >> Change unichar methods in BCSymbol classes to use unsigned char's >> Change SymbolSet methods to handle unsigned char's >> Add an NSMutableData ivar to BCSequence hold the char array >> Modify BCSequence initialization methods to create the char array >> Add new methods to BCSequence to start working with the unsigned >> chars >> >> > > I think it should go into BCAbstractSequence, correct? Also why > not use const char and NSData, since BCSequence is immutable? > > Other than that, go for it and we'll clean up the mess :) > > cheers, > > - Koen. I know you mentioned it before, but why are you saying that BCSequence is immutable? So far, it has always been mutable, and if we are going to have just one option, it should be the mutable one. I believe many of the others agree? 
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From biococoa at bioworxx.com Sun Jul 10 19:42:52 2005 From: biococoa at bioworxx.com (Philipp Seibel) Date: Mon, 11 Jul 2005 01:42:52 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> Message-ID: Hi Koen, Am 10.07.2005 um 22:17 schrieb Koen van der Drift: > > On Jul 9, 2005, at 8:20 AM, Philipp Seibel wrote: > > >> Hey, >> >> i looked arround for a solution of our sequence structure problem >> and found a very interesting design pattern called flyweight. >> It is something similar we did with our BCSymbol structure. Here >> is my suggestion: >> >> >> >> I think it's a very flexible structure. The handling with >> sequences, should to be done by tool-classes. >> >> So far my suggestion, please comment :-) >> >> > > Hi Phil, interesting picture. What I don't see now are the > BCSequence subclasses and the BCAbstractSequence class. Are those > omitted for simplicity or really not present? Also I suggest to > deprectate the use of BCSequenceType. IMO it is redundant because > we already are using symbolsets and typed sequences. I suggest to use one single class to represent a sequence -> BCSequence. (BCAbstractSequence and others are really not present :-)). But it seems everybody except from me likes the oversized ( just my opinion ;-) ) inheritance model. > I don't understand the BCSymbolAnnotation part of the picture. Is > this additional or replacing some of the current classes? > This is to replace BCSymbol and BCSymbolSet classes, following the Flyweight-Pattern. It's hard to explain, perhaps you look at the links i sent. > cheers, > > - Koen. > > > ps what program did you use to make the picture? 
OmniGraffle 3.x cheers, Phil -------------- next part -------------- An HTML attachment was scrubbed... URL: From kvddrift at earthlink.net Sun Jul 10 19:49:04 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 19:49:04 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: On Jul 10, 2005, at 7:41 PM, Charles Parnot wrote: > I know you mentioned it before, but why are you saying that BCSequence > is immutable? Otherwise it would have been called BCMutableSequence? :) No, but seriously, my understanding was that the mutable sequence is going to be a subclass of the immutable sequence. And since we are still setting up the base classes it was my understanding that a sequence is immutable by default, because often you want to make sure that you are not changing your data during an operation. cheers, - Koen. From kvddrift at earthlink.net Sun Jul 10 19:53:40 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 19:53:40 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> Message-ID: <94c346646b270e036d3262af545f1a93@earthlink.net> On Jul 10, 2005, at 7:42 PM, Philipp Seibel wrote: > I suggest to use one single class to represent a sequence -> > BCSequence. (BCAbstractSequence and others are really not present > :-)). > But it seems everybody except me likes the oversized ( just my > opinion ;-) ) inheritance model. I am not in favor of them either, just check the archives for some nice discussions :) > >> I don't understand the BCSymbolAnnotation part of the picture. Is >> this additional or replacing some of the current classes? >> > > This is to replace BCSymbol and BCSymbolSet classes, following the > Flyweight-Pattern. It's hard to explain, perhaps you look at the links > i sent. > I would really strongly suggest not to drop the BCSymbolSet classes.
They define the type of sequence we are dealing with and also function as a data filter to make sure that only the correct symbols are added to a certain sequence. That being said, why do you put the word 'annotation' in these classes? I think that's what is confusing me, especially if you meant them to be a replacement for BCSymbol. cheers, - Koen. From charles.parnot at gmail.com Sun Jul 10 19:54:57 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 16:54:57 -0700 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> Message-ID: <4E55C74D-D56A-4593-94EB-635803709A0A@gmail.com> > >> I don't understand the BCSymbolAnnotation part of the picture. Is >> this additional or replacing some of the current classes? >> > > This is to replace BCSymbol and BCSymbolSet classes, following the > Flyweight-Pattern. It's hard to explain, perhaps you look at the > links i sent. It seems that BCSymbolSets are more flexible than BCSymbolAnnotation, as there is only one BCSymbolAnnotation per sequence type, based on the factory method you show in your graph. Is that right? charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Sun Jul 10 19:59:58 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 16:59:58 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <3C42C9E7-BC57-4501-82A5-38EBC95C74BD@gmail.com> On Jul 10, 2005, at 4:49 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 7:41 PM, Charles Parnot wrote: > > >> I know you mentioned it before, but why are you saying that >> BCSequence is immutable? >> > > Otherwise it would have been called BCMutableSequence? :) > > No, but seriously, my understanding was that the mutable sequence is > going to be a subclass of the immutable sequence.
And since we are > still setting up the base classes it was my understanding that a > sequence is immutable by default, because often you want to make > sure that you are not chaning your data during a operation. > > cheers, > > - Koen. I am completely and absolutely with you. But I am just being pragmatic right now, and I know most people on this list want mutable sequences now... One thing we could do is call all of these classes BCMutableSequence, BCMutableDNASequence,... and not use the name BCDNASequence,... But I think we will have mutable/immutable sequences real soon now. I want to work on that!! I feel exactly like you, that we need these immutable sequences to ensure better performance. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From jtimmer at bellatlantic.net Sun Jul 10 20:06:06 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Sun, 10 Jul 2005 20:06:06 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <3C42C9E7-BC57-4501-82A5-38EBC95C74BD@gmail.com> Message-ID: > I am completely and absolutely with you. But I am just being > pragmatic right now, and I know most people on this list want mutable > sequences now... One thing we could do is call all of these classes > BCMutableSequence, BCMutableDNASequence,... and not use the name > BCDNASequence,... But I think we will have mutable/immutable > sequences real soon now. I want to work on that!! I feel exactly like > you, that we need these immutable sequences to ensure better > performance. I agree with this in principle, but I'd like to do so in a way that the subclasses we've got in place work for both mutable and immutable sequences. I can't figure out how to possibly wrench the ObjC runtime into place so that happens, but I'm hoping one of you guys can come up with something. 
JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Sun Jul 10 20:06:39 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 20:06:39 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <3C42C9E7-BC57-4501-82A5-38EBC95C74BD@gmail.com> References: <3C42C9E7-BC57-4501-82A5-38EBC95C74BD@gmail.com> Message-ID: <880c12a723aa222290459f76c50349c3@earthlink.net> On Jul 10, 2005, at 7:59 PM, Charles Parnot wrote: > I am completely and absolutely with you. But I am just being pragmatic > right now, and I know most people on this list want mutable sequences > now... One thing we could do is call all of these classes > BCMutableSequence, BCMutableDNASequence,... and not use the name > BCDNASequence,... But I think we will have mutable/immutable sequences > real soon now. I want to work on that!! I feel exactly like you, that > we need these immutable sequences to ensure better performance. Of course we should have mutable sequences, but just not by default. And remember, because we are using typed sequences, we need to maintain at least 8 classes, instead of just 2. - Koen. From charles.parnot at gmail.com Sun Jul 10 20:11:01 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:11:01 -0700 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com> References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net> <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com> Message-ID: On Jul 10, 2005, at 1:16 AM, Alexander Griekspoor wrote: > ...Anyway, I would have no problem to make the public method > return an NSData but the private property be an NSMutableData, also > for the immutable sequence.... 
I think we should not do that. This is very dangerous. In most cases, it will go unnoticed, but in certain configurations, this is the path to very weird bugs in programs using the framework. Let me give you an example where that would be bad. The user implements a very rudimentary undo, simply by saving the contents of the sequence at each manipulation (let's say she does not know about the NSUndoManager!!). So every time the user changes something: NSData *currentSequence = [mySequence data]; [myMemoryStack setObject:currentSequence forKey:[NSDate date]]; and then later wants to go back to the sequence 2 hours ago: NSData *previousSequence = [myMemoryStack objectForKey:twoHoursAgoDate]; oldSequence = [BCSequence sequenceWithData:previousSequence]; Well, the data retrieved then will actually be the data corresponding to the sequence as of right now, because the stored pointer was to the actual data object in the sequence, which has been mutated since... You should only return a mutable object in place of an immutable one when you just created it on the fly, do not share it with any other object, and you are (auto)releasing it anyway and are not going to modify it. Of course, for performance, you could still return the mutable data object, and do a lazy copy of the data if needed, later, when: * the data is asked for by another object * the data is being modified (so you need some sort of flag for that). > But if there is a way to have the immutable bcsequence have an > NSData property and the mutable version have an NSMutableData then > that would be more elegant obviously... > Cheers, > Alex You can declare an ivar as NSData, but make it an NSMutableData anyway. With the proper casts, the compiler won't complain, and the runtime won't care... As long as you don't call the wrong methods on the wrong type!! Another option to avoid the casts is to type the ivar as an 'id'. But then you don't get any compiler checking, of course,...
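The aliasing problem in the undo example can be reproduced in a few lines. The sketch below assumes ARC and Foundation; `SketchMutableSequence`, `unsafeData`, and the other method names are hypothetical and exist only to contrast the two accessor styles discussed here.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class contrasting an aliasing accessor with a copying one.
@interface SketchMutableSequence : NSObject {
    NSMutableData *storage;
}
- (id)initWithString:(NSString *)aString;
- (NSData *)unsafeData;   // hands out the internal mutable object
- (NSData *)data;         // hands out an immutable snapshot
- (void)appendString:(NSString *)aString;
@end

@implementation SketchMutableSequence

- (id)initWithString:(NSString *)aString
{
    self = [super init];
    if (self) {
        storage = [[NSMutableData alloc] init];
        [self appendString:aString];
    }
    return self;
}

// Typed as NSData, but the caller really holds our NSMutableData:
// later edits to the sequence show through the returned pointer.
- (NSData *)unsafeData
{
    return storage;
}

// -copy on NSMutableData returns an independent immutable NSData.
- (NSData *)data
{
    return [storage copy];
}

- (void)appendString:(NSString *)aString
{
    NSData *bytes = [aString dataUsingEncoding:NSASCIIStringEncoding
                          allowLossyConversion:YES];
    if (bytes != nil) {
        [storage appendData:bytes];
    }
}

@end
```

If a caller stores the result of `unsafeData` and the sequence is then extended with `appendString:`, the stored object reports the new length; the object returned by `data` before the edit still holds the original bytes, which is what an undo stack needs.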
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sun Jul 10 20:21:32 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 20:21:32 -0400 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net> <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com> Message-ID: On Jul 10, 2005, at 8:11 PM, Charles Parnot wrote: > >> But if there is a way to have the immutable bcsequence have an NSData >> property and the mutable version have an NSMutableData then that >> would be more elegant obviously... >> Cheers, >> Alex > > You can set an ivar to be NSData, but make it an NSMutableData anyway. > With the proper casts, the compiler won't tell anything, and the > runtime won't bother... As long as you don't call the wrong methods on > the wrong type!! That seems very dangerous too! I prefer Alex's suggestion, by using NSData for the immutable sequence and NSMutableData for the mutable one. - Koen. From charles.parnot at gmail.com Sun Jul 10 20:21:41 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:21:41 -0700 Subject: [Biococoa-dev] Moving on In-Reply-To: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> Message-ID: <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> Sorry I should have replied to this earlier... The pace of the discussions is getting crazy... On Jul 10, 2005, at 6:57 AM, Koen van der Drift wrote: > Hi, > > We have discussed quite some issues last week, and I think we > should start thinking about putting this into the project.
I will > try to summarize some points. > > 1. Now that we more or less agree on the fact that we will use > typed sequence, I want to bring up again the question whether we > keep the current structure of sequence classes, or we go back to > the regular superclass-subclass structure. If I remember correctly, > the main reason to create the pseudo class-cluster was to make it > possible that we only had to call BCSequence, instead of all the > subclasses. However, as mentioned before, this appeared to be > confusing since in some cases the subclasses are still being used. > > So we either stick to the current structure, and make really sure > that we only use BCSequence, or we go back to the old structure. > Charles put a lot of effort in creating the BCAbstractSequence/ > BCSequence classes, so I don't want to throw that out immediately. > However, since we agree on using typed sequences, it seems more > logical (and less confusing) to use the subclasses. Don't worry! I will repeat what I said before: there is very little code for the placeholder trick to work. Most of the code in BCSequence is to provide the automatic guess of the sequence type, which we will reuse in the superclass anyway. What I spent more time on was cleaning up the init methods, making them simpler and more consistent, and defining one designated initializer, so that any modification that affects all classes is in just one place. We will see what John has to say about that after he is done!! Anyway, like I said in another email, we will have to define a different designated initializer. > 2. The main sequenceholder will be a char array that is private and > can only be accessed through a NSData/NSMutableData wrapper. At > this point it is still not clear how we will implement the > immutable/mutable sequences, but I suggest to just start with the > immutable version, and take it from there. Sorry I missed that earlier. We will have to find something... > > 3.
I proposed to add a general object, BCStructureObject that will > allow to add different types of classes, such as atom and residue. > If there are no objections I can go ahead an add this as a > superclass for BCSymbol and BCAbstractSequence. How about BCMolecule instead. Well, I guess having BCAtom a subclass of BCMolecule is a bit weird... Never mind :-) Anyway, can you explain again the rationale to have a superclass for both BCSymbol and BCSequence? I know you already did explain it but I can't find the email. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Sun Jul 10 20:26:53 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:26:53 -0700 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net> <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com> Message-ID: <8C01EF67-F0F9-4736-9F26-34C905A6AB5E@gmail.com> On Jul 10, 2005, at 5:21 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 8:11 PM, Charles Parnot wrote: > > >> >> >>> But if there is a way to have the immutable bcsequence have an >>> NSData property and the mutable version have an NSMutableData >>> then that would be more elegant obviously... >>> Cheers, >>> Alex >>> >> >> You can set an ivar to be NSData, but make it an NSMutableData >> anyway. With the proper casts, the compiler won't tell anything, >> and the runtime won't bother... As long as you don't call the >> wrong methods on the wrong type!! >> > > That's seems very dangerous too! I prefer Alex's suggestion, by > using NSData for the immutable sequence and NSMutableData for the > mutable one. > > - Koen. 
Yes, but then you have two different ivars. Which means all the methods you write for the immutable one cannot be used for the mutable one (for instance, the init methods). Or at least, they can't use the ivar directly, but only an accessor that returns either the mutable or the immutable ivar, cast as NSData. Which could work fine too, so maybe this is the solution. Do you see what I mean? I am not sure what I wrote is clear. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Sun Jul 10 20:35:47 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:35:47 -0700 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <94c346646b270e036d3262af545f1a93@earthlink.net> References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> <94c346646b270e036d3262af545f1a93@earthlink.net> Message-ID: On Jul 10, 2005, at 4:53 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 7:42 PM, Philipp Seibel wrote: > > >> I suggest to use one single class to represent a sequence -> >> BCSequence. (BCAbstractSequence and others are really not >> present :-)). >> But it seems everybody except me likes the oversized ( just >> my opinion ;-) ) inheritance model. >> > > I am not in favor of them either, just check the archives for some > nice discussions :) Phil, I am also more in favor of a single public class, but not for the same reason: for a simpler interface. However, I would not mind some subclasses hidden behind a class cluster design (I don't know if people knew about that idea? ;-)... which means I am OK with the inheritance model. In any case, there is already, and will continue to be, a strong request for typed sequences, mostly for compile-time checking. And also, there is a strong willingness to choose between the two structures: either one-class-do-it-all or a tree of typed classes.
Up until now, we had both structures in parallel, which was a way of not choosing. The consensus is now to choose, as everybody seems confused by the current design... well, except me ;-) And if we are going to choose, we should listen to those people asking for typed classes. And I can already see the first FAQ if we did not have typed sequences: "Q1: Where is the DNA sequence class? A: Uh... There ain't any. Q2: WTF?" So the bottom line is: let's go with the typed classes :-) charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Sun Jul 10 20:40:25 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:40:25 -0700 Subject: [Biococoa-dev] Moving on In-Reply-To: <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> Message-ID: <51482415-4200-45E2-ABA2-88CF72DB2A88@gmail.com> On Jul 10, 2005, at 5:35 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 8:21 PM, Charles Parnot wrote: > > >> Anyway, can you explain again the rationale to have a superclass >> for both BCSymbol and BCSequence? I know you already did explain >> it but I can't find the email. >> >> > > See here: > > http://bioinformatics.org/pipermail/biococoa-dev/2005-June/001355.html > > I have reattached the UML files too. > > cheers, > > - Koen. Thanks! Yes, this is what I remembered. What I did not understand in your email today is why BCStructureObject was also the superclass for sequence. It makes sense to have atoms, residues, symbol, functionalGroup,... all be in the same class tree. But the sequence does not seem to belong there. Is it just because of the name ivar?
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sun Jul 10 20:41:21 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 20:41:21 -0400 Subject: [Biococoa-dev] Moving on In-Reply-To: <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> Message-ID: <693e320cdd5653802651750a91ad7d54@earthlink.net> On Jul 10, 2005, at 8:21 PM, Charles Parnot wrote: > Don't worry! I will repeat what I said before: there is very little > code for the placeholder trick to work. Most of the code in BCSequence > is to provide the automatic guess of the sequence type, which we will > reuse in the superclas anyway. > > What I spent more time in was to clean the init methods, make it > simpler and more consistent, and define one designated initializer, so > that any modifications that affect all classes is in just one place. > We will see what John has to say about that after he is done!! Anyway, > like I said in another email, we will have to define a different > dsignated initializer. > If you can get that to work, that would be great. In that case it might be not so difficult to add a mutable/immutable version too. - Koen. From kvddrift at earthlink.net Sun Jul 10 20:43:15 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 20:43:15 -0400 Subject: [Biococoa-dev] Moving on In-Reply-To: <51482415-4200-45E2-ABA2-88CF72DB2A88@gmail.com> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> <51482415-4200-45E2-ABA2-88CF72DB2A88@gmail.com> Message-ID: <1d99d8a39ff466e53c973ebcbe3b3ffe@earthlink.net> On Jul 10, 2005, at 8:40 PM, Charles Parnot wrote: > But the sequence does not seem to belong there. 
Is it just because of > the name ivar? > For now yes. But also mass and other physical properties probably. cheers, - Koen. From charles.parnot at gmail.com Sun Jul 10 20:46:50 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 17:46:50 -0700 Subject: [Biococoa-dev] Moving on In-Reply-To: <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> Message-ID: Sorry, just though of something: how about BCMoiety instead of BCStructuralObject? http://www.foresight.org/Nanosystems/glossary/glossary_m.html can an atom be considered a moiety?? In any case, I would prefer BCStructuralElement, instead of 'object', as the word 'Object' has a strong meaning in OOO... Sorry for being so picky charles On Jul 10, 2005, at 5:35 PM, Koen van der Drift wrote: > > On Jul 10, 2005, at 8:21 PM, Charles Parnot wrote: > > >> Anyway, can you explain again the rationale to have a superclass >> for both BCSymbol and BCSequence? I know you already did explain >> it but I can't find the email. >> >> > > See here: > > http://bioinformatics.org/pipermail/biococoa-dev/2005-June/001355.html > > I have reattached the UML files too. > > cheers, > > - Koen. > > > > > > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sun Jul 10 20:49:09 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 20:49:09 -0400 Subject: [Biococoa-dev] Moving on In-Reply-To: References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> Message-ID: <6af8a3c0e1dca921199097418d438078@earthlink.net> On Jul 10, 2005, at 8:46 PM, Charles Parnot wrote: > can an atom be considered a moiety?? I don't think so. 
>
> In any case, I would prefer BCStructuralElement, instead of 'object',
> as the word 'Object' has a strong meaning in OOP... Sorry for being so
> picky
>

Good suggestion, I'll change that in the project.

cheers,

- Koen.

From charles.parnot at gmail.com Sun Jul 10 20:50:48 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Sun, 10 Jul 2005 17:50:48 -0700
Subject: [Biococoa-dev] Moving on
In-Reply-To: <1d99d8a39ff466e53c973ebcbe3b3ffe@earthlink.net>
References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> <51482415-4200-45E2-ABA2-88CF72DB2A88@gmail.com> <1d99d8a39ff466e53c973ebcbe3b3ffe@earthlink.net>
Message-ID: <0D68C988-6CC4-43C4-A83F-69A06FD46687@gmail.com>

On Jul 10, 2005, at 5:43 PM, Koen van der Drift wrote:

>
> On Jul 10, 2005, at 8:40 PM, Charles Parnot wrote:
>
>> But the sequence does not seem to belong there. Is it just because
>> of the name ivar?
>>
>
> For now yes. But also mass and other physical properties probably.
>
> cheers,
>
> - Koen.

OK, so you really meant it ;-)

I really think this is too much!! Grouping atoms, residues and symbols under the same superclass makes sense. But the BCSequence object is really a collection of such things, so it is really weird to have that under the same class tree. Also, I agree that it has the same physical properties, as it is a molecule too. But for some reason, it seems you have some kind of mass spec bias. I don't know where that comes from ;-)

I think having the sequence objects in the same class hierarchy would be confusing and overkill. What do the others think?
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sun Jul 10 21:18:15 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 10 Jul 2005 21:18:15 -0400 Subject: [Biococoa-dev] Moving on In-Reply-To: <0D68C988-6CC4-43C4-A83F-69A06FD46687@gmail.com> References: <7b78026584f740f1ff385d4cdf0f5728@earthlink.net> <47CBBC00-A727-40C8-9771-317DBE3599B0@gmail.com> <7cdaa7159d361c89a0ac517b1c985fe6@earthlink.net> <51482415-4200-45E2-ABA2-88CF72DB2A88@gmail.com> <1d99d8a39ff466e53c973ebcbe3b3ffe@earthlink.net> <0D68C988-6CC4-43C4-A83F-69A06FD46687@gmail.com> Message-ID: On Jul 10, 2005, at 8:50 PM, Charles Parnot wrote: > I really think this is too much!! Grouping atoms, residues and symbols > under a same superclass makes sense. But the BCSequence object is > really a collection of such things, so it is really weird to have that > under the same class tree. Also, I agree that it has the same physical > properties, as it is a moelcule too. But for some reason, it seems you > have some kind of mass spec bias. I don't know where that comes from > ;-) > > I think haveing the sequence objects in the same class hierarchy would > be confusing and overkill. What do the others think? It's all under the hood, and has no influence on the project. Compare it to NSObject which is the superclass of a wide variety of classes, that have nothing in common with each other. I have removed the original class (BCStructuralObject) and will wait with adding BCStructuralElement until we get some more opinions. - Koen. 
From jtimmer at bellatlantic.net Sun Jul 10 21:18:21 2005
From: jtimmer at bellatlantic.net (John Timmer)
Date: Sun, 10 Jul 2005 21:18:21 -0400
Subject: [Biococoa-dev] Moving on
In-Reply-To: <693e320cdd5653802651750a91ad7d54@earthlink.net>
Message-ID:

>> What I spent more time in was to clean the init methods, make it
>> simpler and more consistent, and define one designated initializer, so
>> that any modifications that affect all classes are in just one place.
>> We will see what John has to say about that after he is done!! Anyway,
>> like I said in another email, we will have to define a different
>> designated initializer.
>>
>
> If you can get that to work, that would be great. In that case it
> might be not so difficult to add a mutable/immutable version too.

Well, I'll tell you what I have to say before I'm done ;). I was thinking about this on the way home, and I like the idea of two different initializers, say initMutableSequence and initSequence. We could have our internal data ivar be untyped, and simply make it mutable data or immutable data in the init method, and set an isMutable BOOL ivar. All the work gets done in the init method, and we don't have to add any new classes.

Does this make sense, or is this dangerous in some other way?

JT

PS - incidentally, how are we handling Xcode 2.1 and the .xcodeproj file format?

_______________________________________________
This mind intentionally left blank

From kvddrift at earthlink.net Sun Jul 10 21:26:32 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Sun, 10 Jul 2005 21:26:32 -0400
Subject: [Biococoa-dev] Moving on
In-Reply-To:
References:
Message-ID: <75052dccbfdb8c3ec923c7005488cf7d@earthlink.net>

On Jul 10, 2005, at 9:18 PM, John Timmer wrote:

> PS - incidentally, how are we handling Xcode 2.1 and the .xcodeproj
> file format?

Unless someone sends me a copy of Tiger and a DVD player, I am all for it :D

- Koen.
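John's proposal above (two initializers, an untyped data ivar, and an isMutable flag set once at init time) could be sketched roughly as follows. This is only an illustration under stated assumptions, not BioCocoa code: the class name SketchSequence, the "WithData:" variants of the initializers, the private initWithData:isMutable: funnel and the appendData: mutator are all hypothetical, and the sketch assumes ARC so that no retain/release bookkeeping is shown.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSequence; every name here is illustrative only.
@interface SketchSequence : NSObject
{
    id   sequenceData;   // untyped: NSData or NSMutableData, decided at init time
    BOOL isMutable;      // set once in the initializer, never changed afterwards
}
- (id)initSequenceWithData:(NSData *)data;
- (id)initMutableSequenceWithData:(NSData *)data;
- (BOOL)isMutable;
- (NSData *)data;
- (void)appendData:(NSData *)moreData;
@end

@implementation SketchSequence

// Single funnel point: both public initializers go through this method,
// so a change to initialization only has to be made in one place.
- (id)initWithData:(NSData *)data isMutable:(BOOL)flag
{
    self = [super init];
    if (self != nil) {
        isMutable = flag;
        sequenceData = flag ? [data mutableCopy] : [data copy];
    }
    return self;
}

- (id)initSequenceWithData:(NSData *)data
{
    return [self initWithData:data isMutable:NO];
}

- (id)initMutableSequenceWithData:(NSData *)data
{
    return [self initWithData:data isMutable:YES];
}

- (BOOL)isMutable
{
    return isMutable;
}

// Hand out a snapshot so callers never share the internal buffer.
- (NSData *)data
{
    return [NSData dataWithData:sequenceData];
}

// Mutators check the flag first and raise if the object is immutable.
- (void)appendData:(NSData *)moreData
{
    if (!isMutable) {
        [NSException raise:NSInternalInconsistencyException
                    format:@"attempt to mutate an immutable sequence"];
    }
    [(NSMutableData *)sequenceData appendData:moreData];
}

@end
```

The trade-off with this design is that mutability is only checked at runtime: the compiler cannot warn when appendData: is sent to an instance created with initSequenceWithData:.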
From charles.parnot at gmail.com Mon Jul 11 02:24:21 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 10 Jul 2005 23:24:21 -0700 Subject: [Biococoa-dev] Mutable classes implementation Message-ID: On Jul 10, 2005, at 6:18 PM, John Timmer wrote: > ...... I was thinking > about this one the way home, and I like the idea of two different > initializers, say initMutableSequence and initSequence. We could > have our > internal data ivar be untyped, and simply make it mutable data or > immutable > data in the init method, and set an isMutable BOOL ivar. All the > work gets > done in the init method, and we don't have to add any new classes. > > Does this make sense, or is this dangerous in some other way? > > JT > > PS - incidentally, how are we handling Xcode 2.1 and the .xcodeproj > file > format? > I completely agree with the internals. This is also how I would do it, and I think this is how NSArray was implemented. However, I would handle the public interface differently and model it after the Cocoa classes: * the init method can be the same; what determines the mutability is the calling class; polymorphism is good! * you need a different class to have compiler checking, and warn you when you call a mutability method on an immutable class... well, this the same argument as the -complement method for BCDNASequence and BCProteinSequence ;-) I was interrupted while writing this email, and I was thinking about it more while doing other stuff... Now I am back with some implementation ideas. The problem with having separate mutable classes is how to not duplicate code when you have two types of objects, and you can't have multiple inheritance (mutability and sequence type)... Here is how I would implement it * All the mutability methods are implemented in the superclass BCSequence (assuming we rename BCSequenceAbstract to BCSequence, which we should probably do, I suppose), like '-appendSequence:', '- removeSequenceAtRange:',... 
Maybe some of the mutability methods need to be implemented at the level of the subclasses, for instance if we want to implement a '-digestWithKlenow' (this is just an example). This is fine, the implementation can be in BCDNASequence. However, all of these methods should start by checking the value of the isMutable ivar, and throw an exception if the object is not mutable. Also, these methods should not be declared in the headers of BCSequence, BCDNASequence,... because they are illegal for these classes. At this point, these methods are useless. They are not public, and if you call them, your program crashes!

* Now, of course, how do we get mutable objects?? First the public side of it.
- We declare 5 more classes: BCMutableSequence (inherits from BCSequence) and then BCMutableProteinSequence, BCMutableNucleotideSequence, BCMutableDNASequence and BCMutableRNASequence (all 4 inheriting from BCMutableSequence)
- The header for BCMutableSequence declares the methods '-appendSequence:', '-removeSequenceAtRange:'
- The header for BCMutableDNASequence declares '-complement', '-digestWithKlenow',... and the same thing for the other subclasses if any specific mutability methods exist
- So, now, the user and the compiler think they have some mutable classes and they know about the methods that can be called
- The methods declared in the superclass, like '-initWithString:' or '-reverse', also appear valid to the user and the compiler and can officially be called for mutable sequences; in particular, '-initWithString:' is supposed to return a mutable instance
- All methods specific for some of the sequence types, like '-complement' and '-hydrophobicity', are also declared in the headers of the mutable classes, so the user and the compiler can officially use them

* So far, it looks like we have to write all these declared methods again for all these classes. But what happens internally at runtime??? Well, you should know me by now. All the mutable classes are in fact...
placeholder classes!! - The first important trick is: the designated initializer for the root class BCSequence has to include a 'isMutable:(BOOL)flag' argument (the method should probably be private, or at least not documented). This way, the init methods in the placeholder class can call it with isMutable:YES - Then the init methods are the only methods actually implemented in the BCMutableXXX classes, and they return instances of the BCXXXSequence. Of course, these instances have their isMutable flag set to YES, which is done by calling the designated initializer with mutable:YES. After being initialized, at runtime, these objects then behave as mutable, and can run the mutability methods without throwing exceptions. And then all the methods only need to be implemented in the BCXXXSequence subclasses, and not in the BCMutableXXX classes, which are just alive between alloc and init. This is a generalized version of the NSArray/NSMutableArray Apple's design. The init methods from both class always return instances of NSCFArray, with different values for the isMutable flags. It seems to be popular now to attach some OmniGraffle document, so here it is... It was too big for the limit on the mailing list, so I put it on the web: http://cmgm.stanford.edu/~cparnot/temp/mutable-sequences.png OK, bed time... have a nice week... charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From philipp.seibel at byteworxx.de Mon Jul 11 02:54:49 2005 From: philipp.seibel at byteworxx.de (Philipp Seibel) Date: Mon, 11 Jul 2005 08:54:49 +0200 Subject: Fwd: [Biococoa-dev] Sequence Structure References: <5666FEAC-E89E-4373-AA6F-16E1B3C722D6@bioworxx.com> Message-ID: <57BF3DC8-C896-4022-B2B3-C8B6AB0083C9@byteworxx.de> Anfang der weitergeleiteten E-Mail: > > Am 11.07.2005 um 01:54 schrieb Charles Parnot: > > >>> >>> >>> >>>> I don't understand the BCSymbolAnnotation part of the picture. 
>>>> Is this additional or replacing some of the current classes? >>>> >>>> >>>> >>> >>> This is to replace BCSymbol and BCSymbolSet classes, following >>> the Flyweight-Pattern. It's hard to explain, perhaps you look at >>> the links i sent. >>> >>> >> >> It seems that BCSymbolSet are more flexible that >> BCSymbolAnnotation, as there is only one BCSymbolAnnotation per >> sequence type, based on the factory method you show in your graph. >> Is that right? >> > > :-) i don't think so, because the BCSymbolAnnotation provides as > much functionality as the current BCSymbol and BCSymbolSet classes > and allows the user to add more info in the property dictionary for > symbols. Espacially this isn't possible with the BCSymbol class, > where currently all properties get accessor methods. > The flyweight pattern is just a suggestion, because it's build > exactly for our problem. > > cheers, > > Phil > > From mek at mekentosj.com Mon Jul 11 04:41:32 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 11 Jul 2005 10:41:32 +0200 Subject: [Biococoa-dev] BCSequenceRecord In-Reply-To: References: <9ad571457e70e344ae2f9ab15c73247b@earthlink.net> <0629BCF6-4E51-4795-A24B-77363E915461@gmail.com> <086b453787e54ee033705ff6b92ce9ce@earthlink.net> <371E9B22-4F0F-467C-9C4B-B0D49C850D88@mekentosj.com> <2ad1189d58d8b8279c4d9bbd02cd9533@earthlink.net> <3423218C-1AED-46E6-B83F-BCBAA0271F49@mekentosj.com> Message-ID: <06408EB4-2C89-4D35-A759-E146B1D2D983@mekentosj.com> I totally miss the point, my proposal was to have an header like this for the immutable class: NSMutableData *seqdata; - (NSData *)data; the implementation: - (NSData *)data { return seqdata; } I don't see how the underlying data could change because there are no public methods to edit the data. The compiler will beep when you do try to edit the mutabledata object outside the sequence object because the method tells you it's immutable (which it isn't in reality of course). 
The mutable class overrides the -data method to:

- (NSMutableData *)data
{
    return seqdata;
}

But perhaps there are way better approaches. I think our biggest problem is that we would like a kind of 2D inheritance, both in the direction of mutable vs immutable, and in the direction of DNA, RNA, Protein, etc. What's the best approach to do that? This would be absolutely the biggest argument in favor of the single-sequence-does-it-all method (untyped sequence, typed by the symbolset).
Cheers,
Alex

On 11-jul-2005, at 2:11, Charles Parnot wrote:

>
> On Jul 10, 2005, at 1:16 AM, Alexander Griekspoor wrote:
>
>> ...Anyway, I would have no problem to make the public method
>> return an NSData but the private property be an NSMutableData,
>> also for the immutable sequence....
>>
>
> I think we should not do that. This is very dangerous. In most
> cases, it will go unnoticed, but in certain configurations, this is
> the path to very weird bugs in programs using the framework.
>
> Let me give you an example where that would be bad. The user
> implements a very rudimentary undo, simply by saving the contents
> of the sequence at each manipulation (let's say she does not know
> about the NSUndoManager!!).
>
> So every time the user changes something:
>     NSData *currentSequence = [mySequence data];
>     [myMemoryStack setObject:currentSequence forKey:[NSDate date]];
>
>
> and then later wants to go back to the sequence 2 hours ago:
>     NSData *previousSequence = [myMemoryStack
>         objectForKey:twoHoursAgoDate];
>     oldSequence = [BCSequence sequenceWithData:previousSequence];
>
> Well, the data retrieved then will actually be the data
> corresponding to the sequence as of right now, because the pointer
> was actually to the actual data object in the sequence, which has
> been mutated since...
>
> You only return a mutable object instead of a non-mutable one when you
> just created it on the fly, do not share it with any other object,
> and you are (auto)releasing it anyway and are not going to modify it.
>
> Of course, for performance, you could still return the mutable
> data object, and do a lazy copy of the data if needed, later, when:
>     * the data is asked by another object
>     * the data is being modified
> (so you need some sort of flag for that).
>
>
>> But if there is a way to have the immutable bcsequence have an
>> NSData property and the mutable version have an NSMutableData then
>> that would be more elegant obviously...
>> Cheers,
>> Alex
>>
>
> You can set an ivar to be NSData, but make it an NSMutableData
> anyway. With the proper casts, the compiler won't complain,
> and the runtime won't bother... As long as you don't call the wrong
> methods on the wrong type!!
>
> Another option to avoid the casts is to type the ivar as an 'id'.
> But then you don't get any compiler checking, of course,...
>
> charles
>
> --
> Xgrid-at-Stanford
> Help science move fast forward:
> http://cmgm.stanford.edu/~cparnot/xgrid-stanford
>
> Charles Parnot
> charles.parnot at gmail.com
>

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

4Peaks - For Peaks, Four Peaks.
2004 Winner of the Apple Design Awards
Best Mac OS X Student Product
http://www.mekentosj.com/4peaks
*********************************************************
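The aliasing problem Charles describes in the message quoted above can be reproduced in a few lines. The sketch below is only an illustration: LeakySequence, dataSnapshot and LengthOfSavedEntry are hypothetical names, the seqdata ivar and the -data accessor follow Alex's description, and ARC is assumed. An "undo" entry saved through -data changes when the sequence is edited later, while an entry saved through a copy does not.

```objc
#import <Foundation/Foundation.h>

// Hypothetical minimal sequence whose -data accessor hands out its internal buffer.
@interface LeakySequence : NSObject
{
    NSMutableData *seqdata;
}
- (id)initWithData:(NSData *)data;
- (NSData *)data;          // returns the internal mutable buffer, typed as NSData
- (NSData *)dataSnapshot;  // returns an independent immutable copy
- (void)appendData:(NSData *)more;
@end

@implementation LeakySequence

- (id)initWithData:(NSData *)data
{
    self = [super init];
    if (self != nil) {
        seqdata = [data mutableCopy];
    }
    return self;
}

- (NSData *)data
{
    return seqdata;
}

- (NSData *)dataSnapshot
{
    return [NSData dataWithData:seqdata];
}

- (void)appendData:(NSData *)more
{
    [seqdata appendData:more];
}

@end

// Saves the current contents in a rudimentary "undo" dictionary, edits the
// sequence, then reports the length of the saved entry as seen afterwards.
static NSUInteger LengthOfSavedEntry(BOOL useSnapshot)
{
    NSData *fourBytes = [@"ACGT" dataUsingEncoding:NSASCIIStringEncoding];
    LeakySequence *seq = [[LeakySequence alloc] initWithData:fourBytes];
    NSMutableDictionary *undoStack = [NSMutableDictionary dictionary];

    // NSMutableDictionary retains its values; it does not copy them.
    NSData *saved = useSnapshot ? [seq dataSnapshot] : [seq data];
    [undoStack setObject:saved forKey:@"before edit"];

    [seq appendData:fourBytes];   // the sequence now holds 8 bytes

    return [[undoStack objectForKey:@"before edit"] length];
}
```

With the aliased accessor the saved entry grows along with the sequence; with the snapshot it stays at the original four bytes, at the cost of one copy per call.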
From mek at mekentosj.com Mon Jul 11 04:58:23 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Mon, 11 Jul 2005 10:58:23 +0200
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To:
References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> <94c346646b270e036d3262af545f1a93@earthlink.net>
Message-ID: <1D878B40-EF2E-4A3D-842A-4AB4973D5598@mekentosj.com>

> And if we are going to choose, we should listen to those people
> asking for typed classes.

Well, now I heard Koen (of course ;-) and Phil, but also Charles, and myself already say that we like the idea of a single sequence. I'm in the camp of Charles, mainly for the simpler interface and to circumvent the double inheritance problem. I'm strongly in favor of "typing" the sequence by symbolsets to check for errors and having tests based on this property to return proper values for each method...

Maybe I'm wrong, but it must be possible to do all the checks we need during runtime based on the symbolset associated with a sequence. In pseudocode, given

    BCSequence *seq;

Say you want a translation but you only want to call it on a protein, then in the code you could:
- first ask the [seq symbolset]
- and only if it is equal to a protein, call the method, otherwise don't.
You can even do menu item validation based on the symbolset of the current sequence, easily!

Now return values: say you have the complement method, which only makes sense on DNA. Then the method would be, in pseudocode:

    - (BCSequence *)complement
    {
        BCSymbolSet *symbolset = [self symbolset];
        if (symbolset == dnasymbolset) {
            // do the work
            return freshSequence;   // a fresh BCSequence
        } else {
            return nil;
        }
    }

> Phil, I am also more in favor of a single public class, but not for
> the same reason: for a simpler interface.
and > So the bottom line is: let's go with the typed classes :-) Now, that is some serious form of schizophrenia ;-) I would say, unless John can't find himself in the "typing by symbolset" idea, the bottom line should be: let's go with a single bcsequence (or two actually, a mutable and immutable form). Switching to the single sequence system will require a more radical rewrite though I guess... Finally the FAQ: > And I can already see the first FAQ if we did not have typed sequence: > "Q1: Where is the DNA sequence class? > A: Eu... There ain't any. Wrong, a DNA sequence class is a BCSequence with the BCSymbolSetDNA associated to it. Cheers, Alex On 11-jul-2005, at 2:35, Charles Parnot wrote: > > On Jul 10, 2005, at 4:53 PM, Koen van der Drift wrote: > > >> >> On Jul 10, 2005, at 7:42 PM, Philipp Seibel wrote: >> >> >> >>> I suggest to use one single class to represent a sequence -> >>> BCSequence. (BCAbstractSequence and others are really not >>> present :-)). >>> But it seems everybody except from me likes the oversized ( just >>> my opinion ;-) ) inheritance model. >>> >>> >> >> I am not in favor of them too, just check the archives for some >> nice discussions :) >> > > Phil, I am also more in favor of a single public class, but not for > the same reason: for a simpler interface. However, I would not mind > some subclasses hidden behind a class cluster design (I don't know > if people knew about that idea? ;-)... which means I am OK with the > inheritance model. > > In any case, there will and there is already a strong request for > typed sequences, mostly for compile-time checking. And also, there > is a strong willingness to choose between the two structures: > either one-class-do-it-all or a tree of typed classes. Up until > now, we had both structures in parallel, which was a way of not > choosing. The consensus is now to choose, as everybody seem > confused by the current design... 
well, except me ;-) > > And if we are going to choose, we should listen to those people > asking for typed classes. And I can already see the first FAQ if we > did not have typed sequence: > "Q1: Where is the DNA sequence class? > A: Eu... There ain't any. > Q2: WTF?" > > So the bottom line is: let's go with the typed classes :-) > > charles > > > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Microsoft is not the answer, Microsoft is the question, NO is the answer ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From a.griekspoor at nki.nl Mon Jul 11 04:59:55 2005 From: a.griekspoor at nki.nl (Alexander Griekspoor) Date: Mon, 11 Jul 2005 10:59:55 +0200 Subject: [Biococoa-dev] Moving on References: Message-ID: I like BCStructuralElement, it certainly is less phague than moiety... Alex On 11-jul-2005, at 2:46, Charles Parnot wrote: > Sorry, just though of something: > how about BCMoiety instead of BCStructuralObject? > > http://www.foresight.org/Nanosystems/glossary/glossary_m.html > > can an atom be considered a moiety?? > > In any case, I would prefer BCStructuralElement, instead of > 'object', as the word 'Object' has a strong meaning in OOO... 
Sorry > for being so picky > > charles > > > On Jul 10, 2005, at 5:35 PM, Koen van der Drift wrote: > > >> >> On Jul 10, 2005, at 8:21 PM, Charles Parnot wrote: >> >> >> >>> Anyway, can you explain again the rationale to have a superclass >>> for both BCSymbol and BCSequence? I know you already did explain >>> it but I can't find the email. >>> >>> >>> >> >> See here: >> >> http://bioinformatics.org/pipermail/biococoa-dev/2005-June/ >> 001355.html >> >> I have reattached the UML files too. >> >> cheers, >> >> - Koen. >> >> >> >> >> >> >> > > -- > Xgrid-at-Stanford > Help science move fast forward: > http://cmgm.stanford.edu/~cparnot/xgrid-stanford > > Charles Parnot > charles.parnot at gmail.com > > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ************************************************************** ** Alexander Griekspoor ** ************************************************************** The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com MacOS X: The power of UNIX with the simplicity of the Mac *************************************************************** ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com LabAssistant - Get your life organized! http://www.mekentosj.com/labassistant ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kvddrift at earthlink.net Mon Jul 11 07:51:49 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 11 Jul 2005 07:51:49 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <1D878B40-EF2E-4A3D-842A-4AB4973D5598@mekentosj.com> References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> <94c346646b270e036d3262af545f1a93@earthlink.net> <1D878B40-EF2E-4A3D-842A-4AB4973D5598@mekentosj.com> Message-ID: On Jul 11, 2005, at 4:58 AM, Alexander Griekspoor wrote: > let's go with a single bcsequence (or two actually, a mutable and > immutable form).? Hear, hear! This day is starting pretty good :) - Koen. From kvddrift at earthlink.net Mon Jul 11 07:53:45 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 11 Jul 2005 07:53:45 -0400 Subject: [Biococoa-dev] Moving on In-Reply-To: References: Message-ID: On Jul 11, 2005, at 4:59 AM, Alexander Griekspoor wrote: > > > I like BCStructuralElement, it certainly is less phague than moiety... phague ?? Have you been too much in the lab, recently? ;-) - Koen. From jtimmer at bellatlantic.net Mon Jul 11 08:44:57 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Mon, 11 Jul 2005 08:44:57 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: Message-ID: > > On Jul 11, 2005, at 4:58 AM, Alexander Griekspoor wrote: > >> let's go with a single bcsequence (or two actually, a mutable and >> immutable form).? > > Hear, hear! > > This day is starting pretty good :) Well, it's starting very poorly for me. If I'm the only one arguing the alternative, then I the majority should rule. Unfortunately, that makes the resulting framework a poor choice for my work/coding style, so I don't expect I will use it for much more than sequence format translations and perhaps alignments. Since I won't be using the main classes, I doubt I'll contribute much to their coding in the future. 
The big question I have is this: several of the classes as they now exist would provide a very good foundation for some of the things I would like to do in the future. I like a lot of the code there, since I wrote it and then had the rest of you point out where I made mistakes, so it's probably some of the best code I've been involved with ;). I'd like to use it in future work, so I'd like to hang on to the classes. I would also expect that I'll be modifying these classes for my own uses in the future. I do not want to create a complete LGPL fork. Is there any good way of handling this? I can't even just decide to start from scratch, since I'd probably re-write a lot of it in exactly the same way. Any advice on how to handle this would be greatly appreciated. JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Mon Jul 11 09:25:51 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 11 Jul 2005 09:25:51 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: Message-ID: <0fc3315e2cedbdb717493879a9e15fab@earthlink.net> On Jul 11, 2005, at 8:44 AM, John Timmer wrote: > Well, it's starting very poorly for me. If I'm the only one arguing > the > alternative, then I the majority should rule. Unfortunately, that > makes the > resulting framework a poor choice for my work/coding style, so I don't > expect I will use it for much more than sequence format translations > and > perhaps alignments. Since I won't be using the main classes, I doubt > I'll > contribute much to their coding in the future. Well, that's not good news at all :( We should really try to find a solution for this situation. You have contributed quite some code and it would be a shame to see you leave because of this. I guess in your own project that uses BioCocoa you could always create your own subclasses of BCSequence to suit your needs, but that's a half baked solution, I admit. 
My understanding is that the main reason you prefer the typed sequences, is that you can avoid sending the wrong type of sequence to the wrong operation, is that correct? Because this is unavoidable for untyped sequences, we should do our best to find a solution for this. Alex's suggestion, using the symbolset, seems a good step forward. On the other hand, untyped sequences make implementing the immutable/mutable classes much easier. John, I am sure I speak for the others too that I would hate to see you leave, so again, we should try all our efforts to come to a good way out of this. cheers, - Koen. From mek at mekentosj.com Mon Jul 11 10:42:09 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 11 Jul 2005 16:42:09 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <0fc3315e2cedbdb717493879a9e15fab@earthlink.net> References: <0fc3315e2cedbdb717493879a9e15fab@earthlink.net> Message-ID: > John, I am sure I speak for the others too that I would hate to see > you leave, so again, we should try all our efforts to come to a > good way out of this. Yes, absolutely!! It would be really a pity to see you leave John. You have made absolutely a major contribution to the project, and it would be a shame to see a person leave with your experience and skills. I hope you decide to take the gamble, perhaps wait a bit till we can either see whether or not we are heading in the right direction, and jump back in. Already having the opinions and critical eye of someone who is sceptical about our approach would be very helpful! Again, I have been always in favour of first of all only 1 approach, either typed or untyped bcsequences, and I have always been in favour of typed sequences. I still would like to see only one approach, but recently I feel more and more attracted to the untyped sequences approach for several reasons: - a uniform and simple interface for new developers (and ourselves). 
Trying (and failing) to explain our approach even to myself at WWDC told me we were doing something wrong.
- we already have a way of typing, in my opinion: the symbolsets. I'm convinced (or at least hope) that they can give us enough typing and error checking, also at runtime (through explicit tests of our own).
- it reduces the number of subclasses to only 2 (mutable vs immutable), and solves a problem I haven't seen a decent alternative for: the 2D-inheritance problem.

Perhaps I was most convinced by the appearance of more and more OmniGraffle schemes. I couldn't understand a single one instantly: way too complex, way too many arrows, code duplication all over, methods spread all over, difficult initializers, tricks, etc. For me they made only one thing really clear: not the structure we should pick, but the structure we should not pick.

Again, it would be a real pity to see you leave. If you want to use some of the classes as they are now in your own projects, I would say go ahead; we'll work the license out later. We can always release the current "snapshot" under a BSD license. But I hope your needs are not immediate and that you are willing to hang around and see how things develop. I hope it will change your mind later, and that the framework can be of great use. In the meantime your advice would be indispensable!

Cheers,
Alex

On 11-jul-2005, at 15:25, Koen van der Drift wrote:

> On Jul 11, 2005, at 8:44 AM, John Timmer wrote:
>
>> Well, it's starting very poorly for me. If I'm the only one
>> arguing the
>> alternative, then I the majority should rule. Unfortunately, that
>> makes the
>> resulting framework a poor choice for my work/coding style, so I
>> don't
>> expect I will use it for much more than sequence format
>> translations and
>> perhaps alignments. Since I won't be using the main classes, I
>> doubt I'll
>> contribute much to their coding in the future.
>> > > Well, that's not good news at all :( > > We should really try to find a solution for this situation. You > have contributed quite some code and it would be a shame to see you > leave because of this. I guess in your own project that uses > BioCocoa you could always create your own subclasses of BCSequence > to suit your needs, but that's a half baked solution, I admit. > > My understanding is that the main reason you prefer the typed > sequences, is that you can avoid sending the wrong type of sequence > to the wrong operation, is that correct? Because this is > unavoidable for untyped sequences, we should do our best to find a > solution for this. Alex's suggestion, using the symbolset, seems a > good step forward. On the other hand, untyped sequences make > implementing the immutable/mutable classes much easier. > > John, I am sure I speak for the others too that I would hate to see > you leave, so again, we should try all our efforts to come to a > good way out of this. > > cheers, > > - Koen. > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Microsoft is not the answer, Microsoft is the question, NO is the answer ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jtimmer at bellatlantic.net Mon Jul 11 11:04:24 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Mon, 11 Jul 2005 11:04:24 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <0fc3315e2cedbdb717493879a9e15fab@earthlink.net> Message-ID: > My understanding is that the main reason you prefer the typed > sequences, is that you can avoid sending the wrong type of sequence to > the wrong operation, is that correct? Because this is unavoidable for > untyped sequences, we should do our best to find a solution for this. > Alex's suggestion, using the symbolset, seems a good step forward. On > the other hand, untyped sequences make implementing the > immutable/mutable classes much easier. Well, it's not just that. Untyped sequences make the sequences stupid - they don't know what they are, they don't know what operations they can perform, and they have to have defined responses to cope with requests for operations they can't perform. Essential methods get scattered in a ton of other classes (mostly tools, but there's going to be a lot of tools in the end) - you need to call through to a separate class just to figure out what type of sequence you have, and something as simple as complementing a nucleotide sequence requires the creation of a new object and may call through about 4 different methods there (which is what I hate about BioJava, as we've discussed). And yes, I hate having to test the sequence type every time I think about doing anything with the sequence. If I create a nucleotide sequence, I want it to act like one. > John, I am sure I speak for the others too that I would hate to see you > leave, so again, we should try all our efforts to come to a good way > out of this. I seem to be doing my best thinking on the subway these days - on the commute in, I thought about how to possibly handle this, and here's a potential solution: We do create a lightweight, high performance sequence object that's untyped. 
Basically, it acts as a specialized NSArray for sequences. The tools focus on working with this object, since they will be performing the processor intensive operations, and this is designed for performance. I rework the existing sequence subclasses to be holders for this. Convenience calls through to the tools put a "smart" interface on the otherwise stupid sequence object. This is not ideal, as it creates a lot more call-throughs to another class. That's not such a problem, though, as most of those call-throughs would have gone to NSArray or tool classes in the current structure anyway. It also creates design decisions - when a file is read, does it create a sequence or a typed sequence holder? Should we create methods to do both? What about annotated sequences - should they hold one or both types of sequence objects? Fortunately, the option of creating the appropriate type of sequence object on the fly should let us keep both around, as needed. Regardless, in the end, I'm not threatening to no longer contribute - my focus would just change based on which things I would find most useful. The sequence classes would no longer be useful, but I'm certain there are still things in the framework that I'd like to use in future projects. Cheers, JT _______________________________________________ This mind intentionally left blank From mek at mekentosj.com Mon Jul 11 11:57:47 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 11 Jul 2005 17:57:47 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: Message-ID: <465F2842-8D91-4B2F-AAB4-228080D8A182@mekentosj.com> On 11-jul-2005, at 17:04, John Timmer wrote: >> [snip] > > Well, it's not just that. Untyped sequences make the sequences > stupid - > they don't know what they are, they don't know what operations they > can > perform, and they have to have defined responses to cope with > requests for > operations they can't perform. 
They DO know who they are, their symbolset tells them, and they adapt their behaviour based on this. > Essential methods get scattered in a ton of other classes (mostly > tools, but there's going to be a lot of tools in the > end) - you need to call through to a separate class just to figure > out what > type of sequence you have, and something as simple as complementing a > nucleotide sequence requires the creation of a new object and may call > through about 4 different methods there (which is what I hate about > BioJava, > as we've discussed). And yes, I hate having to test the sequence > type every > time I think about doing anything with the sequence. Well, you don't have to because we do that inside the bcsequence object, you can call complement on a sequence and it will do the check if given the assigned symbolset it would make sense to return a value, nil or throw an exception. you don't have to do anything. > If I create a nucleotide sequence, I want it to act like one. It will because you have created a nucleotide sequence the moment you assign a nucleotide symbolset to the sequence. and it will behave as one. > >> John, I am sure I speak for the others too that I would hate to >> see you >> leave, so again, we should try all our efforts to come to a good way >> out of this. >> > > I seem to be doing my best thinking on the subway these days - on the > commute in, I thought about how to possibly handle this, and here's a > potential solution: > > We do create a lightweight, high performance sequence object that's > untyped. > Basically, it acts as a specialized NSArray for sequences. The > tools focus > on working with this object, since they will be performing the > processor > intensive operations, and this is designed for performance. I > rework the > existing sequence subclasses to be holders for this. Convenience > calls > through to the tools put a "smart" interface on the otherwise stupid > sequence object. 
> > This is not ideal, as it creates a lot more call-throughs to > another class. > That's not such a problem, though, as most of those call-throughs > would have > gone to NSArray or tool classes in the current structure anyway. > It also > creates design decisions - when a file is read, does it create a > sequence or > a typed sequence holder? Should we create methods to do both? > What about > annotated sequences - should they hold one or both types of sequence > objects? Fortunately, the option of creating the appropriate type of > sequence object on the fly should let us keep both around, as needed. This is an interesting solution, indeed a kind of custom NSArray for sequences would make sense, and would be a solution, but as you said this will certainly require a tool for everything right? And how does it handle the mutable vs immutable PLUS different subclasses for protein, dna etc? > > > Regardless, in the end, I'm not threatening to no longer contribute > - my > focus would just change based on which things I would find most > useful. The > sequence classes would no longer be useful, but I'm certain there > are still > things in the framework that I'd like to use in future projects. That's great news John, thanks! Cheers, Alex > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 E-mail: a.griekspoor at nki.nl AIM: mekentosj at mac.com Web: http://www.mekentosj.com EnzymeX - To cut or not to cut http://www.mekentosj.com/enzymex ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From charles.parnot at gmail.com Mon Jul 11 12:17:27 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Mon, 11 Jul 2005 09:17:27 -0700
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <1D878B40-EF2E-4A3D-842A-4AB4973D5598@mekentosj.com>
References: <56b8190e2668249b23d0889ae82d16a2@earthlink.net> <94c346646b270e036d3262af545f1a93@earthlink.net> <1D878B40-EF2E-4A3D-842A-4AB4973D5598@mekentosj.com>
Message-ID:

On Jul 11, 2005, at 1:58 AM, Alexander Griekspoor wrote:

>
>> Phil, I am also more in favor of a single public class, but not
>> for the same reason: for a simpler interface.
> and
>> So the bottom line is: let's go with the typed classes :-)
> Now, that is some serious form of schizophrenia ;-)

Well, yes, I guess compromise is a mild form of schizophrenia ;-)

> I would say, unless John can't find himself in the "typing by
> symbolset" idea, the bottom line should be: let's go with a single
> bcsequence (or two actually, a mutable and immutable form).
> Switching to the single sequence system will require a more radical
> rewrite though I guess...

I am not the only schizophrenic on this list :-)

charles

--
Xgrid-at-Stanford
Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford
Charles Parnot
charles.parnot at gmail.com

From charles.parnot at gmail.com Mon Jul 11 12:22:40 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Mon, 11 Jul 2005 09:22:40 -0700
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <465F2842-8D91-4B2F-AAB4-228080D8A182@mekentosj.com>
References: <465F2842-8D91-4B2F-AAB4-228080D8A182@mekentosj.com>
Message-ID: <333BF587-B616-412F-90CD-A6687057ED3C@gmail.com>

On Jul 11, 2005, at 8:57 AM, Alexander Griekspoor wrote:

>>
>> And yes, I hate having to test the sequence type every
>> time I think about doing anything with the sequence.
> Well, you don't have to because we do that inside the bcsequence > object, you can call complement on a sequence and it will do the > check if given the assigned symbolset it would make sense to return > a value, nil or throw an exception. you don't have to do anything. Just for the record: I don't think we should throw an exception, that would kill the whole design. charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Mon Jul 11 14:08:05 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 11 Jul 2005 11:08:05 -0700 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: Message-ID: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> > I seem to be doing my best thinking on the subway these days - on the > commute in, I thought about how to possibly handle this, and here's a > potential solution: > > We do create a lightweight, high performance sequence object that's > untyped. > Basically, it acts as a specialized NSArray for sequences. The > tools focus > on working with this object, since they will be performing the > processor > intensive operations, and this is designed for performance. I > rework the > existing sequence subclasses to be holders for this. Convenience > calls > through to the tools put a "smart" interface on the otherwise stupid > sequence object. > > This is not ideal, as it creates a lot more call-throughs to > another class. > That's not such a problem, though, as most of those call-throughs > would have > gone to NSArray or tool classes in the current structure anyway. > It also > creates design decisions - when a file is read, does it create a > sequence or > a typed sequence holder? Should we create methods to do both? > What about > annotated sequences - should they hold one or both types of sequence > objects? 
Fortunately, the option of creating the appropriate type of
> sequence object on the fly should let us keep both around, as needed.

Lately, the consensus was that we should not have both untyped and typed sequence classes at the same time, because it is confusing for the user and even for the developers of the framework. I personally don't think it would be that confusing if things are clearly explained and/or exposed at different levels. For instance, typed sequences could be for "the experts". Kind of like CFArray and NSArray (which, BTW, are toll-free bridged).

Then there are different ways to implement this. The structure that we have now is one. What you are proposing is another, and might be easier to understand, at least from the BioCocoa developer's point of view. The important thing is that the two worlds (typed and untyped) are separate from the user and compiler perspective, BUT avoid code duplication in the implementation. This is a hard challenge. The only way to do it is indeed either to wrap one of the objects inside the other, like you propose, or to use the placeholder trick I set up in the current design. In other words, one of the objects is the "real" one, the "master" implementation, and the other is just using it and putting a fake interface in front of it. So in the end, the public interface looks like there are two different kinds of objects. But internally, there is really only one, so that any change in the implementation of the 'master' object is automatically picked up by the other one.

To come back to your proposition, it is the mirror image of the current implementation. Currently, the typed classes are the "real" objects, while BCSequence is just a placeholder that internally generates these typed objects. You are proposing the opposite: the one-for-all BCSequence class is where the implementation lives, and the typed classes would just be wrappers around instances of it. I do think the concept would be easier to understand than the way it is now.
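To make the wrapper variant concrete, here is a minimal sketch of what "typed classes as wrappers around the one-for-all class" could look like. It is not BioCocoa code: the class names SketchSequence and SketchDNASequence, the NSString storage, and the method names are all made up for illustration, and the code assumes ARC and Objective-C properties, which postdate this thread.

```objc
#import <Foundation/Foundation.h>

// Hypothetical untyped core: the "master" implementation lives here.
@interface SketchSequence : NSObject
@property (nonatomic, copy, readonly) NSString *symbols;
- (instancetype)initWithSymbols:(NSString *)symbols;
- (SketchSequence *)complement;
@end

@implementation SketchSequence
- (instancetype)initWithSymbols:(NSString *)symbols {
    if ((self = [super init])) {
        _symbols = [symbols copy];
    }
    return self;
}

// Unambiguous DNA complement; any other character passes through unchanged.
- (SketchSequence *)complement {
    NSMutableString *out = [NSMutableString stringWithCapacity:self.symbols.length];
    for (NSUInteger i = 0; i < self.symbols.length; i++) {
        unichar c = [self.symbols characterAtIndex:i];
        switch (c) {
            case 'A': c = 'T'; break;
            case 'T': c = 'A'; break;
            case 'C': c = 'G'; break;
            case 'G': c = 'C'; break;
            default: break;
        }
        [out appendFormat:@"%C", c];
    }
    return [[SketchSequence alloc] initWithSymbols:out];
}
@end

// Hypothetical typed wrapper. It is deliberately NOT a subclass of the core,
// so the compiler only accepts the methods declared here.
@interface SketchDNASequence : NSObject
- (instancetype)initWithSymbols:(NSString *)symbols;
- (NSString *)symbols;
- (SketchDNASequence *)complement;
@end

@implementation SketchDNASequence {
    SketchSequence *_core;  // the wrapped untyped object
}
- (instancetype)initWithSymbols:(NSString *)symbols {
    if ((self = [super init])) {
        _core = [[SketchSequence alloc] initWithSymbols:symbols];
    }
    return self;
}
- (NSString *)symbols {
    return _core.symbols;
}
// One hand-written call-through per method of the core.
- (SketchDNASequence *)complement {
    return [[SketchDNASequence alloc] initWithSymbols:[_core complement].symbols];
}
@end
```

A protein wrapper in the same style would simply not declare a complement method, so a misdirected call would be flagged at compile time. The price is one call-through method per operation of the core class.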
Maybe we could work it out to be like an "extension" not included in the BCFoundation header, but provided as an additional header (the binary could still be part of BCFoundation, so the user would not have to link against an additional framework, but simply #import an additional header, only for the compiler's benefit). This header would declare the following classes: BCTypedSequence (root, inherits from NSObject), BCDNASequence, BCRNASequence,... It is important that these classes are not subclasses of BCSequence, because type-specific methods such as '-complement' are declared in the BCSequence header and would be recognized as valid for all subclasses. And you don't want the compiler to think that BCProteinSequence can respond to that message.

This would work with the wrapper design you propose. I see 2 problems with the wrapper design, though:
* you add an additional layer to the call stack, which you mention; in most cases, it should be OK and won't have much effect on performance; but it is still there
* more problematic is that for every method of BCSequence, such as '-complement', '-reverse', '-subsequence',... you need to write a method for the wrapper that calls the BCSequence method. This is a lot of code. One way around it is to use the -forward trick, but that adds a lot of overhead and may not be that easy to set up (we could certainly consider it, though).

Rather than a wrapper, I propose we use the placeholder trick ;-) All you have to write are the init methods, and return a BCSequence object from these. All the code can be in the superclass 'BCTypedSequence', and the subclasses BCDNASequence,... are just empty shells, only there for the headers (actually, they might just need a trivial '-sequenceType' method that the superclass can call to do the right init). So, the instance returned by the init methods would in fact be a BCSequence object, ready to respond to all the methods implemented there.
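The placeholder idea can be sketched roughly as follows. Everything here is hypothetical (the Sketch... class names, the typeName string, the category used to hold header-only declarations), and the code assumes ARC, which takes care of releasing the discarded placeholder; with manual retain/release the init method would have to release self explicitly.

```objc
#import <Foundation/Foundation.h>

// Hypothetical one-for-all class holding the real implementation.
@interface SketchSequence : NSObject
@property (nonatomic, copy, readonly) NSString *symbols;
@property (nonatomic, copy, readonly) NSString *typeName;
- (instancetype)initWithSymbols:(NSString *)symbols typeName:(NSString *)typeName;
- (NSString *)reversedSymbols;
@end

@implementation SketchSequence
- (instancetype)initWithSymbols:(NSString *)symbols typeName:(NSString *)typeName {
    if ((self = [super init])) {
        _symbols = [symbols copy];
        _typeName = [typeName copy];
    }
    return self;
}
- (NSString *)reversedSymbols {
    NSMutableString *out = [NSMutableString string];
    for (NSUInteger i = self.symbols.length; i > 0; i--) {
        [out appendFormat:@"%C", [self.symbols characterAtIndex:i - 1]];
    }
    return out;
}
@end

// Typed root: its init discards the freshly allocated placeholder and
// returns a SketchSequence instead.
@interface SketchTypedSequence : NSObject
- (id)initWithSymbols:(NSString *)symbols;
- (NSString *)sequenceTypeName;  // overridden by the empty-shell subclasses
@end

@implementation SketchTypedSequence
- (NSString *)sequenceTypeName {
    return @"unknown";
}
- (id)initWithSymbols:(NSString *)symbols {
    // An init method may legally return an object other than self.
    id real = [[SketchSequence alloc] initWithSymbols:symbols
                                             typeName:[self sequenceTypeName]];
    return real;
}
@end

// Empty shell: only says which type to create.
@interface SketchDNASequence : SketchTypedSequence
@end

@implementation SketchDNASequence
- (NSString *)sequenceTypeName {
    return @"DNA";
}
@end

// Header-only declarations, never implemented on this class. They exist so
// the compiler knows which messages a "DNA sequence" accepts; at runtime the
// receiver is really a SketchSequence.
@interface SketchDNASequence (DeclaredForCompiler)
- (NSString *)symbols;
- (NSString *)typeName;
- (NSString *)reversedSymbols;
@end
```

In this sketch a variable typed SketchDNASequence * actually points at a SketchSequence, so a type-specific method would be declared only in the category of the subclass that should accept it, and nothing needs to be forwarded by hand.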
And it would respond to any method we add in the future without additional code (we would just have to keep the header in sync).

Of course, one thing you can't do this way is to throw an exception when you call the wrong method on the wrong type of sequence, like calling '-complement' on a protein. But you get a compiler warning, which is the most important part. If you ignore it, you only get what you deserve if your app behaves weirdly!!

Let me add the mandatory OmniGraffle thingie:
http://cmgm.stanford.edu/~cparnot/temp/typed-sequences.png

What do you think?

charles

--
Xgrid-at-Stanford
Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford
Charles Parnot
charles.parnot at gmail.com

From kvddrift at earthlink.net Mon Jul 11 15:42:53 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 11 Jul 2005 15:42:53 -0400
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <465F2842-8D91-4B2F-AAB4-228080D8A182@mekentosj.com>
References: <465F2842-8D91-4B2F-AAB4-228080D8A182@mekentosj.com>
Message-ID: <2a67546c3d1a479af795ba0e4be9b0f2@earthlink.net>

On Jul 11, 2005, at 11:57 AM, Alexander Griekspoor wrote:

> On 11-jul-2005, at 17:04, John Timmer wrote:
>
>>> [snip]
>>
>> Well, it's not just that. Untyped sequences make the sequences stupid
>> -
>> they don't know what they are, they don't know what operations they
>> can
>> perform, and they have to have defined responses to cope with
>> requests for
>> operations they can't perform.
> They DO know who they are, their symbolset tells them, and they adapt
> their behaviour based on this.

Even better, the symbolsets specify what type of DNA you are dealing with, ambiguous/unambiguous, or even a user-defined type of DNA.
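As an illustration of "typing by symbolset", here is a small sketch in which a single sequence class consults its symbol set to decide whether complementing makes sense, and returns nil (rather than raising) when it does not. SketchSymbolSet and SketchSymbolSequence are invented names and do not reflect the real BCSymbolSet API; the code assumes ARC and modern Objective-C literals.

```objc
#import <Foundation/Foundation.h>

// Hypothetical symbol set: a name plus an optional complement table.
// A nil table means "complement is not defined for this kind of sequence".
@interface SketchSymbolSet : NSObject
@property (nonatomic, copy, readonly) NSString *name;
@property (nonatomic, copy, readonly) NSDictionary<NSString *, NSString *> *complements;
+ (instancetype)dnaSymbolSet;
+ (instancetype)proteinSymbolSet;
@end

@implementation SketchSymbolSet
- (instancetype)initWithName:(NSString *)name
                 complements:(NSDictionary<NSString *, NSString *> *)complements {
    if ((self = [super init])) {
        _name = [name copy];
        _complements = [complements copy];
    }
    return self;
}
+ (instancetype)dnaSymbolSet {
    return [[self alloc] initWithName:@"DNA"
                          complements:@{@"A": @"T", @"T": @"A", @"C": @"G", @"G": @"C"}];
}
+ (instancetype)proteinSymbolSet {
    return [[self alloc] initWithName:@"protein" complements:nil];
}
@end

// Hypothetical single sequence class whose behaviour depends on its symbol set.
@interface SketchSymbolSequence : NSObject
@property (nonatomic, copy, readonly) NSString *symbols;
@property (nonatomic, strong, readonly) SketchSymbolSet *symbolSet;
- (instancetype)initWithSymbols:(NSString *)symbols symbolSet:(SketchSymbolSet *)symbolSet;
- (SketchSymbolSequence *)complement;  // nil when not meaningful
@end

@implementation SketchSymbolSequence
- (instancetype)initWithSymbols:(NSString *)symbols symbolSet:(SketchSymbolSet *)symbolSet {
    if ((self = [super init])) {
        _symbols = [symbols copy];
        _symbolSet = symbolSet;
    }
    return self;
}
- (SketchSymbolSequence *)complement {
    NSDictionary<NSString *, NSString *> *table = self.symbolSet.complements;
    if (table == nil) {
        return nil;  // e.g. a protein: the symbol set defines no complement
    }
    NSMutableString *out = [NSMutableString string];
    for (NSUInteger i = 0; i < self.symbols.length; i++) {
        NSString *symbol = [self.symbols substringWithRange:NSMakeRange(i, 1)];
        NSString *partner = table[symbol];
        if (partner == nil) {
            return nil;  // symbol not covered by this symbol set
        }
        [out appendString:partner];
    }
    return [[SketchSymbolSequence alloc] initWithSymbols:out symbolSet:self.symbolSet];
}
@end
```

The caller never tests the type itself; it only has to cope with a nil result, which is the trade-off under discussion compared with a compile-time warning from typed classes.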
> >> Essential methods get scattered in a ton of other classes (mostly >> tools, but there's going to be a lot of tools in the >> end) - you need to call through to a separate class just to figure >> out what >> type of sequence you have, and something as simple as complementing >> anucleotide sequence requires the creation of a new object and may >> call >> through about 4 different methods there (which is what I hate about >> BioJava, >> as we've discussed). And yes, I hate having to test the sequence type >> every >> time I think about doing anything with the sequence. > Well, you don't have to because we do that inside the bcsequence > object, you can call complement on a sequence and it will do the check > if given the assigned symbolset it would make sense to return a value, > nil or throw an exception. you don't have to do anything. > >> If I create a nucleotide sequence, I want it to act like one. It will, because it is made up of nucleotides, which are specified in the symbolset. Just naming the class BCSequenceNucleotide doesn't make it a nucleotide sequence. It are the individual BCSymbols that make it a nucleotide sequence. - Koen. From mek at mekentosj.com Mon Jul 11 16:04:48 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Mon, 11 Jul 2005 22:04:48 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> Message-ID: <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> Hi everbody, Charles and I just had a little chat about the mutable vs immutable discussion we have had on the list. Perhaps it's nice to copy it here for everyone to read. Cheers, Alex AIM IM with Charles Parnot . 21:10 [snip] -ok, let's get started on the NSMUtableData? -yes -lots of discussion and kind of stuck right -So what I mean is: you return an NSMutableData, the compiler sees an NSData, so you can't modify it.. This is good, but... 
-yes, that's what I had in mind -but the BCSequence might modify the object later -how? -that was my point -the immutable subclass should alter the data -none of its methods should -sorry should not -The mutable class can modify the ivar, right? -can modify the content of the ivar, I should say -yes, the mutable class -but the mutable class should also return a mutable data object so the muutable class returns the poiinter to that ivar, right -yes -OK, what if the mutable class returns an NSData (from the header) the immutable class as well, only casted to NSData can't we override the method to be typed as mutabledata? -Is there a '-(NSData *)data' method in the mutable class header? -that was the only question I had 21:15 -the question was whether you could override: '-(NSData *)data with '-(NSMutableData *)data -I'm not sure -no you can't -never tried -I tried -then we have a problem -you get a compiler warning -you have to do '-(id)data' -yep -if you look at the NSArray/NSMuutableArray headers,... -you have '+(id)array' -aha -this is because you cant' have +(NSArray *)array -and +(NSMutableArray)array -i get the idea -We should have '-(NSData *)data' for both mut/immut and... -add the method '-(NSMUtableData *)mutableData' to the mutable one -or not -hmm, yes and no -in any case, the object returned by '-data' should be immutable and not just cast to immutable -in fact, now that I think of it -yes? -oh no, john doesn't like the idea of return an nsdata that in reality is an nsmutabledata right -I still don't see the problem -as we don't allow editing of the mutable array directly, you ONLY need to publish the NSData if your mutable class returns a pointer to its mutable ivar... and tyhe user thinks it gets an immutable data... because remember that ALL editing should go through methods we shouldn't allow editing of the array directly!! 
then the user might keep that object around thinking it won't change that would make syncing impossible when it fact it will change as the sequence is edited -The user WILL NOT edit the NSData but will see it changed!!! -true -this was my point!!! Yeah, you got it?? -yep -so now how to solve it -basically we need the same approach as NSArray NSMutableArray -There is only one way to solve it: return a true NSData by copying it... -how did they solve it then? -that would make it slow -and especially the immutable class is added for performance reasons! -there is another solution -wait...wait... -the immutable class don't need to copy it, because it won't change -true -only the mutable class should copy it -yes, -but what you are saying is to have the mutableversion have an extra (private) ivar to store the data to improve performance, like I said in the email, you could still retrurn the NSMutableData... sorry I was following on previous stuff -leaving the other one unused? -no, we don't need an extra ivar. -pfew -We just have to be careful inside the class implementation the ivar can be a NSMutableData for the compiler, but we would in fact use NSData for the immutable of course, if we call a mutability method on it, we get a runtime exception, but it should never happen if we a re A BIT careful -i get it -now, back to above. - yes -to improve performance in mutable sequence, like I said in the email, you could still retrurn the NSMutableData... -but also have a flag to say: next time we return the data or mutate the seq... -we can't use the current object pointed by the ivar. Somebody else is using it as NSData -so the flag say: netx time, copy it -if next time never happens, no copy! -hmm, that doesn't sound to nice I think, although functional -performance trick are often anti-good code -true -now, back to the question i had earlier -this is a 'lazy' copy -ok, back to it -how did apple solve the mutable vs immutable code? -in which case? 
-well, they have NSArray in mutable and immutable form -they use a flag internally. I found that on the cocoadev mailinglist -and NSMutableArray seems to be a subclass of NSArray -these are just header tricks -aha -placeholder classes -just like I did with BCSequence and my recent email -you are saying they are using that lazy copy trick -wowowwowowo -lazy copy trick?? -haha -sorry -ah, ok, yes, they are! -sort of -they only make a real copy when it is a mutable instance -it is probably not lazy, though -not sure -so the implementation of '-copy' is -they do a real copy if flag = mutable -otherwise just copy the pointer -aha -that makes sense -and i don't think they defer the real copy in the -copy method -but the situation is different -the user asks for a copy -it should expect to be done immediately -performance can't be great -yes, you are right -if you ask a copy, you know what you are doing -the thing is that I'm afraid copying is out of the question anyway -don't know... -remember, the user wants direct access to the data -serge seemed to say that this is not that expensive -which can be 300Mb -well, yeah -well, then he should use immutable sequences perhaps -but then the user should use the BCSequence methods -y es, use immutable if you don't want to edit -but then the user should use the BCSequence methods to edit -well, in the end it's inevitable -you don't want the data to change underneath -yes! Inevitable is the word -this should be documented this discussion the concept of mutable/ immutabnle is more subtle that it seems at first it is -but coming back to a discussion -we could copy and paste the whole chat? -ok, back to the discussion -imagine this to mixed with a discussion about 4 types of sub(sub) classes -mutable vs immutable is already difficult enough -well, you read my email? -i truly believe that symbolsets is our typing -it was a bit complicated, no? yes -too much\ -and remember last time we choose such an approach -yes, I know... 
-i almost had to phone you to ask how it worked complicated for the developer does not mean complicated for the user necessarily -i don't like omni graffle anymore true -and complicated the first time does not mean you have to alwasy rememeber how it works -if you never have to change it -all fine -anyway, at least you got one of my concern -but the one-sequence-for-all is so much simpler (in interface terms) that I think that will pay off big time -probably, yes -i'm willing to give up direct typing -interesting turn of events!! -yes! -the wwdc has certainly created some storm well I'm really enthusiastic about the nsdata -as storage -it's cool! -hafing typed sequences more for the 'expert' user could be fine too -see my last email -no other biox project has it, but I like the idea -yes, if you could wrap it certainly! -i kind of dropped of there -typed sequence would be like the CFArray -for more advanced user! -but that would imply the typed once being the basis and the untyped one the wrapper -and I don't like that too much -the otherway around is fine with me -as it could live a separate life -not necessaarily. I actually propose the oppsite, just as suggested byu John -yes, exactly totally agree, the other way is better, -read the email looke at the omnigraffle thingie -I thought to remember that from your email, i did read it -OK, in the grpah, it is really separate, like a plug-in on the side -I didn't get the CFarray analogy therefore -bad analogy -haha -just something more hidden, less needed -This is a better omnigraffle -less friendly -only one arrow That's good! -and almost straight arrow -but I just didn't feel like thinking too much about it -the concept is there. 
The implementation can be wrapper of placeholder - it's an add-on/plugin so could wait wrapper OR placeholder -oups -yes could wait or be there and don't care -perfect, the only thing I would like to see is that it doesn't require tricks in the "clean" one-for-all bcsequence system -it does not -perfect! -Well, let's do it haha -you agree the BCSequence header has ALL the methods? -if I have just as much time as you, then we have an even bigger problem - although I would like to spend a few days if I had time -including -complement -i'm enthusiastic about the discussions, although heated That would be the consequence of the approach -yeah, i know about 'time'; -all methods -ok for the methods -I think we should define a limited number of basis methods -let's see what john thinks -the rest will have to be tools -that's the way it is -but think in terms of strider -yes, the tools/method line -I was thinking of all basic editing simple transformations to be in bcsequence -yes, we had a discussiona bout that a few months ago -and more complex things like translations, digestions, alignments in tools -that still holds -yes, perfect -although a basic translation method could be there as convenience method -i'm not sure -it's quite arbitrary -it is arbitrary, yes 21:45 -I need to get back to work ok -nice talking to you again, and thanks for making your point clear -I get it now -thanks for listening!! -Have a nice day at work -I'll copy the discussion to the list -thanks! -thanks for the copy, don't make it lazy ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com LabAssistant - Get your life organized! 
http://www.mekentosj.com/labassistant
*********************************************************

From charles.parnot at gmail.com Mon Jul 11 17:01:08 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Mon, 11 Jul 2005 14:01:08 -0700
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com>
References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com>
Message-ID: <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com>

I thought a color version of the chat might be easier to follow
charles = purple
alex = green

...
ok, let's get started on the NSMutableData?
lots of discussion and kind of stuck right
So what I mean is: you return an NSMutableData, the compiler sees an NSData, so you can't modify it.. This is good, but...
yes, that's what I had in mind
but the BCSequence might modify the object later
how?
that was my point
the immutable subclass should alter the data
none of its methods should
sorry should not
The mutable class can modify the ivar, right?
can modify the content of the ivar, I should say
yes, the mutable class
but the mutable class should also return a mutable data object
so the mutable class returns the pointer to that ivar, right
yes
OK, what if the mutable class returns an NSData (from the header)
the immutable class as well, only casted to NSData
can't we override the method to be typed as mutabledata?
Is there a '-(NSData *)data' method in the mutable class header?
that was the only question I had
12:15 PM
the question was whether you could override: '-(NSData *)data'
with '-(NSMutableData *)data'
I'm not sure
no you can't
never tried
I tried
then we have a problem
you get a compiler warning
you have to do '-(id)data'
yep
if you look at the NSArray/NSMutableArray headers,...
you have '+(id)array'
aha
this is because you can't have +(NSArray *)array
and +(NSMutableArray *)array
i get the idea
We should have '-(NSData *)data' for both mut/immut and...
add the method '-(NSMutableData *)mutableData' to the mutable one
or not
hmm, yes and no
in any case, the object returned by '-data' should be immutable and not just cast to immutable
in fact, now that I think of it
yes?
oh no, john doesn't like the idea of returning an nsdata that in reality is an nsmutabledata right
I still don't see the problem
as we don't allow editing of the mutable array directly,
you ONLY need to publish the NSData
if your mutable class returns a pointer to its mutable ivar...
and the user thinks it gets an immutable data...
because remember that ALL editing should go through methods
we shouldn't allow editing of the array directly!!
then the user might keep that object around thinking it won't change
that would make syncing impossible
when in fact it will change as the sequence is edited
true
12:20 PM
you're right
The user WILL NOT edit the NSData but will see it changed!!!
true
this was my point!!! Yeah, you got it??
yep
so now how to solve it
basically we need the same approach as NSArray NSMutableArray
There is only one way to solve it: return a true NSData by copying it...
how did they solve it then?
that would make it slow
and especially the immutable class is added for performance reasons!
there is another solution
wait...wait...
the immutable class doesn't need to copy it, because it won't change
true
only the mutable class should copy it
yes,
but what you are saying is to have the mutable version have an extra (private) ivar to store the data
to improve performance, like I said in the email, you could still return the NSMutableData...
sorry I was following on previous stuff
leaving the other one unused?
let me answer your question
sorry
no, we don't need an extra ivar.
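The hazard discussed in this part of the chat can be sketched in a few lines. This is only an illustration under assumptions: BCDemoMutableSequence and its method names are hypothetical, not BioCocoa API, and the code assumes ARC (with 2005-era manual retain/release you would add the usual release calls). It shows a mutable sequence handing out its NSMutableData ivar typed as NSData, next to the copying fix mentioned above.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Hypothetical demo class, not the real BCSequence.
@interface BCDemoMutableSequence : NSObject {
    NSMutableData *_storage;   // one byte per symbol
}
- (id)initWithCString:(const char *)chars;
- (void)appendCString:(const char *)chars;  // all editing goes through methods
- (NSData *)sharedData;     // the mutable ivar, merely typed as NSData
- (NSData *)snapshotData;   // a true immutable copy of the current bytes
@end

@implementation BCDemoMutableSequence

- (id)initWithCString:(const char *)chars {
    if ((self = [super init])) {
        _storage = [[NSMutableData alloc] initWithBytes:chars
                                                 length:strlen(chars)];
    }
    return self;
}

- (void)appendCString:(const char *)chars {
    [_storage appendBytes:chars length:strlen(chars)];
}

- (NSData *)sharedData {
    // The caller cannot edit this through the NSData interface,
    // but will see it change whenever the sequence is edited.
    return _storage;
}

- (NSData *)snapshotData {
    // Copies the bytes, so later edits to the sequence are not visible.
    return [NSData dataWithData:_storage];
}

@end
```

With a sequence created from "ACGT", appending "AA" afterwards leaves the object from -snapshotData at 4 bytes, while the object from -sharedData grows to 6 bytes under the caller's feet. The price of the safe version is one copy per call, which is the performance concern raised in the chat.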
pfew
We just have to be careful inside the class implementation
the ivar can be a NSMutableData for the compiler, but we would in fact use NSData for the immutable
of course, if we call a mutability method on it, we get a runtime exception, but it should never happen if we are A BIT careful
i get it
now, back to above.
yes
to improve performance in mutable sequence, like I said in the email, you could still return the NSMutableData...
12:25 PM
but also have a flag to say: next time we return the data or mutate the seq...
we can't use the current object pointed by the ivar. Somebody else is using it as NSData
so the flag says: next time, copy it
if next time never happens, no copy!
hmm, that doesn't sound too nice I think, although functional
performance tricks are often anti-good code
true
now, back to the question i had earlier
this is a 'lazy' copy
ok, back to it
how did apple solve the mutable vs immutable code?
in which case?
well, they have NSArray in mutable and immutable form
they use a flag internally. I found that on the cocoadev mailinglist
and NSMutableArray seems to be a subclass of NSArray
these are just header tricks
aha
placeholder classes
just like I did with BCSequence and my recent email
you are saying they are using that lazy copy trick
wowowwowowo
lazy copy trick??
haha
sorry
ah, ok, yes, they are!
sort of
they only make a real copy when it is a mutable instance
it is probably not lazy, though
not sure
so the implementation of '-copy' is
they do a real copy if flag = mutable
otherwise just copy the pointer
aha
that makes sense
and i don't think they defer the real copy in the -copy method
but the situation is different
the user asks for a copy
it should expect to be done immediately
12:30 PM
performance can't be great
yes, you are right
if you ask a copy, you know what you are doing
the thing is that I'm afraid copying is out of the question anyway
don't know...
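The 'lazy' copy with a flag described above might look roughly like the sketch below. It is a hedged illustration, not the agreed design: BCDemoLazySequence, its ivars and its methods are hypothetical names, ARC is assumed, and the sketch is a copy-on-write variant that only copies before the next mutation (returning the same object twice without an edit in between needs no copy).

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Hypothetical demo class illustrating the flag-based lazy copy.
@interface BCDemoLazySequence : NSObject {
    NSMutableData *_storage;
    BOOL _storageIsShared;   // YES once _storage was handed out as NSData
}
- (id)initWithCString:(const char *)chars;
- (NSData *)data;
- (void)appendCString:(const char *)chars;
@end

@implementation BCDemoLazySequence

- (id)initWithCString:(const char *)chars {
    if ((self = [super init])) {
        _storage = [[NSMutableData alloc] initWithBytes:chars
                                                 length:strlen(chars)];
        _storageIsShared = NO;
    }
    return self;
}

- (NSData *)data {
    // No copy here: just remember that somebody else may now hold the object.
    _storageIsShared = YES;
    return _storage;
}

- (void)appendCString:(const char *)chars {
    if (_storageIsShared) {
        // Leave the handed-out object untouched and continue on a private copy.
        _storage = [_storage mutableCopy];
        _storageIsShared = NO;
    }
    [_storage appendBytes:chars length:strlen(chars)];
}

@end
```

If the sequence is never edited after -data is called, no copy is ever made, which is the "if next time never happens, no copy!" case. Every mutating method would need the same check, so the bookkeeping cost mentioned in the chat is real.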
remember, the user wants direct access to the data
serge seemed to say that this is not that expensive
which can be 300Mb
well, yeah
well, then he should use immutable sequences perhaps
but then the user should use the BCSequence methods
yes, use immutable if you don't want to edit
but then the user should use the BCSequence methods to edit
well, in the end it's inevitable
you don't want the data to change underneath
yes! Inevitable is the word
the concept of mutable/immutable is more subtle than it seems at first
this should be documented this discussion
it is
but coming back to a discussion
we could copy and paste the whole chat?
ok, back to the discussion
imagine this mixed with a discussion about 4 types of sub(sub)classes
mutable vs immutable is already difficult enough
well, you read my email?
i truly believe that symbolsets is our typing
it was a bit complicated, no?
yes
too much
and remember last time we chose such an approach
yes, I know...
i almost had to phone you to ask how it worked
complicated for the developer does not mean complicated for the user necessarily
i don't like omni graffle anymore
true
and complicated the first time does not mean you have to always remember how it works
12:35 PM
if you never have to change it
all fine
anyway, at least you got one of my concerns
but the one-sequence-for-all is so much simpler (in interface terms) that I think that will pay off big time
probably, yes
i'm willing to give up direct typing
interesting turn of events!!
yes!
the wwdc has certainly created some storm
well I'm really enthusiastic about the nsdata
as storage
it's cool!
having typed sequences more for the 'expert' user could be fine too
see my last email
no other biox project has it, but I like the idea
yes, if you could wrap it certainly!
i kind of dropped off there
typed sequence would be like the CFArray
for more advanced user!
but that would imply the typed ones being the basis and the untyped one the wrapper
and I don't like that too much
the other way around is fine with me
not necessarily. I actually propose the opposite, just as suggested by John
yes, exactly totally agree, the other way is better,
as it could live a separate life
read the email
look at the omnigraffle thingie
I thought to remember that from your email, i did read it
OK, in the graph, it is really separate, like a plug-in on the side
I didn't get the CFArray analogy therefore
bad analogy
haha
just something more hidden, less needed
This is a better omnigraffle
less friendly
only one arrow
That's good!
and almost straight arrow
but I just didn't feel like thinking too much about it
first, John would like to go for it I guess
12:40 PM
the concept is there. The implementation can be wrapper of placeholder
and second, it's an add-on/plugin so could wait
wrapper OR placeholder
oups
yes could wait or be there and don't care
perfect, the only thing I would like to see is that it doesn't require tricks in the "clean" one-for-all bcsequence system
it does not
perfect!
Well, let's do it
haha
you agree the BCSequence header has ALL the methods?
including -complement
if I have just as much time as you, then we have an even bigger problem
although I would like to spend a few days if I had time
i'm enthusiastic about the discussions, although heated
yeah, i know about 'time';
That would be the consequence of the approach
all methods
ok for the methods
I think we should define a limited number of basis methods
let's see what john thinks
the rest will have to be tools
that's the way it is
but think in terms of strider
yes, the tools/method line
I was thinking of all basic editing simple transformations to be in bcsequence
yes, we had a discussion about that a few months ago
and more complex things like translations, digestions, alignments in tools
that still holds
yes, perfect
although a basic translation method could be there as convenience method
should we copy paste this whole chat to the mailing list?
i'm not sure
it's quite arbitrary
fine with me
it is arbitrary, yes
12:45 PM
I need to get back to work
ok
nice talking to you again, and thanks for making your point clear
I get it now
thanks for listening!!
Have a nice day at work
good night!
I'll copy the discussion to the list
thanks!
thanks for the copy, don't make it lazy
Cheers Charles,
speak to you later
cheers

From kvddrift at earthlink.net Mon Jul 11 17:16:21 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Mon, 11 Jul 2005 17:16:21 -0400
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com>
References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com>
Message-ID: <58b49be88879e61d9c2e230a333dd12c@earthlink.net>

On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote:

> I thought a color version of the chat might be easier to follow
> charles = purple
> alex =green
>

LOL - it's all black for me :)

But I guess in Alex's original mail the indented lines were from Charles (the 'oups' gave it away ;)

- Koen.

From jtimmer at bellatlantic.net Mon Jul 11 17:31:44 2005
From: jtimmer at bellatlantic.net (John Timmer)
Date: Mon, 11 Jul 2005 17:31:44 -0400
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <2a67546c3d1a479af795ba0e4be9b0f2@earthlink.net>
Message-ID:

>
>>> If I create a nucleotide sequence, I want it to act like one.
>
> It will, because it is made up of nucleotides, which are specified in
> the symbolset. Just naming the class BCSequenceNucleotide doesn't make
> it a nucleotide sequence. It are the individual BCSymbols that make it
> a nucleotide sequence.

Focusing on that one comment ignores the whole rest of the paragraph:

> Well, it's not just that. Untyped sequences make the sequences stupid -
> they don't know what they are, they don't know what operations they can
> perform, and they have to have defined responses to cope with requests for
> operations they can't perform.
> Essential methods get scattered in a ton of
> other classes (mostly tools, but there's going to be a lot of tools in the
> end) - you need to call through to a separate class just to figure out what
> type of sequence you have, and something as simple as complementing a
> nucleotide sequence requires the creation of a new object and may call
> through about 4 different methods there (which is what I hate about BioJava,
> as we've discussed). And yes, I hate having to test the sequence type every
> time I think about doing anything with the sequence.

I could go on with my reasoning, but you've heard it before. Having to check what type of sequence it is before using it for anything, and having to look up the return value of a nonsensical method call in order to trap errors, is just not a good way for me to work. As a result, it's not something I'm interested in maintaining or extending.

Look, everyone else is convinced that this is a good idea. I'm very unlikely to be convinced. This is normal and fine - I've apparently got a different coding style, different plans on what to do with sequences, and different ideas about where to catch errors. If there were any obvious way where we could do both, that would be better, but it's looking like that would be way too inconvenient. Given that, as long as there's a way to let me spin out these classes as they now stand and make future modifications, I'm perfectly happy. It's not worth your time trying to find individual statements I've made and argue them with me.

JT

_______________________________________________
This mind intentionally left blank

From charles.parnot at gmail.com Mon Jul 11 17:35:19 2005
From: charles.parnot at gmail.com (Charles Parnot)
Date: Mon, 11 Jul 2005 14:35:19 -0700
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <58b49be88879e61d9c2e230a333dd12c@earthlink.net>
References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <58b49be88879e61d9c2e230a333dd12c@earthlink.net>
Message-ID:

You have to get one of those modern mail clients. They even have some with a UI ;-)

In the meantime, you could use a web browser (do not use curl if you want the color):
http://cmgm.stanford.edu/~cparnot/temp/2005-07-11-chat.html

cheers,

charles

On Jul 11, 2005, at 2:16 PM, Koen van der Drift wrote:

>
> On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote:
>
>
>> I thought a color version of the chat might be easier to follow
>> charles = purple
>> alex =green
>>
>>
>
> LOL - it's all black for me :)
>
> But I guess in Alex's original mail the indented lines were from
> Charles (the 'oups' gave it away ;)
>
> - Koen.
>
>

--
Xgrid-at-Stanford
Help science move fast forward:
http://cmgm.stanford.edu/~cparnot/xgrid-stanford

Charles Parnot
charles.parnot at gmail.com

From mek at mekentosj.com Mon Jul 11 17:51:12 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Mon, 11 Jul 2005 23:51:12 +0200
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To:
References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <58b49be88879e61d9c2e230a333dd12c@earthlink.net>
Message-ID: <7F3A8C3E-9251-46CC-AB8F-7A1D62DEA15B@mekentosj.com>

haha, some don't like the new menu bar, but I thought Mail in Tiger was supposed to be modern, still all black here as well ;-)
Alex

On 11-jul-2005, at 23:35, Charles Parnot wrote:

> You have to get one of those modern mail client. They even have
> some with a UI ;-)
>
> in the meantime, you could use a web browser (do not use curl if
> you want the color):
> http://cmgm.stanford.edu/~cparnot/temp/2005-07-11-chat.html
>
> cheers,
>
> charles
>
>
> On Jul 11, 2005, at 2:16 PM, Koen van der Drift wrote:
>
>
>>
>> On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote:
>>
>>
>>
>>> I thought a color version of the chat might be easier to follow
>>> charles = purple
>>> alex =green
>>>
>>>
>>>
>>
>> LOL - it's all black for me :)
>>
>> But I guess in Alex's original mail the indented lines were from
>> Charles (the 'oups' gave it away ;)
>>
>> - Koen.
>>
>>
>>
>
> --
> Xgrid-at-Stanford
> Help science move fast forward:
> http://cmgm.stanford.edu/~cparnot/xgrid-stanford
>
> Charles Parnot
> charles.parnot at gmail.com
>
>
>
> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>
>

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

4Peaks - For Peaks, Four Peaks.
2004 Winner of the Apple Design Awards
Best Mac OS X Student Product
http://www.mekentosj.com/4peaks
*********************************************************

From jtimmer at bellatlantic.net Mon Jul 11 22:22:06 2005
From: jtimmer at bellatlantic.net (John Timmer)
Date: Mon, 11 Jul 2005 22:22:06 -0400
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com>
Message-ID:

Man, even with colors, that conversation was hard to read.

The chart Charles provided the link to looks fine, though.

One thing that occurred to me this evening - right now, we have codon sequences, which are requisite intermediaries in the translation process. If we're not using symbols, there's a reasonable chance that you'd want to re-work translation anyway, but if you do have a sequence of codons, they can't really be held by the new data-based sequence object very conveniently.

JT

_______________________________________________
This mind intentionally left blank

From mek at mekentosj.com Tue Jul 12 02:04:01 2005
From: mek at mekentosj.com (Alexander Griekspoor)
Date: Tue, 12 Jul 2005 08:04:01 +0200
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To:
References:
Message-ID: <7E0763F6-8A31-471C-81CC-271D3D37F6A3@mekentosj.com>

Yes, very good point.. I think that indeed we can't have a codon sequence unless we map all codons to a unique char... which comes quite close to amino acids ;-)
Alex

On 12-jul-2005, at 4:22, John Timmer wrote:

> Man, even with colors, that conversation was hard to read.
>
> The chart Charles provided the link to looks fine, though.
>
> One thing that occurred to me this evening - right now, we have codon
> sequences, which are requisite intermediaries in the translation
> process.
> If we're not using symbols, there's a reasonable chance that you'd
> want to
> re-work translation anyway, but if you do have a sequence of
> codons, they
> can't really be held by the new data-based sequence object very
> conveniently.
>
> JT
>
> _______________________________________________
> This mind intentionally left blank
>
>
> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>
>

*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com

4Peaks - For Peaks, Four Peaks.
2004 Winner of the Apple Design Awards
Best Mac OS X Student Product
http://www.mekentosj.com/4peaks
*********************************************************

From kvddrift at earthlink.net Tue Jul 12 20:02:38 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Tue, 12 Jul 2005 20:02:38 -0400
Subject: [Biococoa-dev] cocoa design patterns
Message-ID:

Hi,

For those of you who haven't read it yet, Apple has posted an interesting overview of the use of design patterns in Cocoa/ObjC:

http://developer.apple.com/documentation/Cocoa/Conceptual/CocoaDesignPatterns/index.html

cheers,

- Koen.

From kvddrift at earthlink.net Wed Jul 13 09:19:30 2005
From: kvddrift at earthlink.net (Koen van der Drift)
Date: Wed, 13 Jul 2005 09:19:30 -0400
Subject: [Biococoa-dev] Sequence Structure
In-Reply-To: <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com>
References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com>
Message-ID: <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net>

Hi,

After all the talking in the last week or so, I felt like coding again and was playing last night with a possible new BCSequence-only structure. It will also include the char string and NSData ivars. For starters, I want to do just the immutable version; of course we can add stuff for the mutable sequence later too. I need to read the chat log again to see if you guys came up with a good solution for that.

However, so far I have:

BCSequence
    const char *sequence;
    NSData *sequenceData;
    NSArray *symbolArray;
    BCSymbolSet *symbolSet;

We can re-use most of the methods that are now in BCSequence and BCAbstractSequence, including the code that guesses the sequence type if there is no symbolset defined. I am not sure if we also should add the BCSequenceType back in there. I think the symbolset is enough.
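As a rough illustration of the immutable variant outlined above, here is a minimal sketch. Everything in it is an assumption made for the example: BCDemoSequence is a hypothetical name, only the NSData ivar is kept (NSData's -bytes already yields a const pointer to the characters), -looksLikeDNA is a naive stand-in for the real type-guessing code, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>
#include <string.h>

// Hypothetical sketch; the real BCSequence and BCSymbolSet may differ.
@interface BCDemoSequence : NSObject {
    NSData *_sequenceData;   // one char per symbol
}
- (id)initWithCString:(const char *)chars;
- (const char *)bytes;      // direct, read-only access to the characters
- (NSUInteger)length;
- (BOOL)looksLikeDNA;       // naive guess used when no symbol set is given
@end

@implementation BCDemoSequence

- (id)initWithCString:(const char *)chars {
    if ((self = [super init])) {
        _sequenceData = [[NSData alloc] initWithBytes:chars
                                               length:strlen(chars)];
    }
    return self;
}

- (const char *)bytes {
    return (const char *)[_sequenceData bytes];
}

- (NSUInteger)length {
    return [_sequenceData length];
}

- (BOOL)looksLikeDNA {
    const char *p = [self bytes];
    NSUInteger i, n = [self length];
    for (i = 0; i < n; i++) {
        // Reject NUL bytes explicitly: strchr would match the terminator.
        if (p[i] == '\0' || strchr("ACGTNacgtn", p[i]) == NULL) {
            return NO;
        }
    }
    return n > 0;   // an empty sequence is not guessed to be DNA
}

@end
```

Because the data object is immutable, the pointer returned by -bytes stays valid and unchanged for as long as the sequence is alive, which is what makes direct access safe in this variant.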
Let me know what you guys think of this, and if this is a good way forward. In order not to screw up the project I won't commit anything until we all agree on this (or another) approach. I will make some code availabe for download, though. cheers, - Koen. On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote: > I thought a color version of the chat might be easier to follow > charles = purple > alex =green > > > ... > ok, let's get started on the NSMUtableData? > > lots of discussion and kind of stuck right > > So what I mean is: you return an NSMutableData, the compiler sees > an NSData, so you can't modify it.. This is good, but... > > yes, that's what I had in mind > > but the BCSequence might modify the object later > > how? > > that was my point > > the immutable subclass should alter the data > > none of its methods should > > sorry should not > > The mutable class can modify the ivar, right? > > can modify the content of the ivar, I should say > > yes, the mutable class > > but the mutable class should also return a mutable data object > > so the muutable class returns the poiinter to that ivar, right > > yes > > OK, what if the mutable class returns an NSData (from the header) > > the immutable class as well, only casted to NSData > > can't we override the method to be typed as mutabledata? > > Is there a '-(NSData *)data' method in the mutable class header? > > that was the only question I had > > 12:15 PM > the question was whether you could override: '-(NSData *)data > > with '-(NSMutableData *)data > > I'm not sure > > no you can't > > never tried > > I tried > > then we have a problem > > you get a compiler warning > > you have to do '-(id)data' > > yep > > if you look at the NSArray/NSMuutableArray headers,... > > you have '+(id)array' > > aha > > this is because you cant' have +(NSArray *)array > > and +(NSMutableArray)array > > i get the idea > > We should have '-(NSData *)data' for both mut/immut and... 
> > add the method '-(NSMUtableData *)mutableData' to the mutable one > > or not > > hmm, yes and no > > in any case, the object returned by '-data' should be immutable and > not just cast to immutable > > in fact, now that I think of it > > yes? > > oh no, john doesn't like the idea of return an nsdata that in > reality is an nsmutabledata right > > I still don't see the problem > > as we don't allow editing of the mutable array directly, > > you ONLY need to publish the NSData > > if your mutable class returns a pointer to its mutable ivar... > > and tyhe user thinks it gets an immutable data... > > because remember that ALL editing should go through methods > > we shouldn't allow editing of the array directly!! > > then the user might keep that object around thinking it won't change > > that would make syncing impossible > > when it fact it will change as the sequence is edited > > true > > 12:20 PM > your right > > The user WILL NOT edit the NSData but will see it changed!!! > > true > > this was my point!!! Yeah, you got it?? > > yep > > > so now how to solve it > > basically we need the same approach as NSArray NSMutableArray > > There is only one way to solve it: return a true NSData by copying > it... > > how did they solve it then? > > that would make it slow > > and especially the immutable class is added for performance reasons! > > there is another solution > > wait...wait... > > the immutable class don't need to copy it, because it won't change > > true > > only the mutable class should copy it > > yes, > > but what you are saying is to have the mutableversion have an extra > (private) ivar to store the data > > to improve performance, like I said in the email, you could still > retrurn the NSMutableData... > > sorry I was following on previous stuff > > leaving the other one unused? > > let me answer your question > > sorry > > no, we don't need an extra ivar. 
> > pfew > > We just have to be careful inside the class implementation > > the ivar can be a NSMutableData for the compiler, but we would in > fact use NSData for the immutable > > of course, if we call a mutability method on it, we get a runtime > exception, but it should never happen if we a re A BIT careful > > i get it > > now, back to above. > > yes > > to improve performance in mutable sequence, like I said in the > email, you could still retrurn the NSMutableData... > > 12:25 PM > but also have a flag to say: next time we return the data or mutate > the seq... > > we can't use the current object pointed by the ivar. Somebody else > is using it as NSData > > so the flag say: netx time, copy it > > if next time never happens, no copy! > > hmm, that doesn't sound to nice I think, although functional > > performance trick are often anti-good code > > true > > now, back to the question i had earlier > > this is a 'lazy' copy > > ok, back to it > > how did apple solve the mutable vs immutable code? > > in which case? > > well, they have NSArray in mutable and immutable form > > they use a flag internally. I found that on the cocoadev mailinglist > > and NSMutableArray seems to be a subclass of NSArray > > these are just header tricks > > aha > > placeholder classes > > just like I did with BCSequence and my recent email > > you are saying they are using that lazy copy trick > > wowowwowowo > > lazy copy trick?? > > haha > > sorry > > ah, ok, yes, they are! 
> > sort of > > they only make a real copy when it is a mutable instance > > it is probably not lazy, though > > not sure > > so the implementation of '-copy' is > > they do a real copy if flag = mutable > > otherwise just copy the pointer > > aha > > that makes sense > > and i don't think they defer the real copy in the -copy method > > but the situation is different > > the user asks for a copy > > it should expect to be done immediately > > 12:30 PM > performance can't be great > > yes, you are right > > if you ask a copy, you know what you are doing > > the thing is that I'm afraid copying is out of the question anyway > > don't know... > > remember, the user wants direct access to the data > > serge seemed to say that this is not that expensive > > which can be 300Mb > > well, yeah > > well, then he should use immutable sequences perhaps > > but then the user should use the BCSequence methods > > yes, use immutable if you don't want to edit > > but then the user should use the BCSequence methods to edit > > well, in the end it's inevitable > > you don't want the data to change underneath > > yes! Inevitable is the word > > the concept of mutable/immutabnle is more subtle that it seems at > first > > this should be documented this discussion > > it is > > but coming back to a discussion > > we could copy and paste the whole chat? > > ok, back to the discussion > > imagine this to mixed with a discussion about 4 types of sub(sub) > classes > > mutable vs immutable is already difficult enough > > well, you read my email? > > i truly believe that symbolsets is our typing > > it was a bit complicated, no? > > yes > > too much\ > > and remember last time we choose such an approach > > yes, I know... 
> > i almost had to phone you to ask how it worked > > complicated for the developer does not mean complicated for the > user necessarily > > i don't like omni graffle anymore > > true > > and complicated the first time does not mean you have to alwasy > rememeber how it works > > 12:35 PM > if you never have to change it > > all fine > > anyway, at least you got one of my concern > > but the one-sequence-for-all is so much simpler (in interface > terms) that I think that will pay off big time > > probably, yes > > i'm willing to give up direct typing > > interesting turn of events!! > > yes! > > the wwdc has certainly created some storm > > well I'm really enthusiastic about the nsdata > > as storage > > it's cool! > > hafing typed sequences more for the 'expert' user could be fine too > > see my last email > > no other biox project has it, but I like the idea > > yes, if you could wrap it certainly! > > i kind of dropped of there > > typed sequence would be like the CFArray > > for more advanced user! > > but that would imply the typed once being the basis and the untyped > one the wrapper > > and I don't like that too much > > the otherway around is fine with me > > not necessaarily. I actually propose the oppsite, just as suggested > byu John > > yes, exactly totally agree, the other way is better, > > as it could live a separate life > > read the email > > looke at the omnigraffle thingie > > I thought to remember that from your email, i did read it > > OK, in the grpah, it is really separate, like a plug-in on the side > > I didn't get the CFarray analogy therefore > > bad analogy > > haha > > just something more hidden, less needed > > This is a better omnigraffle > > less friendly > > only one arrow > > That's good! > > and almost straight arrow > > but I just didn't feel like thinking too much about it > > first, John would like to go for it I guess > > 12:40 PM > the concept is there. 
The implementation can be wrapper of placeholder > > and second, it's an add-on/plugin so could wait > > wrapper OR placeholder > > oups > > yes could wait or be there and don't care > > perfect, the only thing I would like to see is that it doesn't > require tricks in the "clean" one-for-all bcsequence system > > it does not > > perfect! > > Well, let's do it > > haha > > you agree the BCSequence header has ALL the methods? > > including -complement > > if I have just as much time as you, then we have an even bigger > problem > > although I would like to spend a few days if I had time > > i'm enthusiastic about the discussions, although heated > > yeah, i know about 'time'; > > That would be the consequence of the approach > > all methods > > ok for the methods > > I think we should define a limited number of basis methods > > let's see what john thinks > > the rest will have to be tools > > that's the way it is > > but think in terms of strider > > yes, the tools/method line > > I was thinking of all basic editing simple transformations to be in > bcsequence > > yes, we had a discussiona bout that a few months ago > > and more complex things like translations, digestions, alignments > in tools > > that still holds > > yes, perfect > > although a basic translation method could be there as convenience > method > > should we copy paste this whole chat to the mailing list? > > i'm not sure > > it's quite arbitrary > > fine with me > > it is arbitrary, yes > > 12:45 PM > I need to get back to work > > ok > > nice talking to you again, and thanks for making your point clear > > I get it now > > thanks for listening!! > > Have a nice day at work > > good night! > > I'll copy the discussion to the list > > thanks! 
> > thanks for the copy, don't make it lazy > > Cheers Charles, > > speak to you later > > cheers > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > From kvddrift at earthlink.net Wed Jul 13 09:50:36 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 13 Jul 2005 09:50:36 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> Message-ID: On Jul 13, 2005, at 9:44 AM, Alexander Griekspoor wrote: > I thought that the idea was to have the NSData be the char array, > so no separate const char*, that IS the NSData object already. > So to go over the list >> BCSequence >> const char *sequence; > -> is NSData so this one goes away, if you want access to the > data, you would do: [sequenceData bytes]; which gives you the pointer Good point. Any preference for the ivar name? I would prefer sequenceData over sequence for the NSData member. >> NSData *sequenceData; >> NSArray *symbolArray; > -> this one only if we decide to cache it, otherwise, this would > be a method only. So what was the consensus again on this? Did we have one ;-) Still undecided, I guess. cheers, - Koen. From a.griekspoor at nki.nl Wed Jul 13 10:01:22 2005 From: a.griekspoor at nki.nl (Alexander Griekspoor) Date: Wed, 13 Jul 2005 16:01:22 +0200 Subject: Fwd: [Biococoa-dev] Sequence Structure References: Message-ID: <0C351FC7-35D4-4663-A751-019A011E5F75@nki.nl> Too long for the list, so here's the relevant part...
Begin forwarded message: > From: Alexander Griekspoor > Date: 13 juli 2005 15:44:06 GMT+02:00 > To: Koen van der Drift > Cc: Charles Parnot , BioCocoa-dev > > Subject: Re: [Biococoa-dev] Sequence Structure > > > Just a quick remark: > > I thought that the idea was to have the NSData be the char array, > so no separate const char*, that IS the NSData object already. > So to go over the list >> BCSequence >> const char *sequence; > -> is NSData so this one goes away, if you want access to the > data, you would do: [sequenceData bytes]; which gives you the pointer >> NSData *sequenceData; >> NSArray *symbolArray; > -> this one only if we decide to cache it, otherwise, this would > be a method only. So what was the consensus again on this? >> BCSymbolSet *symbolSet; > > Cheers, > Alex > > > On 13-jul-2005, at 15:19, Koen van der Drift wrote: > >> Hi, >> >> After all the talking in the last week or so, I felt like coding >> again was playing last night with a possible new BCSequence-only >> structure. It will also include the char string and NSData ivars. >> For starters, I want to do just the immutable version, of course >> we can add stuff for the mutable sequence too later. I need to >> read the chat log again to see if you guys came up with a good >> solution for that. >> >> However, so far I have: >> >> BCSequence >> const char *sequence; >> NSData *sequenceData; >> NSArray *symbolArray; >> BCSymbolSet *symbolSet; >> >> We can re-use most of the methods that are now in BCSequence and >> BCAbstractSequence, including the code that guesses the sequence >> type if there is no symbolset defined. I am not sure if we also >> should add the BCSequenceType back in there. I think the symbolset >> is enough. >> >> Let me know what you guys think of this, and if this is a good way >> forward. In order not to screw up the project I won't commit >> anything until we all agree on this (or another) approach. I will >> make some code availabe for download, though. 
>> >> cheers, >> >> - Koen. >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Wed Jul 13 09:44:06 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 13 Jul 2005 15:44:06 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> Message-ID: Just a quick remark: I thought that the idea was to have the NSData be the char array, so no separate const char*, that IS the NSData object already. So to go over the list > BCSequence > const char *sequence; -> is NSData so this one goes away, if you want access to the data, you would do: [sequenceData bytes]; which gives you the pointer > NSData *sequenceData; > NSArray *symbolArray; -> this one only if we decide to cache it, otherwise, this would be a method only. So what was the consensus again on this? > BCSymbolSet *symbolSet; Cheers, Alex On 13-jul-2005, at 15:19, Koen van der Drift wrote: > Hi, > > After all the talking in the last week or so, I felt like coding > again was playing last night with a possible new BCSequence-only > structure. It will also include the char string and NSData ivars. > For starters, I want to do just the immutable version, of course we > can add stuff for the mutable sequence too later. I need to read > the chat log again to see if you guys came up with a good solution > for that. > > However, so far I have: > > BCSequence > const char *sequence; > NSData *sequenceData; > NSArray *symbolArray; > BCSymbolSet *symbolSet; > > We can re-use most of the methods that are now in BCSequence and > BCAbstractSequence, including the code that guesses the sequence > type if there is no symbolset defined. 
I am not sure if we also > should add the BCSequenceType back in there. I think the symbolset > is enough. > > Let me know what you guys think of this, and if this is a good way > forward. In order not to screw up the project I won't commit > anything until we all agree on this (or another) approach. I will > make some code availabe for download, though. > > cheers, > > - Koen. > > > On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote: > > >> I thought a color version of the chat might be easier to follow >> charles = purple >> alex =green >> >> >> ... >> ok, let's get started on the NSMUtableData? >> >> lots of discussion and kind of stuck right >> >> So what I mean is: you return an NSMutableData, the compiler sees >> an NSData, so you can't modify it.. This is good, but... >> >> yes, that's what I had in mind >> >> but the BCSequence might modify the object later >> >> how? >> >> that was my point >> >> the immutable subclass should alter the data >> >> none of its methods should >> >> sorry should not >> >> The mutable class can modify the ivar, right? >> >> can modify the content of the ivar, I should say >> >> yes, the mutable class >> >> but the mutable class should also return a mutable data object >> >> so the muutable class returns the poiinter to that ivar, right >> >> yes >> >> OK, what if the mutable class returns an NSData (from the header) >> >> the immutable class as well, only casted to NSData >> >> can't we override the method to be typed as mutabledata? >> >> Is there a '-(NSData *)data' method in the mutable class header? >> >> that was the only question I had >> >> 12:15 PM >> the question was whether you could override: '-(NSData *)data >> >> with '-(NSMutableData *)data >> >> I'm not sure >> >> no you can't >> >> never tried >> >> I tried >> >> then we have a problem >> >> you get a compiler warning >> >> you have to do '-(id)data' >> >> yep >> >> if you look at the NSArray/NSMuutableArray headers,... 
>> >> you have '+(id)array' >> >> aha >> >> this is because you cant' have +(NSArray *)array >> >> and +(NSMutableArray)array >> >> i get the idea >> >> We should have '-(NSData *)data' for both mut/immut and... >> >> add the method '-(NSMUtableData *)mutableData' to the mutable one >> >> or not >> >> hmm, yes and no >> >> in any case, the object returned by '-data' should be immutable >> and not just cast to immutable >> >> in fact, now that I think of it >> >> yes? >> >> oh no, john doesn't like the idea of return an nsdata that in >> reality is an nsmutabledata right >> >> I still don't see the problem >> >> as we don't allow editing of the mutable array directly, >> >> you ONLY need to publish the NSData >> >> if your mutable class returns a pointer to its mutable ivar... >> >> and tyhe user thinks it gets an immutable data... >> >> because remember that ALL editing should go through methods >> >> we shouldn't allow editing of the array directly!! >> >> then the user might keep that object around thinking it won't change >> >> that would make syncing impossible >> >> when it fact it will change as the sequence is edited >> >> true >> >> 12:20 PM >> your right >> >> The user WILL NOT edit the NSData but will see it changed!!! >> >> true >> >> this was my point!!! Yeah, you got it?? >> >> yep >> >> >> so now how to solve it >> >> basically we need the same approach as NSArray NSMutableArray >> >> There is only one way to solve it: return a true NSData by copying >> it... >> >> how did they solve it then? >> >> that would make it slow >> >> and especially the immutable class is added for performance reasons! >> >> there is another solution >> >> wait...wait... 
>> >> the immutable class don't need to copy it, because it won't change >> >> true >> >> only the mutable class should copy it >> >> yes, >> >> but what you are saying is to have the mutableversion have an >> extra (private) ivar to store the data >> >> to improve performance, like I said in the email, you could still >> retrurn the NSMutableData... >> >> sorry I was following on previous stuff >> >> leaving the other one unused? >> >> let me answer your question >> >> sorry >> >> no, we don't need an extra ivar. >> >> pfew >> >> We just have to be careful inside the class implementation >> >> the ivar can be a NSMutableData for the compiler, but we would in >> fact use NSData for the immutable >> >> of course, if we call a mutability method on it, we get a runtime >> exception, but it should never happen if we a re A BIT careful >> >> i get it >> >> now, back to above. >> >> yes >> >> to improve performance in mutable sequence, like I said in the >> email, you could still retrurn the NSMutableData... >> >> 12:25 PM >> but also have a flag to say: next time we return the data or >> mutate the seq... >> >> we can't use the current object pointed by the ivar. Somebody else >> is using it as NSData >> >> so the flag say: netx time, copy it >> >> if next time never happens, no copy! >> >> hmm, that doesn't sound to nice I think, although functional >> >> performance trick are often anti-good code >> >> true >> >> now, back to the question i had earlier >> >> this is a 'lazy' copy >> >> ok, back to it >> >> how did apple solve the mutable vs immutable code? >> >> in which case? >> >> well, they have NSArray in mutable and immutable form >> >> they use a flag internally. 
I found that on the cocoadev mailinglist >> >> and NSMutableArray seems to be a subclass of NSArray >> >> these are just header tricks >> >> aha >> >> placeholder classes >> >> just like I did with BCSequence and my recent email >> >> you are saying they are using that lazy copy trick >> >> wowowwowowo >> >> lazy copy trick?? >> >> haha >> >> sorry >> >> ah, ok, yes, they are! >> >> sort of >> >> they only make a real copy when it is a mutable instance >> >> it is probably not lazy, though >> >> not sure >> >> so the implementation of '-copy' is >> >> they do a real copy if flag = mutable >> >> otherwise just copy the pointer >> >> aha >> >> that makes sense >> >> and i don't think they defer the real copy in the -copy method >> >> but the situation is different >> >> the user asks for a copy >> >> it should expect to be done immediately >> >> 12:30 PM >> performance can't be great >> >> yes, you are right >> >> if you ask a copy, you know what you are doing >> >> the thing is that I'm afraid copying is out of the question anyway >> >> don't know... >> >> remember, the user wants direct access to the data >> >> serge seemed to say that this is not that expensive >> >> which can be 300Mb >> >> well, yeah >> >> well, then he should use immutable sequences perhaps >> >> but then the user should use the BCSequence methods >> >> yes, use immutable if you don't want to edit >> >> but then the user should use the BCSequence methods to edit >> >> well, in the end it's inevitable >> >> you don't want the data to change underneath >> >> yes! Inevitable is the word >> >> the concept of mutable/immutabnle is more subtle that it seems at >> first >> >> this should be documented this discussion >> >> it is >> >> but coming back to a discussion >> >> we could copy and paste the whole chat? 
>> >> ok, back to the discussion >> >> imagine this to mixed with a discussion about 4 types of sub(sub) >> classes >> >> mutable vs immutable is already difficult enough >> >> well, you read my email? >> >> i truly believe that symbolsets is our typing >> >> it was a bit complicated, no? >> >> yes >> >> too much\ >> >> and remember last time we choose such an approach >> >> yes, I know... >> >> i almost had to phone you to ask how it worked >> >> complicated for the developer does not mean complicated for the >> user necessarily >> >> i don't like omni graffle anymore >> >> true >> >> and complicated the first time does not mean you have to alwasy >> rememeber how it works >> >> 12:35 PM >> if you never have to change it >> >> all fine >> >> anyway, at least you got one of my concern >> >> but the one-sequence-for-all is so much simpler (in interface >> terms) that I think that will pay off big time >> >> probably, yes >> >> i'm willing to give up direct typing >> >> interesting turn of events!! >> >> yes! >> >> the wwdc has certainly created some storm >> >> well I'm really enthusiastic about the nsdata >> >> as storage >> >> it's cool! >> >> hafing typed sequences more for the 'expert' user could be fine too >> >> see my last email >> >> no other biox project has it, but I like the idea >> >> yes, if you could wrap it certainly! >> >> i kind of dropped of there >> >> typed sequence would be like the CFArray >> >> for more advanced user! >> >> but that would imply the typed once being the basis and the >> untyped one the wrapper >> >> and I don't like that too much >> >> the otherway around is fine with me >> >> not necessaarily. 
I actually propose the oppsite, just as >> suggested byu John >> >> yes, exactly totally agree, the other way is better, >> >> as it could live a separate life >> >> read the email >> >> looke at the omnigraffle thingie >> >> I thought to remember that from your email, i did read it >> >> OK, in the grpah, it is really separate, like a plug-in on the side >> >> I didn't get the CFarray analogy therefore >> >> bad analogy >> >> haha >> >> just something more hidden, less needed >> >> This is a better omnigraffle >> >> less friendly >> >> only one arrow >> >> That's good! >> >> and almost straight arrow >> >> but I just didn't feel like thinking too much about it >> >> first, John would like to go for it I guess >> >> 12:40 PM >> the concept is there. The implementation can be wrapper of >> placeholder >> >> and second, it's an add-on/plugin so could wait >> >> wrapper OR placeholder >> >> oups >> >> yes could wait or be there and don't care >> >> perfect, the only thing I would like to see is that it doesn't >> require tricks in the "clean" one-for-all bcsequence system >> >> it does not >> >> perfect! >> >> Well, let's do it >> >> haha >> >> you agree the BCSequence header has ALL the methods? 
>> >> including -complement >> >> if I have just as much time as you, then we have an even bigger >> problem >> >> although I would like to spend a few days if I had time >> >> i'm enthusiastic about the discussions, although heated >> >> yeah, i know about 'time'; >> >> That would be the consequence of the approach >> >> all methods >> >> ok for the methods >> >> I think we should define a limited number of basis methods >> >> let's see what john thinks >> >> the rest will have to be tools >> >> that's the way it is >> >> but think in terms of strider >> >> yes, the tools/method line >> >> I was thinking of all basic editing simple transformations to be >> in bcsequence >> >> yes, we had a discussiona bout that a few months ago >> >> and more complex things like translations, digestions, alignments >> in tools >> >> that still holds >> >> yes, perfect >> >> although a basic translation method could be there as convenience >> method >> >> should we copy paste this whole chat to the mailing list? >> >> i'm not sure >> >> it's quite arbitrary >> >> fine with me >> >> it is arbitrary, yes >> >> 12:45 PM >> I need to get back to work >> >> ok >> >> nice talking to you again, and thanks for making your point clear >> >> I get it now >> >> thanks for listening!! >> >> Have a nice day at work >> >> good night! >> >> I'll copy the discussion to the list >> >> thanks! 
>> >> thanks for the copy, don't make it lazy >> >> Cheers Charles, >> >> speak to you later >> >> cheers >> >> _______________________________________________ >> Biococoa-dev mailing list >> Biococoa-dev at bioinformatics.org >> https://bioinformatics.org/mailman/listinfo/biococoa-dev >> >> >> > > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 E-mail: a.griekspoor at nki.nl AIM: mekentosj at mac.com Web: http://www.mekentosj.com EnzymeX - To cut or not to cut http://www.mekentosj.com/enzymex ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From mek at mekentosj.com Wed Jul 13 10:09:06 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 13 Jul 2005 16:09:06 +0200 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> Message-ID: <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> On 13-jul-2005, at 15:50, Koen van der Drift wrote: > > On Jul 13, 2005, at 9:44 AM, Alexander Griekspoor wrote: > > >> I thought that the idea was to have the NSData be the char array, >> so no separate const char*, that IS the NSData object already. >> So to go over the list >> >>> BCSequence >>> const char *sequence; >>> >> -> is NSData so this one goes away, if you want access to the >> data, you would do: [sequenceData bytes]; which gives you the pointer >> > > Good point. Any preference for the ivar name? I would prefer > sequenceData over sequence for the NSData member. 
Yes, I like sequenceData, or just data would be fine as well. In fact the shorter the better, the fact that you ask a sequence object for its data suggests you will get sequencedata right ? ;-) >>> NSData *sequenceData; >>> NSArray *symbolArray; >>> >> -> this one only if we decide to cache it, otherwise, this would >> be a method only. So what was the consensus again on this? >> > > Did we have one ;-) Still undecided, I guess. My concern is that the caching will potentially increase the size of the object enormously. If we build in the caching in the accessor, even an internal call could invoke caching without the user knowing. This can be pretty dangerous in terms of killing performance and ram requirements. I would propose to have methods to control caching actively, including a method to clear the cache... What do you guys think? Alex ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com Windows vs Mac 65 million years ago, there were more dinosaurs than humans. Where are the dinosaurs now? ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From charles.parnot at gmail.com Wed Jul 13 11:30:21 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 13 Jul 2005 08:30:21 -0700 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> Message-ID: > > My concern is that the caching will potentially increase the size > of the object enormously. If we build in the caching in the > accessor, even an internal call could invoke caching without the > user knowing. This can be pretty dangerous in terms of killing > performance and ram requirements. I would propose to have methods > to control caching actively, including a method to clear the > cache... What do you guys think? > Alex > We should only do the caching for the immutable object. For the mutable one, it does not make much sense. If the user needs the symbol array all the time while changing the sequence string, AND needs performance, the final application is doomed anyway ;-) Then caching on the immutable sequence is trivial. It can be done in the accessor. To avoid memory issues, we could set an arbitrary threshold above which no caching occurs. About the sequenceData ivar (probably better to keep the name a bit longer, 'data' would really be too generic, in my opinion; and who knows, maybe later we will need another 'xxxData' ivar), I would type it as an NSMutableData in the interface. And then, in the init method, create an NSData (we might need just one cast there). This way, we won't get compiler warnings in the methods for the mutable class (otherwise we would have to cast the NSData into an NSMutableData in too many places). Have fun coding! BTW, we have to solve the Xcode 2/2.1 issue.
charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Wed Jul 13 13:26:16 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 13 Jul 2005 13:26:16 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> Message-ID: <9279FB63-CFB1-4853-B535-99617D90AEB0@earthlink.net> On Jul 13, 2005, at 11:30 AM, Charles Parnot wrote: > We should only do the caching for the immutable object. For the > mutable one, it does not make much sense. If the user needs the > symbol array all the time while changing the sequence string, AND > needs performance, the final application is doomed anyway ;-) Agree :) > > Then caching on the immutable sequence is trivial. It can be done > in the accessor. To avoid memory issues, we could set an arbitrary > threshold above which no caching occurs. > > About the sequenceData ivar (probably better to keep the name a bit > longer, 'data' would really be too generic, in my opinion; and who > knows, maybe later we will need another 'xxxData' ivar), I would > type it as an NSMutableData in the interface. And then, in the init > method, create an NSData (we might need just one cast there). This > way, we won't get compiler warnings in the methods for the mutable > class (otherwise we would have to cast the NSData into an > NSMutableData in too many places). I'll try that - I have some more time tonight. > > Have fun coding! BTW, we have to solve the Xcode 2/2.1 issue. I have recently switched to Tiger and Xcode 2.1, so not an issue for me anymore. I think I was the only one not yet on Tiger. - Koen.
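The caching proposal discussed above (cache the symbol array in the accessor of the immutable sequence, skip caching above an arbitrary threshold, and offer a way to clear the cache) can be sketched roughly as follows. This is only an illustration in plain C, which also compiles as Objective-C; the `Sequence` struct, the function names, and the threshold value are all assumptions made for the sketch, not BioCocoa code.

```c
#include <stddef.h>
#include <stdlib.h>

/* Arbitrary cut-off (assumed value): longer sequences are never cached. */
enum { kSymbolCacheThreshold = 16 };

/* Hypothetical stand-in for an immutable BCSequence. */
typedef struct {
    const unsigned char *bytes;   /* what [sequenceData bytes] would give */
    size_t length;
    const char **cachedNames;     /* the "symbol array"; NULL until built */
} Sequence;

/* Illustrative symbol lookup; a real symbol set would replace this. */
static const char *name_for_char(unsigned char c)
{
    switch (c) {
        case 'A': return "adenine";
        case 'C': return "cytosine";
        case 'G': return "guanine";
        case 'T': return "thymine";
        default:  return "unknown";
    }
}

/* Builds a fresh array with one entry per sequence byte. */
static const char **build_names(const Sequence *seq)
{
    const char **names = malloc((seq->length ? seq->length : 1) * sizeof *names);
    if (names == NULL)
        return NULL;
    for (size_t i = 0; i < seq->length; i++)
        names[i] = name_for_char(seq->bytes[i]);
    return names;
}

/* Accessor: caches only when the sequence is short enough.
   *callerMustFree tells the caller whether it owns the returned array. */
const char **sequence_symbol_names(Sequence *seq, int *callerMustFree)
{
    if (seq->length > kSymbolCacheThreshold) {
        *callerMustFree = 1;              /* too big: build, never cache */
        return build_names(seq);
    }
    if (seq->cachedNames == NULL)
        seq->cachedNames = build_names(seq);
    *callerMustFree = 0;                  /* the sequence keeps ownership */
    return seq->cachedNames;
}

/* Explicit cache control, along the lines Alex asked for. */
void sequence_clear_cache(Sequence *seq)
{
    free(seq->cachedNames);
    seq->cachedNames = NULL;
}
```

Calling `sequence_symbol_names` twice on a short sequence returns the same array, while a sequence longer than the threshold gets a fresh array on every call and leaves `cachedNames` untouched.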
From kvddrift at earthlink.net Wed Jul 13 23:00:40 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 13 Jul 2005 23:00:40 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> Message-ID: <74EC00A3-5535-4AD9-BDDD-30268EF60011@earthlink.net> Hi, I have made a first version of the new BCSequence class. This class now contains an NSData ivar that holds the sequence as a char array. Init methods have been adapted to create the NSData. The class also merges BCAbstractSequence and BCSequence towards a one-class only BCSequence structure. I left some notes at the top of BCSequence.m and in some of the changed methods. After BCSequence is finished (will it ever?) a lot of classes need to be adapted to use the NSData and/or symbolArray, as well as updated to reflect the removal of BCSequence subclasses. So dropping the new class in the current project will give many errors. Right now I am using a stripped project where most tools, etc have been removed. Probably the best strategy will be to one by one add these classes back and fix them. You can find the files here: http://home.earthlink.net/~kvddrift/biococoa/ And everything is open to change :-) cheers, - Koen. 
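Since the new class keeps the sequence in a byte buffer, the "lazy copy" idea from the chat log quoted earlier in this thread (hand out the buffer without copying, set a flag, and copy only if a mutation follows) could look roughly like this. It is a sketch in plain C, which also compiles as Objective-C. `MutableSequence`, the `mseq_*` functions and the `handedOut` flag are hypothetical names, and ownership is simplified: in Cocoa, retain counts would decide who releases the old buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a mutable BCSequence backed by a byte buffer. */
typedef struct {
    unsigned char *bytes;
    size_t length;
    int handedOut;   /* set once the buffer was returned as read-only data */
} MutableSequence;

/* Copies the text into a private buffer; returns 0 on allocation failure. */
int mseq_init(MutableSequence *s, const char *text)
{
    size_t n = strlen(text);
    s->bytes = malloc(n ? n : 1);
    if (s->bytes == NULL)
        return 0;
    memcpy(s->bytes, text, n);
    s->length = n;
    s->handedOut = 0;
    return 1;
}

/* Returns the internal buffer without copying and remembers that someone
   else may now be reading it. */
const unsigned char *mseq_data(MutableSequence *s)
{
    s->handedOut = 1;
    return s->bytes;
}

/* Called before every mutation: if the buffer was handed out, continue on
   a private copy so the earlier reader never sees the change.
   Simplification: the old buffer now belongs to whoever called mseq_data. */
static int mseq_detach(MutableSequence *s)
{
    if (!s->handedOut)
        return 1;                         /* nobody else holds it: no copy */
    unsigned char *copy = malloc(s->length ? s->length : 1);
    if (copy == NULL)
        return 0;
    memcpy(copy, s->bytes, s->length);
    s->bytes = copy;
    s->handedOut = 0;
    return 1;
}

/* Example mutation: replace the symbol at index i. */
int mseq_set(MutableSequence *s, size_t i, unsigned char c)
{
    if (i >= s->length || !mseq_detach(s))
        return 0;
    s->bytes[i] = c;
    return 1;
}
```

If the sequence is never edited after `mseq_data` is called, no copy is ever made, which is the point made in the chat ("if next time never happens, no copy"). The price is the extra flag check on every mutating call.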
From kvddrift at earthlink.net Fri Jul 15 22:44:40 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Fri, 15 Jul 2005 22:44:40 -0400 Subject: [Biococoa-dev] Sequence Structure In-Reply-To: <74EC00A3-5535-4AD9-BDDD-30268EF60011@earthlink.net> References: <01ABF565-3641-410E-8951-F83F1BD79995@gmail.com> <9634D6B7-B55C-42C6-9776-64F18CAA2565@mekentosj.com> <067AD5E5-855A-4996-B011-F4AD14EA5EED@gmail.com> <00B70F13-8AF6-423D-BE08-26183CFB0266@earthlink.net> <789C9F87-656A-466B-8C64-B04B0EB48650@mekentosj.com> <74EC00A3-5535-4AD9-BDDD-30268EF60011@earthlink.net> Message-ID: On Jul 13, 2005, at 11:00 PM, Koen van der Drift wrote: > You can find the files here: > > http://home.earthlink.net/~kvddrift/biococoa/ > I updated some more of the code in BCSequence. At the top of the BCSequence.m file is a todo list, so feel free to have a look at that :) Note that this is still on a stripped-down project. The Peptide demo is working, with some small modifications. The Translation demo also works, but I had to comment out the translation and alignment code. I guess these two sections of the code need the most attention. cheers, - Koen. From kvddrift at earthlink.net Sat Jul 16 21:11:16 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sat, 16 Jul 2005 21:11:16 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <5A2F738A-B275-41F3-A368-B8BA684A8C93@earthlink.net> On Jul 6, 2005, at 6:55 PM, John Timmer wrote: >> My preference would be to use the (unsigned) chars. >> > > Okay, that's three of us in agreement now. Is it time to do a > project wide > search for unichar and change it to unsigned char? I have committed this change. The only place in the code where I get a compiler warning because of this is the line symbolString = [[NSString alloc] initWithCharacters: &aChar length: 1]; in BCSymbol.
I changed it to: symbolString = [[NSString alloc] initWithCString: (char *) &aChar length: 1]; It seems to work (no compiler warnings) but I am not sure if this is the right way to go. BTW, since we are already using the symbolChar in BCSymbol, do we still need the symbolString? cheers, - Koen. From kvddrift at earthlink.net Sun Jul 17 07:46:16 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 17 Jul 2005 07:46:16 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <5A2F738A-B275-41F3-A368-B8BA684A8C93@earthlink.net> References: <5A2F738A-B275-41F3-A368-B8BA684A8C93@earthlink.net> Message-ID: On Jul 16, 2005, at 9:11 PM, Koen van der Drift wrote: > I changed it to: > > symbolString = [[NSString alloc] initWithCString: (char *) > &aChar length: 1]; > > It seems to work (no compiler warnings) but I am not sure if this > is the right way to go. Just read in the docs that initWithCString is deprecated in 10.4, so we need to think of something else. The suggested method initWithCString:encoding only works in 10.4, so that's not going to help either. any ideas? cheers, - Koen. From jtimmer at bellatlantic.net Sun Jul 17 08:30:45 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Sun, 17 Jul 2005 08:30:45 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: Message-ID: > > On Jul 16, 2005, at 9:11 PM, Koen van der Drift wrote: > >> I changed it to: >> >> symbolString = [[NSString alloc] initWithCString: (char *) >> &aChar length: 1]; >> >> It seems to work (no compiler warnings) but I am not sure if this >> is the right way to go. > > > Just read in the docs that initWithCString is deprecated in 10.4, so > we need to think of something else. The suggested method > initWithCString:encoding only works in 10.4, so that's not going to > help either. > > any ideas? I think you want "stringWithUTF8String" or some other method containing UTF8. It's the "modern" equivalent of C strings. 
JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Sun Jul 17 08:45:26 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 17 Jul 2005 08:45:26 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: Message-ID: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> On Jul 17, 2005, at 8:30 AM, John Timmer wrote: > I think you want "stringWithUTF8String" or some other method > containing > UTF8. It's the "modern" equivalent of C strings. > Good point, thanks. So we are going with UTF-8 encoding, then? Another issue: of course I should have used unsigned char* instead of unsigned char. But now the following won't work, because the argument for switch is not an int but a pointer: + (id) symbolForChar: (unsigned char * )aSymbol { switch ( aSymbol ) { Any idea how to fix that? Also, I now need to typecast all these lines: From: asparagineRepresentation = [[BCAminoAcid alloc] initWithSymbolChar: 'N']; to asparagineRepresentation = [[BCAminoAcid alloc] initWithSymbolChar: (unsigned char *) 'N']; I guess that is unavoidable? cheers, - Koen. From charles.parnot at gmail.com Sun Jul 17 12:02:42 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Sun, 17 Jul 2005 09:02:42 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> References: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> Message-ID: I don't see why we can't keep simply + (id) symbolForChar: (unsigned char)aSymbol There is a direct correspondence between BCSymbol and char, so a BCSymbol is defined by a char, not a string. I should have a look at the code to help better, but maybe there is some confusion happening when switching between NSString and char* and char...
charles On Jul 17, 2005, at 5:45 AM, Koen van der Drift wrote: > > On Jul 17, 2005, at 8:30 AM, John Timmer wrote: > > >> I think you want "stringWithUTF8String" or some other method >> containing >> UTF8. It's the "modern" equivalent of C strings. >> >> > > Good point, thanks. So we are going with utf8 encoding, then? > > Another issue, of course I should have used unsigned char* instead > of unsigned char. But now the following won't work, because the > argument for switch is not an int but a pointer: > > + (id) symbolForChar: (unsigned char * )aSymbol > { > switch ( aSymbol ) { > > > Any idea how to fix that? > > > Also, I now need to typecast all these lines: > > From: > > asparagineRepresentation = [[BCAminoAcid alloc] > initWithSymbolChar: 'N']; > > to > > asparagineRepresentation = [[BCAminoAcid alloc] > initWithSymbolChar: (unsigned char *) 'N']; > > > I guess that is unavoidable? > > cheers, > > - Koen. > > > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Sun Jul 17 12:56:53 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 17 Jul 2005 12:56:53 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: References: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> Message-ID: <57DAC83C-9A82-4A6C-8342-729C7781DCDE@earthlink.net> On Jul 17, 2005, at 12:02 PM, Charles Parnot wrote: > I don't see why we can't keep simply > > + (id) symbolForChar: (unsigned char)aSymbol > > There is a direct correspondance between BCSymbol and char, so a > BCSymbol is defined by a char, not a string. I should have a look > at the code to help better, but maybe there is some confusion > happpeing when switching between NSString and char* and char... 
> We now use the symbolString to retrieve info from the propertiesDict, eg: symbolInfo = [[[BCAminoAcid aaPropertiesDict] objectForKey: [self symbolString]] copy]; And it is also used in BCCodon. So we might need to keep the symbolString around, otherwise we need to create the NSString on the fly everytime. cheers, - Koen. From charles.parnot at gmail.com Mon Jul 18 11:40:20 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Mon, 18 Jul 2005 08:40:20 -0700 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <57DAC83C-9A82-4A6C-8342-729C7781DCDE@earthlink.net> References: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> <57DAC83C-9A82-4A6C-8342-729C7781DCDE@earthlink.net> Message-ID: <30FE296C-68BE-498F-9E8E-94F091DC2AC8@gmail.com> Sorry, what I meant is we should use + (id) symbolForChar: (unsigned char)aSymbol and not + (id) symbolForChar: (unsigned char *)aSymbol because the latter implies that we are using an array of char for each symbol, and would require us to pass an array of char as argument to the method, instead of just a char. I don't remember all the details of the implementation right now, and I probably should have a look before commenting too much, but I hope you see what I mean :-) BTW, I checked out the code lately, and I have not seen the xcodeproj file in the folder. The project still loads OK (I had to upgrade), but I am not sure I got the right settings, as I had to use the old pbproj file to open the project. charles On Jul 17, 2005, at 9:56 AM, Koen van der Drift wrote: > > > > On Jul 17, 2005, at 12:02 PM, Charles Parnot wrote: > > >> I don't see why we can't keep simply >> >> + (id) symbolForChar: (unsigned char)aSymbol >> >> There is a direct correspondance between BCSymbol and char, so a >> BCSymbol is defined by a char, not a string. I should have a look >> at the code to help better, but maybe there is some confusion >> happpeing when switching between NSString and char* and char... 
>> >> > > > We now use the symbolString to retrieve info from the > propertiesDict, eg: > > symbolInfo = [[[BCAminoAcid aaPropertiesDict] objectForKey: > [self symbolString]] copy]; > > > And it is also used in BCCodon. > > So we might need to keep the symbolString around, otherwise we need > to create the NSString on the fly everytime. > > > cheers, > > - Koen. > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Mon Jul 18 20:26:21 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 18 Jul 2005 20:26:21 -0400 Subject: [Biococoa-dev] NSMutableData vs malloc In-Reply-To: <30FE296C-68BE-498F-9E8E-94F091DC2AC8@gmail.com> References: <81493014-906C-479D-9C79-C010E75FA4E8@earthlink.net> <57DAC83C-9A82-4A6C-8342-729C7781DCDE@earthlink.net> <30FE296C-68BE-498F-9E8E-94F091DC2AC8@gmail.com> Message-ID: <627D0668-36C8-4CAB-8EA9-23F80D0F0EE2@earthlink.net> On Jul 18, 2005, at 11:40 AM, Charles Parnot wrote: > because the latter implies that we are using an array of char for > each symbol, and would require us to pass an array of char as > argument to the method, instead of just a char. You are very right. I have updated the project and also changed the symbolString init to: symbolString = [[NSString alloc] initWithUTF8String: (const char *) &aChar]; as per John's suggestion. cheers, - Koen. 
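Charles's point that a symbol is identified by a single char, not a char *, can be sketched in plain C (which also compiles as Objective-C). A char is an integer type, so it can be used directly in a switch, and a literal such as 'N' needs no cast. This is only an illustration: the function name and the three-entry table are hypothetical and not part of BioCocoa.

```c
#include <stddef.h>

/* Hypothetical helper, not BioCocoa API: maps a one-letter amino acid
 * code to a residue name. The parameter is a plain unsigned char (an
 * integer type), so it can be the switch expression directly. */
const char *residue_name_for_char(unsigned char symbol)
{
    switch (symbol) {
    case 'N': return "asparagine";
    case 'E': return "glutamic acid";
    case 'L': return "leucine";
    default:  return NULL;   /* unknown symbol */
    }
}
```

Calling residue_name_for_char('N') passes the character by value. Declaring the parameter as unsigned char * would instead require the address of a buffer, and a switch on that pointer would not compile.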
From kvddrift at earthlink.net Mon Jul 18 20:30:06 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 18 Jul 2005 20:30:06 -0400 Subject: [Biococoa-dev] linker warnings Message-ID: <7723AEEE-3C33-4AD0-A0DC-09AEDA0B3031@earthlink.net> Hi, Since I upgraded to Tiger and Xcode 2.1 (I skipped 2.0), I always get these linker warnings: warning: The Mac OS X 10.2.8 SDK does not support ZeroLink; disabling it ld: warning prebinding disabled because (__TEXT segment (address = 0x90000000 size = 0x1a7000) of /usr/lib/libSystem.B.dylib overlaps with __TEXT segment (address = 0x90130000 size = 0xb9000) of /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation ld: warning prebinding not disabled even though (__LINKEDIT segment (address = 0x901a7000 size = 0x56764) of /usr/lib/libSystem.B.dylib overlaps with __TEXT segment (address = 0x90130000 size = 0xb9000) of /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation on the assumption that the stripped output will not overlap ld: warning prebinding not disabled even though (__LINKEDIT segment (address = 0x901a7000 size = 0x56764) of /usr/lib/libSystem.B.dylib overlaps with __LINKEDIT segment (address = 0x901e9000 size = 0x1186c) of /System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation on the assumption that the stripped output will not overlap I guess these are harmless, but does anyone know how to get rid of them? thanks, - Koen. 
From mek at mekentosj.com Tue Jul 19 03:43:03 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Tue, 19 Jul 2005 09:43:03 +0200 Subject: [Biococoa-dev] linker warnings In-Reply-To: <7723AEEE-3C33-4AD0-A0DC-09AEDA0B3031@earthlink.net> References: <7723AEEE-3C33-4AD0-A0DC-09AEDA0B3031@earthlink.net> Message-ID: ZeroLink was a development feature introduced in the Panther development tools. The idea of the SDKs is to simulate working on a machine running that particular OS with its particular feature set. As you have selected the 10.2.8 SDK, you can't have ZeroLink selected (it wasn't there in Jaguar), so it is disabled. You can remove the warning by disabling ZeroLink yourself in the build settings. In Xcode 2 they put much more effort into making the SDKs reliably build a backward-compatible app; this is one of the improvements they made. Cheers, Alex ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 AIM: mekentosj at mac.com E-mail: a.griekspoor at nki.nl Web: http://www.mekentosj.com LabAssistant - Get your life organized! http://www.mekentosj.com/labassistant ********************************************************* 
From kvddrift at earthlink.net Tue Jul 19 06:49:22 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 06:49:22 -0400 Subject: [Biococoa-dev] initWithUTF8String not working properly Message-ID: <160B07E6-35D4-4245-9336-FF04A1210D24@earthlink.net> On Jul 18, 2005, at 8:26 PM, Koen van der Drift wrote: > I have updated the project and also changed the symbolString init to: > > symbolString = [[NSString alloc] initWithUTF8String: > (const char *) &aChar]; Hmm - this line now sometimes returns nil when initializing the symbols, causing a crash later on. I am out of ideas right now (and off to work); does anyone have a chance to look into it? thanks, - Koen. 
From jtimmer at bellatlantic.net Tue Jul 19 07:53:25 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Tue, 19 Jul 2005 07:53:25 -0400 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: <160B07E6-35D4-4245-9336-FF04A1210D24@earthlink.net> Message-ID: > On Jul 18, 2005, at 8:26 PM, Koen van der Drift wrote: > >> I have updated the project and also changed the symbolString init to: >> >> symbolString = [[NSString alloc] initWithUTF8String: >> (const char *) &aChar]; > > > Hmm - this line now sometimes returns nil when initializing the > symbols causing a crash later on. I am out of ideas right now (and > off to work) , anyone has a chance to look into it? Ah, you're handing it a character, not a zero-terminated string. Try "initWithBytes:length:encoding" using " NSUTF8StringEncoding" JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Tue Jul 19 17:31:03 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 17:31:03 -0400 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: References: Message-ID: On Jul 19, 2005, at 7:53 AM, John Timmer wrote: > Ah, you're handing it a character, not a zero-terminated string. Try > "initWithBytes:length:encoding" using " NSUTF8StringEncoding" > I tried that too, but that gives me the following warning: warning: no '-initWithBytes:length:encoding:' method found warning: (Messages without a matching method signature warning: will be assumed to return 'id' and accept warning: '...' as arguments.) symbolString = [[NSString alloc] initWithBytes: aChar length: sizeof (unsigned char) encoding: NSUTF8StringEncoding]; cheers, - Koen. 
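John's diagnosis can be made concrete with a small C sketch. A function that expects a zero-terminated string keeps reading past a lone char until it happens to find a 0 byte, so the single character first has to be copied into a two-byte buffer with an explicit terminator. The helper name below is hypothetical and only illustrates the idea.

```c
/* Hypothetical helper: writes the one-character C string for `symbol`
 * into `out`, which must have room for the character plus the
 * terminating 0 byte. */
void make_c_string(unsigned char symbol, char out[2])
{
    out[0] = (char)symbol;
    out[1] = '\0';   /* without this, string functions read past the char */
}
```

A buffer filled this way is safe to hand to any API that expects a C string, whereas the address of a bare char variable is not.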
From charles.parnot at gmail.com Tue Jul 19 17:39:10 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 19 Jul 2005 14:39:10 -0700 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: References: Message-ID: > > symbolString = [[NSString alloc] initWithBytes: aChar length: > sizeof(unsigned char) encoding: NSUTF8StringEncoding]; > > > cheers, > > - Koen. Don't know about the warning, but the code should be symbolString = [[NSString alloc] initWithBytes:&aChar length: sizeof (unsigned char) encoding: NSUTF8StringEncoding]; Actually, maybe the warning will go, as the method signature will then be different. charles NB: and you could safely go with: symbolString = [[NSString alloc] initWithBytes:&aChar length:1 encoding:NSUTF8StringEncoding]; -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 19 18:19:42 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 18:19:42 -0400 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: References: Message-ID: <90F2E1C8-B132-46C3-94F7-F560BAA866E5@earthlink.net> On Jul 19, 2005, at 5:39 PM, Charles Parnot wrote: > Don't know about the warning, but the code should be > > symbolString = [[NSString alloc] initWithBytes:&aChar length: > sizeof(unsigned char) encoding: NSUTF8StringEncoding]; > > Actually, maybe the warning will go, as the method signature will > then be different. > > charles > > NB: and you could safely go with: > > symbolString = [[NSString alloc] initWithBytes:&aChar length:1 > encoding:NSUTF8StringEncoding]; > Nope, I still get the same warning. Maybe has this to do with the SDK we are using? Anyway, the crash is gone now, so that's good news! cheers, - Koen. 
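The difference between passing aChar and &aChar to a bytes-plus-length call can also be sketched in plain C. The function below is a hypothetical stand-in for that style of API, not the Foundation method itself: it takes a pointer and a length and never looks for a terminator, so the address of a single char with length 1 is a valid argument.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a "bytes + length" initializer: copies
 * exactly `length` bytes from `bytes` into `dst` and terminates the
 * result. Returns 0 on success, -1 if `dst` is too small. */
int copy_bytes_as_string(const void *bytes, size_t length,
                         char *dst, size_t dst_size)
{
    if (dst_size == 0 || length > dst_size - 1)
        return -1;
    memcpy(dst, bytes, length);   /* reads only `length` bytes */
    dst[length] = '\0';
    return 0;
}
```

With unsigned char aChar = 'N', the call copy_bytes_as_string(&aChar, 1, buf, sizeof buf) produces the string "N". Passing aChar itself would hand over the character's numeric value where an address is expected.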
From charles.parnot at gmail.com Tue Jul 19 18:55:13 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 19 Jul 2005 15:55:13 -0700 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: <90F2E1C8-B132-46C3-94F7-F560BAA866E5@earthlink.net> References: <90F2E1C8-B132-46C3-94F7-F560BAA866E5@earthlink.net> Message-ID: You could also try a cast: symbolString = [[NSString alloc] initWithBytes:(const void*)&aChar length:1 encoding:NSUTF8StringEncoding]; charles NB: you probably also had a crash or weird thing when you used aChar instead of aChar, right? On Jul 19, 2005, at 3:19 PM, Koen van der Drift wrote: > > On Jul 19, 2005, at 5:39 PM, Charles Parnot wrote: > > >> Don't know about the warning, but the code should be >> >> symbolString = [[NSString alloc] initWithBytes:&aChar length: >> sizeof(unsigned char) encoding: NSUTF8StringEncoding]; >> >> Actually, maybe the warning will go, as the method signature will >> then be different. >> >> charles >> >> NB: and you could safely go with: >> >> symbolString = [[NSString alloc] initWithBytes:&aChar length:1 >> encoding:NSUTF8StringEncoding]; >> >> > > > > Nope, I still get the same warning. Maybe has this to do with the > SDK we are using? > > Anyway, the crash is gone now, so that's good news! > > > cheers, > > > - Koen. > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 19 18:57:59 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 18:57:59 -0400 Subject: [Biococoa-dev] NSData Message-ID: Hi, During the conversion to NSData, I ran into a couple of questions. 1. Should the symbolChar in BCSymbol also be an NSData ivar? If not, I guess we need to use malloc/free etc. 2. Why are we using utf8 and not ascii encoding? utf8 seems to be for unicode while we only need single characters. 3. 
I am still not sure if we should stick with the symbolString in BCSymbol. Any opinions on that one? 4. Slightly unrelated, but I'll ask it here anyway because I noticed it while implementing the NSData. In the initBases method in BCNucleotideDNA/RNA I see many lines such as: [baseDefinitions removeObjectForKey: @"A"]; I am not sure if I understand what the idea behind the removal of the key-value pair is. I will probably find more issues and post them here :) BTW, has anyone had a chance to look at the code I posted at http://home.earthlink.net/~kvddrift/biococoa/ ? Some feedback would be nice, so I can continue making changes to my local project and then commit them to cvs eventually. Do you guys think I am on the right track? Let me know what you think. cheers, - Koen. From kvddrift at earthlink.net Tue Jul 19 18:58:54 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 18:58:54 -0400 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: References: <90F2E1C8-B132-46C3-94F7-F560BAA866E5@earthlink.net> Message-ID: On Jul 19, 2005, at 6:55 PM, Charles Parnot wrote: > NB: you probably also had a crash or weird thing when you used > aChar instead of aChar, right? > You mean aChar instead of &aChar? Then the answer is yes ;-) - Koen. From charles.parnot at gmail.com Tue Jul 19 19:25:09 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 19 Jul 2005 16:25:09 -0700 Subject: [Biococoa-dev] NSData In-Reply-To: References: Message-ID: <770BF2E7-3635-41A9-9387-BD9989A2A127@gmail.com> On Jul 19, 2005, at 3:57 PM, Koen van der Drift wrote: > Hi, > > During the conversion to NSData, I ran into a couple of questions. > > 1. Should the symbolChar in BCSymbol also be an NSData ivar? If > not, I guess we need to use malloc/free etc. The symbolChar should just be one char, not a buffer of one char. The ivar should be: ... char symbolChar; ... 
It seems there is some confusion when you go from NSString to string (buffer of chars terminated with a 0) to char. The code is not that different from before, and I don't think you need to change much stuff in BCSymbol. > 2. Why are we using utf8 and not ascii encoding? utf8 seems to be > for unicode while we only need single characters. It is basically the same thing. UTF-8 encodes ASCII characters in one byte each. I am not sure what the rules are to encode other Unicode characters, but the important thing is: if you feed an NSString with a buffer of chars, and use UTF-8 encoding, you will use the one-byte ASCII characters, and when you convert back to chars, you will get them back. Cocoa used to treat C strings specially, but now it is recommended to treat them like other strings, with that particular encoding. It makes things more consistent with the rest. > 3. I am still not sure if we should stick with the symbolString in > BCSymbol. Any opinions on that one? We have it, we should keep it. It could be useful if the symbolString is ever different from the symbolChar. One example could be codons, though it is not clear how feasible it is to make codons into symbols. Anyway, one could decide to have symbolChar be used as an int (e.g. 0-63), and then have the symbolString return the actual codon. > 4. Slightly unrelated, but I'll ask it here anyway because I noticed > it while implementing the NSData. In the initBases method in > BCNucleotideDNA/RNA I see many lines such as: > > [baseDefinitions removeObjectForKey: @"A"]; > > I am not sure if I understand what the idea behind the removal of > the key-value pair is. don't know... Sorry! > > I will probably find more issues and post them here :) yes! > BTW, has anyone had a chance to look at the code I posted at > http://home.earthlink.net/~kvddrift/biococoa/ ? Some feedback would be > nice, so I can continue making changes to my local project and then > commit them to cvs eventually. 
Do you guys think I am on the right > track, let me know what you think. I have not, but I will try to. One nice thing with CVS is you can see what changed, and you can always revert back to an old state. So I guess you should use CVS? If some source files don't compile, you can always temporarily uncheck the 'Target' box so they are not included in the build. Thanks for doing all the hard work! charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From charles.parnot at gmail.com Tue Jul 19 19:28:57 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 19 Jul 2005 16:28:57 -0700 Subject: [Biococoa-dev] initWithUTF8String not working properly In-Reply-To: References: <90F2E1C8-B132-46C3-94F7-F560BAA866E5@earthlink.net> Message-ID: Yes, this is what I meant!! Anyway, did you try the (const void*)&aChar cast? charles On Jul 19, 2005, at 3:58 PM, Koen van der Drift wrote: > > On Jul 19, 2005, at 6:55 PM, Charles Parnot wrote: > > >> NB: you probably also had a crash or weird thing when you used >> aChar instead of aChar, right? >> >> > > > You mean aChar instead of &aChar? Then the answer is yes ;-) > > - Koen. > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 19 22:08:27 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 22:08:27 -0400 Subject: [Biococoa-dev] NSData In-Reply-To: <770BF2E7-3635-41A9-9387-BD9989A2A127@gmail.com> References: <770BF2E7-3635-41A9-9387-BD9989A2A127@gmail.com> Message-ID: On Jul 19, 2005, at 7:25 PM, Charles Parnot wrote: > > On Jul 19, 2005, at 3:57 PM, Koen van der Drift wrote: > > >> Hi, >> >> During the conversion to NSData, I ran into a couple of questions. >> >> 1. Should the symbolChar in BCSymbol also be an NSData ivar? 
If >> not, I guess we need to use malloc/free etc. >> > > The symbolChar should just be one char, not a buffer of one char. > The ivar should be: > ... > char symbolChar; I think you meant: unsigned char symbolChar; Can we then in the initWithChar method just use: symbolChar = aChar; ? >> >> I will probably find more issues and post them here :) >> > > yes! OK, here we go :) I am now looking into updating the translation code. This is not my field, so I hope I do it right, and therefore ask here before I really screw up. 1. Should we keep the BCSequenceCodon subclass? 2. The translation code now uses symbolArrays. Should we keep this, or do we need to rewrite these classes so that they use the char array in NSData? > One nice thing with CVS is you can see what changed, and you can > always revert back to an old state. So I guess you should use CVS? > If some source files don't compile, you can always temporarily > uncheck the 'Target' box so they are not included in the build. I rather not commit them to CVS until the classes are more finalized. But if no-one objects, I will commit the updated files. Or I can use a CVS branch. Anyone knows how to use branches within Xcode? cheers, - Koen. From kvddrift at earthlink.net Tue Jul 19 22:23:10 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 22:23:10 -0400 Subject: [Biococoa-dev] linker warnings In-Reply-To: References: <7723AEEE-3C33-4AD0-A0DC-09AEDA0B3031@earthlink.net> Message-ID: <7882CA33-365F-4E7A-A916-0F2A4FA4CE44@earthlink.net> On Jul 19, 2005, at 3:43 AM, Alexander Griekspoor wrote: > Zerolink was a development feature introduced in the Panther > development tools, the idea of the SDKs is to simulate as if you > are working on a machine running that particular OS with its > particular feature set. As you have selected the 10.2.8 SDK, you > can't have zerolink selected (it wasn't there in Jaguar) and it is > disabled. 
You can remove the warning by disabling zerolinking > yourself in the build settings. In XCode 2 they did much more > effort to make the SDKs more reliably build a downwards compatible > app, this is one of the improvements they made. As far as I can tell, ZeroLink is turned off in the project. The other warnings are caused by using gcc 4 with the 10.2.8 SDK: http://www.cocoabuilder.com/archive/message/cocoa/2005/6/21/139501 cheers - Koen. From charles.parnot at gmail.com Tue Jul 19 22:53:47 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Tue, 19 Jul 2005 19:53:47 -0700 Subject: [Biococoa-dev] NSData In-Reply-To: References: <770BF2E7-3635-41A9-9387-BD9989A2A127@gmail.com> Message-ID: <960AF92B-2C48-4899-9539-30BAD3AA3461@gmail.com> >> The symbolChar should just be one char, not a buffer of one char. >> The ivar should be: >> ... >> char symbolChar; >> > > I think you meant: > > unsigned char symbolChar; I don't think it makes a lot of difference in the end, as long as we are consistent. > Can we then in the initWithChar method just use: > > symbolChar = aChar; ? Yes!! char is just like int, or float, or BOOL. It just happens to use only one byte. You can pass it around, and it is actually lighter than a pointer. > OK, here we go :) > > I am now looking into updating the translation code. This is not my > field, so I hope I do it right, and therefore ask here before I > really screw up. > > 1. Should we keep the BCSequenceCodon subclass? > > 2. The translation code now uses symbolArrays. Should we keep this, > or do we need to rewrite these classes so that they use the char > array in NSData? The whole translation stuff was set up by John in one flight over the Atlantic (or was it the Channel?). It needed some work anyway, but the disappearance of BCSequenceCodon complicates things quite a bit. We could keep a private BCSequenceCodon class for now, or move the relevant code to the translation tool. 
I like the idea of a BCSequenceCodon type, though. Well, some more debate to come! >> One nice thing with CVS is you can see what changed, and you can >> always revert back to an old state. So I guess you should use CVS? >> If some source files don't compile, you can always temporarily >> uncheck the 'Target' box so they are not included in the build. >> > > > I rather not commit them to CVS until the classes are more > finalized. But if no-one objects, I will commit the updated files. > Or I can use a CVS branch. Anyone knows how to use branches within > Xcode? I don't know how to use branches in CVS, I seem to recall it is a bit tricky and I don't think there is really a need for that. Nobody is working much on the code right now, except you, and things can always be put back the way they were (it is very easy with CVL to look at past versions of each file, and get a piece of the old code back if necessary; it is also possible to just go back in time for the whole project). It is not like we are developing a real separate branch. Also, if you commit changes, we can more easily look at it, and even make minor corrections. This is really the CVS goal, and we should just use that tool and take advantage of it. I see no problem if you commit these changes, as they are needed and there was a decision to go forward. cheers, charles -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Tue Jul 19 23:19:23 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Tue, 19 Jul 2005 23:19:23 -0400 Subject: [Biococoa-dev] updated project in cvs Message-ID: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> Hi, I took the plunge and have committed a lot of changes to the project. These changes now incorporate: 1. The addition of NSData to hold the sequenceData 2. 
Merging of BCSequence and BCAbstractSequence and removal of sequence subclasses The new way to create a sequence is: BCSequence *mySequence = [[BCSequence alloc] initWithString: @"ELVISLIVES" symbolSet: [BCSymbolSet proteinSymbolSet]]; The project compiles, with a few warnings, and the examples seem to work. There are of course still some parts that need attention. The most important are: 1. Start using the NSData in more places to replace the symbolArray. A good example is the translation code, but also in other sections. BCSequenceCodon is still in the project. 2. Implement the "type-guessing" code in BCSequence. See my notes in initWithString in BCSymbol. For now the examples use fasta files, and the code cannot yet see if it is a protein or dna. Make sure to un- comment the correct symbolSet in readFastaFile. 3. Implementation of a mutable sequence class 4. Update the documentation 5. Update the Test code cheers, - Koen. From mek at mekentosj.com Wed Jul 20 03:26:26 2005 From: mek at mekentosj.com (Alexander Griekspoor) Date: Wed, 20 Jul 2005 09:26:26 +0200 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> Message-ID: <3E9474E1-0560-4533-84B4-7D0FF1F9F7A6@mekentosj.com> Wooohooo!!! Great work Koen, I hope to have a good look at it before I leave for holidays on friday, well done! Cheers, Alex On 20-jul-2005, at 5:19, Koen van der Drift wrote: > Hi, > > I took the plunge and have committed a lot of changes to the > project. These changes now incorporate: > > 1. The addition of NSData to hold the sequenceData > > 2. 
Merging of BCSequence and BCAbstractSequence and removal of > sequence subclasses > > > The new way to create a sequence is: > > BCSequence *mySequence = [[BCSequence alloc] initWithString: > @"ELVISLIVES" symbolSet: [BCSymbolSet proteinSymbolSet]]; > > > The project compiles, with a few warnings, and the examples seem to > work. There are of course still some parts that need attention. The > most important are: > > 1. Start using the NSData in more places to replace the > symbolArray. A good example is the translation code, but also in > other sections. BCSequenceCodon is still in the project. > > 2. Implement the "type-guessing" code in BCSequence. See my notes > in initWithString in BCSymbol. For now the examples use fasta > files, and the code cannot yet see if it is a protein or dna. Make > sure to un-comment the correct symbolSet in readFastaFile. > > 3. Implementation of a mutable sequence class > > 4. Update the documentation > > 5. Update the Test code > > > > cheers, > > - Koen. > _______________________________________________ > Biococoa-dev mailing list > Biococoa-dev at bioinformatics.org > https://bioinformatics.org/mailman/listinfo/biococoa-dev > > ********************************************************* ** Alexander Griekspoor ** ********************************************************* The Netherlands Cancer Institute Department of Tumorbiology (H4) Plesmanlaan 121, 1066 CX, Amsterdam Tel: + 31 20 - 512 2023 Fax: + 31 20 - 512 2029 E-mail: a.griekspoor at nki.nl AIM: mekentosj at mac.com Web: http://www.mekentosj.com EnzymeX - To cut or not to cut http://www.mekentosj.com/enzymex ********************************************************* -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From kvddrift at earthlink.net Wed Jul 20 06:42:00 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 20 Jul 2005 06:42:00 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <3E9474E1-0560-4533-84B4-7D0FF1F9F7A6@mekentosj.com> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <3E9474E1-0560-4533-84B4-7D0FF1F9F7A6@mekentosj.com> Message-ID: <84C01ACE-9707-4EB6-ABE3-9D371442180B@earthlink.net> On Jul 20, 2005, at 3:26 AM, Alexander Griekspoor wrote: > I hope to have a good look at it before I leave for holidays on > friday, well done! One thing occurred to me. I upgraded to Xcode 2.1, but the updated project is not yet in CVS. So you may have to remove the BCSequence subclasses (except BCSequenceCodon) manually from your local project. - Koen. From jtimmer at bellatlantic.net Wed Jul 20 13:54:18 2005 From: jtimmer at bellatlantic.net (John Timmer) Date: Wed, 20 Jul 2005 13:54:18 -0400 Subject: [Biococoa-dev] NSData In-Reply-To: <960AF92B-2C48-4899-9539-30BAD3AA3461@gmail.com> Message-ID: > The whole translation stuff was set up by John in one flight over the > atlantic (or was it the Channel?). It needed some work anyway, but > the disappearance of BCSequenceCodon complicates things quite a bit. > We could keep a private BCSequenceCodon class for now, or move the > relevant code to the translation tool. I like the idea of a > BCSequenceCodon type, though. Well, some more debate to come! It was definitely transatlantic - I couldn't type fast enough to do it across the channel, much less think fast enough. Some of the design decision behind it: I didn't want to initially make a codon class, since they're a bit odd (a sequence and two types of symbols all wrapped in a single class), but Alex convinced me they'd be a good idea. In the end, I got them to work reasonably efficiently, and they do encapsulate the codon matching pretty well and the ambiguous symbols handle the wobble base very cleanly. 
Also, having them encapsulate symbols and contain sequences meant that many of the optimizations we made on these classes sped up translation "for free", and translation in turn made a nice test case to identify slow code in these classes. Once we had codons, which represent an intermediate in translation, a codon sequence to represent the intermediate state seemed like an obvious choice. This allows you to retain the results of a translation and extract different bits of information (how many ORFs? What's the longest? Give me its description for debugging purposes) from it without re-translating every time. This didn't seem like a waste of memory, since anyone not caring about the translation intermediary would probably just grab the resulting protein sequence anyway and dispose of the codon sequence. As structured, however, I don't think it cached translations when you did things like switch reading frames or genetic codes - adding that feature would make it even more useful (you can ask it more questions, like what's the longest ORF in any reading frame? Could this be a mitochondrial gene?), although it would require it to wrap multiple sequences, rather than being a sequence object. I might have tried adding this to the tool class, but I can't remember. Incidentally, regarding the deletion of symbols from the initial .plist dictionary you asked about: The idea is that a user could add a custom symbol (say they were playing with synthetic nucleotides) simply by adding the information to the .plist file, rather than by coding it. Instead of being present as a singleton (since we're not adding code), we retained a singleton dictionary based on the .plist file. Any non-standard symbol can be retrieved from that at any time (there's a method for it), so they're effectively singletons as well. The symbol deletions are simply getting rid of the standard bases from this dictionary. 
This should cut down memory use a tiny bit, and should also speed the lookup of custom symbols, by having many fewer key/value pairs in the dictionary. I have no idea if it really does help in a significant way, so it's a bit of a premature optimization, but it was one line of copy/paste code, so it seemed like a reasonable choice. JT _______________________________________________ This mind intentionally left blank From kvddrift at earthlink.net Wed Jul 20 18:47:13 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 20 Jul 2005 18:47:13 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> Message-ID: <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> On Jul 19, 2005, at 11:19 PM, Koen van der Drift wrote: > 2. Implement the "type-guessing" code in BCSequence. See my notes > in initWithString in BCSymbol. For now the examples use fasta > files, and the code cannot yet see if it is a protein or dna. Make > sure to un-comment the correct symbolSet in readFastaFile. This is now fixed in cvs. - Koen. From kvddrift at earthlink.net Wed Jul 20 19:14:50 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 20 Jul 2005 19:14:50 -0400 Subject: [Biococoa-dev] sequence types Message-ID: <44E3260F-C2CA-4897-BCE8-725148B22F50@earthlink.net> Hi, I would like to suggest that we rename the sequence type constants from BCDNASequence to BCSequenceTypeDNA, etc. Although slightly longer, I find it more descriptive and easier to read. any objections? cheers, - Koen. 
From charles.parnot at gmail.com Wed Jul 20 19:21:38 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Wed, 20 Jul 2005 16:21:38 -0700 Subject: [Biococoa-dev] sequence types In-Reply-To: <44E3260F-C2CA-4897-BCE8-725148B22F50@earthlink.net> References: <44E3260F-C2CA-4897-BCE8-725148B22F50@earthlink.net> Message-ID: <05FE6A7A-A759-4D0F-83E1-5BD50717BB97@gmail.com> agreed! On Jul 20, 2005, at 4:14 PM, Koen van der Drift wrote: > Hi, > > I would like to suggest that we rename the sequence type constants > from BCDNASequence to BCSequenceTypeDNA, etc. Although slightly > longer, I find it more descriptive and easier to read. > > any objections? > > > cheers, > > - Koen. > -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Wed Jul 20 19:26:26 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Wed, 20 Jul 2005 19:26:26 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> Message-ID: On Jul 20, 2005, at 6:47 PM, Koen van der Drift wrote: >> 2. Implement the "type-guessing" code in BCSequence. See my notes >> in initWithString in BCSymbol. For now the examples use fasta >> files, and the code cannot yet see if it is a protein or dna. Make >> sure to un-comment the correct symbolSet in readFastaFile. >> > > > This is now fixed in cvs. > Well, it worked for the translation demo. However, when I start the Peptide demo, it creates a DNA anyway. Charles, I think you wrote the original guess code. Would you mind having a look at it ? cheers, - Koen.
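For readers following the thread, here is a rough sketch of what a "type-guessing" heuristic can look like, written in plain C rather than the framework's Objective-C. This is not the BioCocoa guess code, which is not shown in the thread: the function name bc_guess_type, the enum names, the letter set "ACGTUN" and the 90% cutoff are all assumptions made for illustration.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; these are not the BioCocoa sequence type constants. */
typedef enum { GUESS_UNKNOWN, GUESS_DNA, GUESS_PROTEIN } guess_t;

/* Guess the sequence type from the fraction of letters that are
   nucleotide codes. `seq` must be a NUL-terminated C string. */
guess_t bc_guess_type(const char *seq)
{
    size_t letters = 0, nucleotides = 0;

    for (const char *p = seq; *p != '\0'; ++p) {
        unsigned char c = (unsigned char)*p;
        if (!isalpha(c))
            continue;                 /* skip digits, gaps, whitespace */
        ++letters;
        if (strchr("ACGTUN", toupper(c)) != NULL)
            ++nucleotides;
    }
    if (letters == 0)
        return GUESS_UNKNOWN;

    /* Assumed cutoff: at least 90% nucleotide letters means DNA/RNA. */
    return (nucleotides * 10 >= letters * 9) ? GUESS_DNA : GUESS_PROTEIN;
}
```

With this cutoff a peptide such as "MKVLAAGIVGLLLAQ" comes out as protein, because only a third of its letters are also nucleotide codes. A short peptide made up mostly of A, C, G and T would still be misclassified, which is one reason a guess like this is usually paired with a way for the caller to override it.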
From kvddrift at earthlink.net Thu Jul 21 22:16:32 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Thu, 21 Jul 2005 22:16:32 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> Message-ID: <230CDA50-7B64-4476-B7B4-5CDBA67B84B7@earthlink.net> On Jul 20, 2005, at 7:26 PM, Koen van der Drift wrote: > Well, it worked for the translation demo. However, when I start the > Peptide demo, it creates a DNA anyway. Charles, I think you wrote > the original guess code. Would you mind having a look at it ? > I got a bit further with it, but it is still not working as it should. I added a new method to BCSymbolSet, stringByRemovingUnknownCharsFromString that should filter a string and remove all characters that are not in the symbolSet. Because BCSymbolSet has a method containsSymbol, I need to convert each character in the string to a unsigned char, then get the symbol, and then do the test. That's all fine, and the guessing code now works as it should for the Peptides demo. However, for the Translation demo, it crashes at the line [result appendString: [NSString stringWithUTF8String: bytes]]; Anyone has an idea what I am doing wrong here? Maybe I should add all the bytes together, and then convert it back to an NSString? Not sure how to do that, though. thanks, - Koen. From charles.parnot at gmail.com Fri Jul 22 01:41:55 2005 From: charles.parnot at gmail.com (Charles Parnot) Date: Thu, 21 Jul 2005 22:41:55 -0700 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <230CDA50-7B64-4476-B7B4-5CDBA67B84B7@earthlink.net> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> <230CDA50-7B64-4476-B7B4-5CDBA67B84B7@earthlink.net> Message-ID: <7991300C-9057-4836-AB1B-A58898DE95EA@gmail.com> A wild guess: bytes is not terminated by a \000 char. 
Try 'printf ( "%s", bytes );' and see if it crashes too. You should then instead use stringWithBytes:length:encoding charles On Jul 21, 2005, at 7:16 PM, Koen van der Drift wrote: > > On Jul 20, 2005, at 7:26 PM, Koen van der Drift wrote: > > >> Well, it worked for the translation demo. However, when I start >> the Peptide demo, it creates a DNA anyway. Charles, I think you >> wrote the original guess code. Would you mind having a look at it ? >> >> > > I got a bit further with it, but it is still not working as it > should. I added a new method to BCSymbolSet, > stringByRemovingUnknownCharsFromString that should filter a string > and remove all characters that are not in the symbolSet. Because > BCSymbolSet has a method containsSymbol, I need to convert each > character in the string to a unsigned char, then get the symbol, > and then do the test. That's all fine, and the guessing code now > works as it should for the Peptides demo. However, for the > Translation demo, it crashes at the line > > [result appendString: [NSString stringWithUTF8String: bytes]]; > > > Anyone has an idea what I am doing wrong here? Maybe I should add > all the bytes together, and then convert it back to an NSString? > Not sure how to do that, though. > > > thanks, > > - Koen.
> -- Xgrid-at-Stanford Help science move fast forward: http://cmgm.stanford.edu/~cparnot/xgrid-stanford Charles Parnot charles.parnot at gmail.com From kvddrift at earthlink.net Fri Jul 22 17:49:30 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Fri, 22 Jul 2005 17:49:30 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <7991300C-9057-4836-AB1B-A58898DE95EA@gmail.com> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> <230CDA50-7B64-4476-B7B4-5CDBA67B84B7@earthlink.net> <7991300C-9057-4836-AB1B-A58898DE95EA@gmail.com> Message-ID: <2FDEC4C0-E6FF-4F4D-94AC-BA743DE2422A@earthlink.net> On Jul 22, 2005, at 1:41 AM, Charles Parnot wrote: > You should then instead use stringWithBytes:length:encoding > That method doesn't exist in Cocoa. However, initWithBytes does, and that seems to work. Thanks for pointing that out. I actually added a category method for stringWithBytes to BCStringUtils, but the compiler doesn't see it, so I added the initWithBytes directly in the code. I also discovered the dataUsingEncoding method which directly converts an NSString to an NSData object. Anyway, everything seems to be working (I cross my fingers!). The NSData and sequence type guessing code probably needs some clean up and optimization, but I'll leave that as an exercise for the reader ;-) cheers, - Koen. 
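The crash discussed above comes down to a buffer that is read as a C string without a terminating \000. A minimal sketch of the safer pattern, in plain C: filter the bytes into a buffer and carry an explicit length instead of relying on a terminator. The function name bc_filter_bytes and the use of a plain string as the "allowed" set are assumptions for illustration; the real stringByRemovingUnknownCharsFromString works on BCSymbolSet and is not shown in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Copy only the bytes of `in` that occur in `allowed` into `out`.
   Returns the number of bytes written. `out` must have room for
   in_len bytes. No terminator is written, so the caller has to
   keep the returned length alongside the buffer. */
size_t bc_filter_bytes(const char *in, size_t in_len,
                       const char *allowed, char *out)
{
    size_t n = 0;

    for (size_t i = 0; i < in_len; ++i) {
        /* strchr would also match the terminator of `allowed`,
           so NUL input bytes are rejected explicitly. */
        if (in[i] != '\0' && strchr(allowed, in[i]) != NULL)
            out[n++] = in[i];
    }
    return n;
}
```

The result can be printed safely with a length-limited format such as printf("%.*s", (int)n, out), whereas printf("%s", out) would read past the end of the data. On the Cocoa side the length-taking counterpart is NSString's initWithBytes:length:encoding:, which is what ended up in the code.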
From kvddrift at earthlink.net Fri Jul 22 18:26:29 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Fri, 22 Jul 2005 18:26:29 -0400 Subject: [Biococoa-dev] updated project in cvs In-Reply-To: <2FDEC4C0-E6FF-4F4D-94AC-BA743DE2422A@earthlink.net> References: <6B4EABE0-8ACE-4505-AD52-D511FBAC5AE0@earthlink.net> <8B02C750-87EA-49D1-93DE-72887D3F451E@earthlink.net> <230CDA50-7B64-4476-B7B4-5CDBA67B84B7@earthlink.net> <7991300C-9057-4836-AB1B-A58898DE95EA@gmail.com> <2FDEC4C0-E6FF-4F4D-94AC-BA743DE2422A@earthlink.net> Message-ID: <06BE373F-5C5E-4C34-BA60-F766B1E4FBA8@earthlink.net> On Jul 22, 2005, at 5:49 PM, Koen van der Drift wrote: > I actually added a category method for stringWithBytes to > BCStringUtils, but the compiler doesn't see it, so I added the > initWithBytes directly in the code. That's working now. I forgot to make it a class method in BCStringUtils. cheers, - Koen. From kvddrift at earthlink.net Sun Jul 24 17:58:17 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Sun, 24 Jul 2005 17:58:17 -0400 Subject: [Biococoa-dev] linker error Message-ID: <7ED15F41-B4AD-4BA4-A594-AFC9ED04A592@earthlink.net> Hi, Since I upgraded to Tiger and Xcode 2.1, my own app that uses BioCocoa gives the following linker errors: ld: warning NEXT_ROOT environment variable ignored because -syslibroot specified ld: Undefined symbols: .objc_class_name_BCSequence .objc_class_name_BCSequenceReader .objc_class_name_BCSymbolSet .objc_class_name_BCAminoAcid .objc_class_name_BCToolMassCalculator I removed and re-added the BioCocoa framework, but that didn't make any difference. Any clues how to fix this? thanks, - Koen.
From kvddrift at earthlink.net Mon Jul 25 20:32:38 2005 From: kvddrift at earthlink.net (Koen van der Drift) Date: Mon, 25 Jul 2005 20:32:38 -0400 Subject: [Biococoa-dev] linker error In-Reply-To: <7ED15F41-B4AD-4BA4-A594-AFC9ED04A592@earthlink.net> References: <7ED15F41-B4AD-4BA4-A594-AFC9ED04A592@earthlink.net> Message-ID: <61945598-86C9-4A42-9F7A-09E4928726D8@earthlink.net> On Jul 24, 2005, at 5:58 PM, Koen van der Drift wrote: > I removed and re-added the BioCocoa framework, but that didn't make > any difference. > > Any clues how to fix this? > Ah - I re-added it, but didn't copy it to the 'Copy Files' build phase. - Koen.