[Biococoa-dev] Sequence Structure

Alexander Griekspoor mek at mekentosj.com
Wed Jul 13 09:44:06 EDT 2005


Just a quick remark:

I thought that the idea was to have the NSData be the char array, so  
there is no separate const char *; the NSData object already IS that array.
So to go over the list:
> BCSequence
>     const char *sequence;
  -> is NSData, so this one goes away. If you want access to the data,  
you would call [sequenceData bytes], which gives you the pointer
>     NSData *sequenceData;
>     NSArray *symbolArray;
  -> this one only if we decide to cache it, otherwise, this would be  
a method only. So what was the consensus again on this?
>    BCSymbolSet *symbolSet;
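
A minimal sketch of what I mean, under stated assumptions: SketchSequence
is a made-up stand-in for BCSequence, and the symbols are plain NSString
objects rather than real symbol instances, since that API is not shown
here. The NSData is the only storage, -bytes hands out its pointer, and
-symbolArray is the uncached, method-only variant:

```objc
#import <Foundation/Foundation.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-in for BCSequence; not the actual BioCocoa class.
@interface SketchSequence : NSObject {
    NSData *sequenceData;   // the only storage, no separate const char *
}
- (id)initWithCString:(const char *)chars;
- (const char *)bytes;
- (NSUInteger)length;
- (NSArray *)symbolArray;
@end

@implementation SketchSequence

- (id)initWithCString:(const char *)chars {
    self = [super init];
    if (self) {
        // Copies the characters into the NSData; the terminating NUL is not stored.
        sequenceData = [[NSData alloc] initWithBytes:chars length:strlen(chars)];
    }
    return self;
}

// Direct access to the characters: just the pointer owned by the NSData.
- (const char *)bytes {
    return (const char *)[sequenceData bytes];
}

- (NSUInteger)length {
    return [sequenceData length];
}

// Uncached variant: the array is rebuilt from the bytes on every call.
// NSString objects stand in for symbols here (an assumption).
- (NSArray *)symbolArray {
    NSUInteger i, n = [sequenceData length];
    const char *p = (const char *)[sequenceData bytes];
    NSMutableArray *symbols = [NSMutableArray arrayWithCapacity:n];
    for (i = 0; i < n; i++) {
        [symbols addObject:[NSString stringWithFormat:@"%c", p[i]]];
    }
    // Hand out a separate NSArray rather than the mutable array itself.
    return [NSArray arrayWithArray:symbols];
}

@end

int main(void) {
    @autoreleasepool {
        SketchSequence *seq = [[SketchSequence alloc] initWithCString:"ATGC"];
        const char *p = [seq bytes];
        NSUInteger i;
        for (i = 0; i < [seq length]; i++) {
            putchar(p[i]);   // walk the raw bytes, no extra ivar needed
        }
        putchar('\n');
        printf("%lu\n", (unsigned long)[[seq symbolArray] count]);
    }
    return 0;
}
```

If we do decide to cache, the array would become an ivar filled on first
use; the interface above would not have to change.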

Cheers,
Alex


On 13-jul-2005, at 15:19, Koen van der Drift wrote:

> Hi,
>
> After all the talking in the last week or so, I felt like coding  
> again and was playing last night with a possible new BCSequence-only  
> structure. It will also include the char string and NSData ivars.  
> For starters, I want to do just the immutable version, of course we  
> can add stuff for the mutable sequence too later.  I need to read  
> the chat log again to see if you guys came up with a good solution  
> for that.
>
> However, so far I have:
>
> BCSequence
>     const char *sequence;
>     NSData *sequenceData;
>     NSArray *symbolArray;
>     BCSymbolSet *symbolSet;
>
> We can re-use most of the methods that are now in BCSequence and  
> BCAbstractSequence, including the code that guesses the sequence  
> type if there is no symbolset defined.  I am not sure if we also  
> should add the BCSequenceType back in there. I think the symbolset  
> is enough.
>
> Let me know what you guys think of this, and if this is a good way  
> forward.  In order not to screw up the project I won't commit  
> anything until we all agree on this (or another) approach. I will  
> make some code available for download, though.
>
> cheers,
>
> - Koen.
>
>
> On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote:
>
>
>> I thought a color version of the chat might be easier to follow
>> charles = purple
>> alex = green
>>
>>
>> ...
>> ok, let's get started on the NSMutableData?
>>
>> lots of discussion and kind of stuck right
>>
>> So what I mean is: you return an NSMutableData, the compiler sees  
>> an NSData, so you can't modify it.. This is good, but...
>>
>> yes, that's what I had in mind
>>
>> but the BCSequence might modify the object later
>>
>> how?
>>
>> that was my point
>>
>> the immutable subclass should alter the data
>>
>> none of its methods should
>>
>> sorry should not
>>
>> The mutable class can modify the ivar, right?
>>
>> can modify the content of the ivar, I should say
>>
>> yes, the mutable class
>>
>> but the mutable class should also return a mutable data object
>>
>> so the mutable class returns the pointer to that ivar, right
>>
>> yes
>>
>> OK, what if the mutable class returns an NSData (from the header)
>>
>> the immutable class as well, only cast to NSData
>>
>> can't we override the method to be typed as mutabledata?
>>
>> Is there a '-(NSData *)data' method in the mutable class header?
>>
>> that was the only question I had
>>
>> 12:15 PM
>> the question was whether you could override: '-(NSData *)data'
>>
>> with '-(NSMutableData *)data'
>>
>> I'm not sure
>>
>> no you can't
>>
>> never tried
>>
>> I tried
>>
>> then we have a problem
>>
>> you get a compiler warning
>>
>> you have to do '-(id)data'
>>
>> yep
>>
>> if you look at the NSArray/NSMutableArray headers,...
>>
>> you have '+(id)array'
>>
>> aha
>>
>> this is because you can't have +(NSArray *)array
>>
>> and +(NSMutableArray *)array
>>
>> i get the idea
>>
>> We should have '-(NSData *)data' for both mut/immut and...
>>
>> add the method '-(NSMutableData *)mutableData' to the mutable one
>>
>> or not
>>
>> hmm, yes and no
>>
>> in any case, the object returned by '-data' should be immutable  
>> and not just cast to immutable
>>
>> in fact, now that I think of it
>>
>> yes?
>>
>> oh no, john doesn't like the idea of returning an nsdata that in  
>> reality is an nsmutabledata, right
>>
>> I still don't see the problem
>>
>> as we don't allow editing of the mutable array directly,
>>
>> you ONLY need to publish the NSData
>>
>> if your mutable class returns a pointer to its mutable ivar...
>>
>> and the user thinks it gets an immutable data...
>>
>> because remember that ALL editing should go through methods
>>
>> we shouldn't allow editing of the array directly!!
>>
>> then the user might keep that object around thinking it won't change
>>
>> that would make syncing impossible
>>
>> when in fact it will change as the sequence is edited
>>
>> true
>>
>> 12:20 PM
>> you're right
>>
>> The user WILL NOT edit the NSData but will see it changed!!!
>>
>> true
>>
>> this was my point!!! Yeah, you got it??
>>
>> yep
>>
>>
>> so now how to solve it
>>
>> basically we need the same approach as NSArray NSMutableArray
>>
>> There is only one way to solve it: return a true NSData by copying  
>> it...
>>
>> how did they solve it then?
>>
>> that would make it slow
>>
>> and especially the immutable class is added for performance reasons!
>>
>> there is another solution
>>
>> wait...wait...
>>
>> the immutable class doesn't need to copy it, because it won't change
>>
>> true
>>
>> only the mutable class should copy it
>>
>> yes,
>>
>> but what you are saying is to have the mutable version have an  
>> extra (private) ivar to store the data
>>
>> to improve performance, like I said in the email, you could still  
>> return the NSMutableData...
>>
>> sorry I was following on previous stuff
>>
>> leaving the other one unused?
>>
>> let me answer your question
>>
>> sorry
>>
>> no, we don't need an extra ivar.
>>
>> pfew
>>
>> We just have to be careful inside the class implementation
>>
>> the ivar can be a NSMutableData for the compiler, but we would in  
>> fact use NSData for the immutable
>>
>> of course, if we call a mutability method on it, we get a runtime  
>> exception, but it should never happen if we are A BIT careful
>>
>> i get it
>>
>> now, back to above.
>>
>> yes
>>
>> to improve performance in mutable sequence, like I said in the  
>> email, you could still return the NSMutableData...
>>
>> 12:25 PM
>> but also have a flag to say: next time we return the data or  
>> mutate the seq...
>>
>> we can't use the current object pointed by the ivar. Somebody else  
>> is using it as NSData
>>
>> so the flag says: next time, copy it
>>
>> if next time never happens, no copy!
>>
>> hmm, that doesn't sound too nice I think, although functional
>>
>> performance tricks are often anti-good code
>>
>> true
>>
>> now, back to the question i had earlier
>>
>> this is a 'lazy' copy
>>
>> ok, back to it
>>
>> how did apple solve the mutable vs immutable code?
>>
>> in which case?
>>
>> well, they have NSArray in mutable and immutable form
>>
>> they use a flag internally. I found that on the cocoadev mailing list
>>
>> and NSMutableArray seems to be a subclass of NSArray
>>
>> these are just header tricks
>>
>> aha
>>
>> placeholder classes
>>
>> just like I did with BCSequence and my recent email
>>
>> you are saying they are using that lazy copy trick
>>
>> wowowwowowo
>>
>> lazy copy trick??
>>
>> haha
>>
>> sorry
>>
>> ah, ok, yes, they are!
>>
>> sort of
>>
>> they only make a real copy when it is a mutable instance
>>
>> it is probably not lazy, though
>>
>> not sure
>>
>> so the implementation of '-copy' is
>>
>> they do a real copy if flag = mutable
>>
>> otherwise just copy the pointer
>>
>> aha
>>
>> that makes sense
>>
>> and i don't think they defer the real copy in the -copy method
>>
>> but the situation is different
>>
>> the user asks for a copy
>>
>> it should expect to be done immediately
>>
>> 12:30 PM
>> performance can't be great
>>
>> yes, you are right
>>
>> if you ask a copy, you know what you are doing
>>
>> the thing is that I'm afraid copying is out of the question anyway
>>
>> don't know...
>>
>> remember, the user wants direct access to the data
>>
>> serge seemed to say that this is not that expensive
>>
>> which can be 300Mb
>>
>> well, yeah
>>
>> well, then he should use immutable sequences perhaps
>>
>> but then the user should use the BCSequence methods
>>
>> yes, use immutable if you don't want to edit
>>
>> but then the user should use the BCSequence methods to edit
>>
>> well, in the end it's inevitable
>>
>> you don't want the data to change underneath
>>
>> yes! Inevitable is the word
>>
>> the concept of mutable/immutable is more subtle than it seems at  
>> first
>>
>> this discussion should be documented
>>
>> it is
>>
>> but coming back to a discussion
>>
>> we could copy and paste the whole chat?
>>
>> ok, back to the discussion
>>
>> imagine this mixed with a discussion about 4 types of sub(sub)  
>> classes
>>
>> mutable vs immutable is already difficult enough
>>
>> well, you read my email?
>>
>> i truly believe that symbolsets is our typing
>>
>> it was a bit complicated, no?
>>
>> yes
>>
>> too much
>>
>> and remember last time we chose such an approach
>>
>> yes, I know...
>>
>> i almost had to phone you to ask how it worked
>>
>> complicated for the developer does not mean complicated for the  
>> user necessarily
>>
>> i don't like omni graffle anymore
>>
>> true
>>
>> and complicated the first time does not mean you have to always  
>> remember how it works
>>
>> 12:35 PM
>> if you never have to change it
>>
>> all fine
>>
>> anyway, at least you got one of my concerns
>>
>> but the one-sequence-for-all is so much simpler (in interface  
>> terms) that I think that will pay off big time
>>
>> probably, yes
>>
>> i'm willing to give up direct typing
>>
>> interesting turn of events!!
>>
>> yes!
>>
>> the wwdc has certainly created some storm
>>
>> well I'm really enthusiastic about the nsdata
>>
>> as storage
>>
>> it's cool!
>>
>> having typed sequences more for the 'expert' user could be fine too
>>
>> see my last email
>>
>> no other biox project has it, but I like the idea
>>
>> yes, if you could wrap it certainly!
>>
>> i kind of dropped off there
>>
>> typed sequence would be like the CFArray
>>
>> for more advanced user!
>>
>> but that would imply the typed ones being the basis and the  
>> untyped one the wrapper
>>
>> and I don't like that too much
>>
>> the otherway around is fine with me
>>
>> not necessarily. I actually propose the opposite, just as  
>> suggested by John
>>
>> yes, exactly totally agree, the other way is better,
>>
>> as it could live a separate life
>>
>> read the email
>>
>> look at the omnigraffle thingie
>>
>> I thought I remembered that from your email, i did read it
>>
>> OK, in the graph, it is really separate, like a plug-in on the side
>>
>> I didn't get the CFarray analogy therefore
>>
>> bad analogy
>>
>> haha
>>
>> just something more hidden, less needed
>>
>> This is a better omnigraffle
>>
>> less friendly
>>
>> only one arrow
>>
>> That's good!
>>
>> and almost straight arrow
>>
>> but I just didn't feel like thinking too much about it
>>
>> first, John would like to go for it I guess
>>
>> 12:40 PM
>> the concept is there. The implementation can be wrapper of  
>> placeholder
>>
>> and second, it's an add-on/plugin so could wait
>>
>> wrapper OR placeholder
>>
>> oups
>>
>> yes could wait or be there and don't care
>>
>> perfect, the only thing I would like to see is that it doesn't  
>> require tricks in the "clean" one-for-all bcsequence system
>>
>> it does not
>>
>> perfect!
>>
>> Well, let's do it
>>
>> haha
>>
>> you agree the BCSequence header has ALL the methods?
>>
>> including -complement
>>
>> if I have just as much time as you, then we have an even bigger  
>> problem
>>
>> although I would like to spend a few days if I had time
>>
>> i'm enthusiastic about the discussions, although heated
>>
>> yeah, i know about 'time';
>>
>> That would be the consequence of the approach
>>
>> all methods
>>
>> ok for the methods
>>
>> I think we should define a limited number of basis methods
>>
>> let's see what john thinks
>>
>> the rest will have to be tools
>>
>> that's the way it is
>>
>> but think in terms of strider
>>
>> yes, the tools/method line
>>
>> I was thinking of all basic editing simple transformations to be  
>> in bcsequence
>>
>> yes, we had a discussion about that a few months ago
>>
>> and more complex things like translations, digestions, alignments  
>> in tools
>>
>> that still holds
>>
>> yes, perfect
>>
>> although a basic translation method could be there as convenience  
>> method
>>
>> should we copy paste this whole chat to the mailing list?
>>
>> i'm not sure
>>
>> it's quite arbitrary
>>
>> fine with me
>>
>> it is arbitrary, yes
>>
>> 12:45 PM
>> I need to get back to work
>>
>> ok
>>
>> nice talking to you again, and thanks for making your point clear
>>
>> I get it now
>>
>> thanks for listening!!
>>
>> Have a nice day at work
>>
>> good night!
>>
>> I'll copy the discussion to the list
>>
>> thanks!
>>
>> thanks for the copy, don't make it lazy
>>
>> Cheers Charles,
>>
>> speak to you later
>>
>> cheers
>>
>> _______________________________________________
>> Biococoa-dev mailing list
>> Biococoa-dev at bioinformatics.org
>> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>>
>>
>>
>
>
>

*********************************************************
                     ** Alexander Griekspoor **
*********************************************************
               The Netherlands Cancer Institute
               Department of Tumorbiology (H4)
          Plesmanlaan 121, 1066 CX, Amsterdam
                   Tel:  + 31 20 - 512 2023
                   Fax:  + 31 20 - 512 2029
                   E-mail: a.griekspoor at nki.nl
             AIM: mekentosj at mac.com
               Web: http://www.mekentosj.com

                  EnzymeX - To cut or not to cut
              http://www.mekentosj.com/enzymex

*********************************************************
