[Biococoa-dev] Sequence Structure

Koen van der Drift kvddrift at earthlink.net
Wed Jul 13 09:19:30 EDT 2005


Hi,

After all the talking in the last week or so, I felt like coding
again and was playing last night with a possible new BCSequence-only
structure. It will also include the char string and NSData ivars. For
starters, I want to do just the immutable version; of course we can
add stuff for the mutable sequence later. I need to read the
chat log again to see if you guys came up with a good solution for that.

However, so far I have:

BCSequence
     const char *sequence;
     NSData *sequenceData;
     NSArray *symbolArray;
     BCSymbolSet *symbolSet;

We can re-use most of the methods that are now in BCSequence and  
BCAbstractSequence, including the code that guesses the sequence type  
if there is no symbolset defined.  I am not sure if we also should  
add the BCSequenceType back in there. I think the symbolset is enough.
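
A minimal sketch of how the four ivars above could hang together in an immutable class, assuming ARC and a current Foundation. The names SketchBCSequence, SketchSymbolSet, -initWithString:symbolSet:, -bytes and -length are made up for illustration and are not the BioCocoa API; the type-guessing code is deliberately left out, and symbolArray is simply left nil.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSymbolSet, only here so the sketch compiles alone.
@interface SketchSymbolSet : NSObject
@end
@implementation SketchSymbolSet
@end

@interface SketchBCSequence : NSObject {
    const char      *sequence;      // points into sequenceData's bytes
    NSData          *sequenceData;  // immutable storage, owns the bytes
    NSArray         *symbolArray;   // not populated in this sketch
    SketchSymbolSet *symbolSet;     // nil would mean "type still to be guessed"
}
- (instancetype)initWithString:(NSString *)string
                     symbolSet:(SketchSymbolSet *)set;
- (const char *)bytes;
- (NSUInteger)length;
- (NSData *)sequenceData;
@end

@implementation SketchBCSequence
- (instancetype)initWithString:(NSString *)string
                     symbolSet:(SketchSymbolSet *)set {
    if ((self = [super init])) {
        // Assumption: sequences are plain ASCII. dataUsingEncoding: returns
        // nil for strings that cannot be converted, leaving both ivars empty.
        sequenceData = [[string dataUsingEncoding:NSASCIIStringEncoding] copy];
        sequence = [sequenceData bytes];
        symbolSet = set;
    }
    return self;
}
- (const char *)bytes      { return sequence; }
- (NSUInteger)length       { return [sequenceData length]; }
- (NSData *)sequenceData   { return sequenceData; }
@end

int main(void) {
    @autoreleasepool {
        SketchBCSequence *seq =
            [[SketchBCSequence alloc] initWithString:@"ACGT" symbolSet:nil];
        NSLog(@"%lu symbols, first is %c",
              (unsigned long)[seq length], [seq bytes][0]);
    }
    return 0;
}
```

Because the storage never changes, the char pointer and the NSData can safely refer to the same bytes.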

Let me know what you guys think of this, and if this is a good way  
forward.  In order not to screw up the project I won't commit  
anything until we all agree on this (or another) approach. I will  
make some code available for download, though.

cheers,

- Koen.


On Jul 11, 2005, at 5:01 PM, Charles Parnot wrote:

> I thought a color version of the chat might be easier to follow
> charles = purple
> alex = green
>
>
> ...
> ok, let's get started on the NSMutableData?
>
> lots of discussion and kind of stuck right
>
> So what I mean is: you return an NSMutableData, the compiler sees  
> an NSData, so you can't modify it.. This is good, but...
>
> yes, that's what I had in mind
>
> but the BCSequence might modify the object later
>
> how?
>
> that was my point
>
> the immutable subclass should alter the data
>
> none of its methods should
>
> sorry should not
>
> The mutable class can modify the ivar, right?
>
> can modify the content of the ivar, I should say
>
> yes, the mutable class
>
> but the mutable class should also return a mutable data object
>
> so the mutable class returns the pointer to that ivar, right
>
> yes
>
> OK, what if the mutable class returns an NSData (from the header)
>
> the immutable class as well, only cast to NSData
>
> can't we override the method to be typed as mutabledata?
>
> Is there a '-(NSData *)data' method in the mutable class header?
>
> that was the only question I had
>
> 12:15 PM
> the question was whether you could override: '-(NSData *)data'
>
> with '-(NSMutableData *)data'
>
> I'm not sure
>
> no you can't
>
> never tried
>
> I tried
>
> then we have a problem
>
> you get a compiler warning
>
> you have to do '-(id)data'
>
> yep
>
> if you look at the NSArray/NSMutableArray headers,...
>
> you have '+(id)array'
>
> aha
>
> this is because you can't have +(NSArray *)array
>
> and +(NSMutableArray *)array
>
> i get the idea
>
> We should have '-(NSData *)data' for both mut/immut and...
>
> add the method '-(NSMutableData *)mutableData' to the mutable one
>
> or not
>
> hmm, yes and no
>
> in any case, the object returned by '-data' should be immutable and  
> not just cast to immutable
>
> in fact, now that I think of it
>
> yes?
>
> oh no, john doesn't like the idea of returning an NSData that in
> reality is an NSMutableData, right
>
> I still don't see the problem
>
> as we don't allow editing of the mutable array directly,
>
> you ONLY need to publish the NSData
>
> if your mutable class returns a pointer to its mutable ivar...
>
> and the user thinks it gets an immutable data...
>
> because remember that ALL editing should go through methods
>
> we shouldn't allow editing of the array directly!!
>
> then the user might keep that object around thinking it won't change
>
> that would make syncing impossible
>
> when in fact it will change as the sequence is edited
>
> true
>
> 12:20 PM
> you're right
>
> The user WILL NOT edit the NSData but will see it changed!!!
>
> true
>
> this was my point!!! Yeah, you got it??
>
> yep
>
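
The aliasing problem discussed above can be reproduced with plain Foundation objects. This is only a sketch, assuming ARC; the local variables stand in for the ivar of a mutable sequence and for what a hypothetical '-(NSData *)data' accessor would hand out.

```objc
#import <Foundation/Foundation.h>

int main(void) {
    @autoreleasepool {
        // Stand-in for the mutable sequence's private storage.
        NSMutableData *storage = [NSMutableData dataWithBytes:"ACGT" length:4];

        // What the accessor hands out if it only casts the ivar to NSData:
        // the very same object, just seen through an immutable type.
        NSData *castOnly = storage;

        // What it hands out if it makes a true immutable copy first.
        NSData *snapshot = [storage copy];

        // Later the sequence is edited through its own methods.
        [storage appendBytes:"N" length:1];

        // castOnly now reports 5 bytes although the caller never touched it;
        // snapshot still reports the 4 bytes it was created with.
        NSLog(@"cast only: %lu bytes, snapshot: %lu bytes",
              (unsigned long)[castOnly length],
              (unsigned long)[snapshot length]);
    }
    return 0;
}
```

The compiler stops the caller from editing castOnly, but nothing stops its contents from changing underneath.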
>
> so now how to solve it
>
> basically we need the same approach as NSArray NSMutableArray
>
> There is only one way to solve it: return a true NSData by copying  
> it...
>
> how did they solve it then?
>
> that would make it slow
>
> and especially the immutable class is added for performance reasons!
>
> there is another solution
>
> wait...wait...
>
> the immutable class doesn't need to copy it, because it won't change
>
> true
>
> only the mutable class should copy it
>
> yes,
>
> but what you are saying is to have the mutable version have an extra
> (private) ivar to store the data
>
> to improve performance, like I said in the email, you could still
> return the NSMutableData...
>
> sorry I was following on previous stuff
>
> leaving the other one unused?
>
> let me answer your question
>
> sorry
>
> no, we don't need an extra ivar.
>
> pfew
>
> We just have to be careful inside the class implementation
>
> the ivar can be an NSMutableData for the compiler, but we would in
> fact use NSData for the immutable
>
> of course, if we call a mutability method on it, we get a runtime
> exception, but it should never happen if we are A BIT careful
>
> i get it
>
> now, back to above.
>
> yes
>
> to improve performance in mutable sequence, like I said in the
> email, you could still return the NSMutableData...
>
> 12:25 PM
> but also have a flag to say: next time we return the data or mutate  
> the seq...
>
> we can't use the current object pointed by the ivar. Somebody else  
> is using it as NSData
>
> so the flag says: next time, copy it
>
> if next time never happens, no copy!
>
> hmm, that doesn't sound too nice I think, although functional
>
> performance tricks are often anti-good code
>
> true
>
> now, back to the question i had earlier
>
> this is a 'lazy' copy
>
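
The flag idea described above could look roughly like this. It is a sketch under assumptions, not the BioCocoa implementation: it assumes ARC, the class LazySequence and its methods are hypothetical, and it defers the copy until the next mutation only (handing the data out a second time without an edit in between reuses the same object).

```objc
#import <Foundation/Foundation.h>

// Hypothetical mutable sequence that copies its storage lazily.
@interface LazySequence : NSObject
- (instancetype)initWithBytes:(const void *)bytes length:(NSUInteger)length;
- (NSData *)data;
- (void)appendBytes:(const void *)bytes length:(NSUInteger)length;
@end

@implementation LazySequence {
    NSMutableData *_storage;
    BOOL _shared;   // YES once _storage has been handed out as NSData
}

- (instancetype)initWithBytes:(const void *)bytes length:(NSUInteger)length {
    if ((self = [super init])) {
        _storage = [[NSMutableData alloc] initWithBytes:bytes length:length];
    }
    return self;
}

- (NSData *)data {
    // No copy here: just remember that somebody else may hold this object.
    _shared = YES;
    return _storage;
}

- (void)appendBytes:(const void *)bytes length:(NSUInteger)length {
    if (_shared) {
        // Somebody may be using the old object as NSData, so leave it
        // untouched and continue editing a private copy.
        _storage = [_storage mutableCopy];
        _shared = NO;
    }
    [_storage appendBytes:bytes length:length];
}
@end

int main(void) {
    @autoreleasepool {
        LazySequence *seq = [[LazySequence alloc] initWithBytes:"ACGT" length:4];
        NSData *before = [seq data];           // no copy made yet
        [seq appendBytes:"N" length:1];        // the copy happens here
        NSData *after = [seq data];
        NSLog(@"before: %lu bytes, after: %lu bytes",
              (unsigned long)[before length], (unsigned long)[after length]);
    }
    return 0;
}
```

If the sequence is never edited after -data is called, no copy is ever made, which is the whole point of the trick. The cost is one more piece of state to keep correct in every mutating method.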
> ok, back to it
>
> how did apple solve the mutable vs immutable code?
>
> in which case?
>
> well, they have NSArray in mutable and immutable form
>
> they use a flag internally. I found that on the cocoadev mailing list
>
> and NSMutableArray seems to be a subclass of NSArray
>
> these are just header tricks
>
> aha
>
> placeholder classes
>
> just like I did with BCSequence and my recent email
>
> you are saying they are using that lazy copy trick
>
> wowowwowowo
>
> lazy copy trick??
>
> haha
>
> sorry
>
> ah, ok, yes, they are!
>
> sort of
>
> they only make a real copy when it is a mutable instance
>
> it is probably not lazy, though
>
> not sure
>
> so the implementation of '-copy' is
>
> they do a real copy if flag = mutable
>
> otherwise just copy the pointer
>
> aha
>
> that makes sense
>
> and i don't think they defer the real copy in the -copy method
>
> but the situation is different
>
> the user asks for a copy
>
> it should expect to be done immediately
>
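
The '-copy' behaviour described above (a real copy for a mutable instance, just the pointer for an immutable one) can be sketched with a pair of classes. This is an illustration of the idea only, not a statement about how Apple implements NSArray; it assumes ARC, and SketchSequence and SketchMutableSequence are hypothetical names.

```objc
#import <Foundation/Foundation.h>

@interface SketchSequence : NSObject <NSCopying> {
    NSData *_data;   // really an NSMutableData in the mutable subclass
}
- (instancetype)initWithData:(NSData *)data;
- (NSData *)data;
@end

@interface SketchMutableSequence : SketchSequence
- (void)appendBytes:(const void *)bytes length:(NSUInteger)length;
@end

@implementation SketchSequence
- (instancetype)initWithData:(NSData *)data {
    if ((self = [super init])) {
        _data = [data copy];            // immutable private storage
    }
    return self;
}
- (NSData *)data { return _data; }
- (id)copyWithZone:(NSZone *)zone {
    // Immutable: nothing can change, so "copying" just returns the same
    // object (under manual retain/release this would be [self retain]).
    return self;
}
@end

@implementation SketchMutableSequence
- (instancetype)initWithData:(NSData *)data {
    if ((self = [super initWithData:data])) {
        _data = [data mutableCopy];     // swap in mutable storage
    }
    return self;
}
- (void)appendBytes:(const void *)bytes length:(NSUInteger)length {
    [(NSMutableData *)_data appendBytes:bytes length:length];
}
- (id)copyWithZone:(NSZone *)zone {
    // Mutable: do the real copy immediately, as an immutable sequence.
    return [[SketchSequence alloc] initWithData:_data];
}
@end

int main(void) {
    @autoreleasepool {
        NSData *acgt = [NSData dataWithBytes:"ACGT" length:4];
        SketchSequence *fixed = [[SketchSequence alloc] initWithData:acgt];
        SketchSequence *fixedCopy = [fixed copy];       // same object

        SketchMutableSequence *editable =
            [[SketchMutableSequence alloc] initWithData:acgt];
        SketchSequence *snapshot = [editable copy];     // real copy
        [editable appendBytes:"N" length:1];

        NSLog(@"same object: %d, snapshot: %lu bytes, editable: %lu bytes",
              fixedCopy == fixed,
              (unsigned long)[[snapshot data] length],
              (unsigned long)[[editable data] length]);
    }
    return 0;
}
```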
> 12:30 PM
> performance can't be great
>
> yes, you are right
>
> if you ask a copy, you know what you are doing
>
> the thing is that I'm afraid copying is out of the question anyway
>
> don't know...
>
> remember, the user wants direct access to the data
>
> serge seemed to say that this is not that expensive
>
> which can be 300Mb
>
> well, yeah
>
> well, then he should use immutable sequences perhaps
>
> but then the user should use the BCSequence methods
>
> yes, use immutable if you don't want to edit
>
> but then the user should use the BCSequence methods to edit
>
> well, in the end it's inevitable
>
> you don't want the data to change underneath
>
> yes! Inevitable is the word
>
> the concept of mutable/immutable is more subtle than it seems at
> first
>
> this discussion should be documented
>
> it is
>
> but coming back to a discussion
>
> we could copy and paste the whole chat?
>
> ok, back to the discussion
>
> imagine this mixed with a discussion about 4 types of sub(sub)classes
>
> mutable vs immutable is already difficult enough
>
> well, you read my email?
>
> i truly believe that symbolsets is our typing
>
> it was a bit complicated, no?
>
> yes
>
> too much
>
> and remember last time we chose such an approach
>
> yes, I know...
>
> i almost had to phone you to ask how it worked
>
> complicated for the developer does not mean complicated for the  
> user necessarily
>
> i don't like omni graffle anymore
>
> true
>
> and complicated the first time does not mean you have to always
> remember how it works
>
> 12:35 PM
> if you never have to change it
>
> all fine
>
> anyway, at least you got one of my concerns
>
> but the one-sequence-for-all is so much simpler (in interface  
> terms) that I think that will pay off big time
>
> probably, yes
>
> i'm willing to give up direct typing
>
> interesting turn of events!!
>
> yes!
>
> the wwdc has certainly created some storm
>
> well I'm really enthusiastic about the nsdata
>
> as storage
>
> it's cool!
>
> having typed sequences more for the 'expert' user could be fine too
>
> see my last email
>
> no other biox project has it, but I like the idea
>
> yes, if you could wrap it certainly!
>
> i kind of dropped off there
>
> typed sequence would be like the CFArray
>
> for more advanced user!
>
> but that would imply the typed ones being the basis and the untyped
> one the wrapper
>
> and I don't like that too much
>
> the other way around is fine with me
>
> not necessarily. I actually propose the opposite, just as suggested
> by John
>
> yes, exactly totally agree, the other way is better,
>
> as it could live a separate life
>
> read the email
>
> look at the omnigraffle thingie
>
> I thought I remembered that from your email, i did read it
>
> OK, in the graph, it is really separate, like a plug-in on the side
>
> I didn't get the CFarray analogy therefore
>
> bad analogy
>
> haha
>
> just something more hidden, less needed
>
> This is a better omnigraffle
>
> less friendly
>
> only one arrow
>
> That's good!
>
> and almost straight arrow
>
> but I just didn't feel like thinking too much about it
>
> first, John would like to go for it I guess
>
> 12:40 PM
> the concept is there. The implementation can be wrapper of placeholder
>
> and second, it's an add-on/plugin so could wait
>
> wrapper OR placeholder
>
> oups
>
> yes could wait or be there and don't care
>
> perfect, the only thing I would like to see is that it doesn't  
> require tricks in the "clean" one-for-all bcsequence system
>
> it does not
>
> perfect!
>
> Well, let's do it
>
> haha
>
> you agree the BCSequence header has ALL the methods?
>
> including -complement
>
> if I have just as much time as you, then we have an even bigger  
> problem
>
> although I would like to spend a few days if I had time
>
> i'm enthusiastic about the discussions, although heated
>
> yeah, i know about 'time';
>
> That would be the consequence of the approach
>
> all methods
>
> ok for the methods
>
> I think we should define a limited number of basis methods
>
> let's see what john thinks
>
> the rest will have to be tools
>
> that's the way it is
>
> but think in terms of strider
>
> yes, the tools/method line
>
> I was thinking of all basic editing simple transformations to be in  
> bcsequence
>
> yes, we had a discussion about that a few months ago
>
> and more complex things like translations, digestions, alignments  
> in tools
>
> that still holds
>
> yes, perfect
>
> although a basic translation method could be there as convenience  
> method
>
> should we copy paste this whole chat to the mailing list?
>
> i'm not sure
>
> it's quite arbitrary
>
> fine with me
>
> it is arbitrary, yes
>
> 12:45 PM
> I need to get back to work
>
> ok
>
> nice talking to you again, and thanks for making your point clear
>
> I get it now
>
> thanks for listening!!
>
> Have a nice day at work
>
> good night!
>
> I'll copy the discussion to the list
>
> thanks!
>
> thanks for the copy, don't make it lazy
>
> Cheers Charles,
>
> speak to you later
>
> cheers
>