[Biococoa-dev] Optimizations

John Timmer jtimmer at bellatlantic.net
Mon Mar 28 11:48:35 EST 2005


> Hi,
> 
> Looks very nice John. Would there be any problems to use BCSymbolSet
> instead of NSSet? Basically BCSymbolSet is a wrapper around NSSet, but
> it could give us some additional advantages over directly using an
> NSSet. Or will that slow down your code?
> 

Just for context, the original timing for finding a 6-mer with two ambiguous
bases in a 1.2 kb sequence, repeated 50 times, was around 0.821547 seconds.

I tried replacing the NSSets with BCSymbolSets on the same machine I used for
my previous timings.  After a couple of runs under the same conditions, it
appears that doing so adds about 0.15 sec, or somewhere between 15% and 20%,
to the execution time.  Presumably, this is all spent on message sending,
though I haven't checked with Shark to confirm this.

In contrast, keeping it as an NSSet and using the CoreFoundation CFSet
function that's the equivalent of "containsObject" cut the time by about 0.2
seconds, knocking it down to about 0.62 seconds total.  Again, I'll assume
this is entirely due to ditching the message sending overhead, since it only
involves changing a single line of code.
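
For anyone who wants to see the shape of that change, here is a minimal
sketch.  CFSetContainsValue is the CoreFoundation counterpart to
-containsObject:, and NSSet is toll-free bridged to CFSetRef, so the set can
be passed straight through with a cast.  The helper name SetContainsSymbol is
made up for illustration and is not taken from the BioCocoa source.

```objc
#import <Foundation/Foundation.h>
#import <CoreFoundation/CoreFoundation.h>

// Hypothetical helper (not from the BioCocoa source): tests membership with
// a direct C function call instead of sending -containsObject: to the set.
static BOOL SetContainsSymbol(NSSet *symbols, id symbol)
{
    // NSSet is toll-free bridged to CFSetRef, so a cast is all that is
    // needed.  The __bridge qualifier keeps this compiling under ARC;
    // without ARC a plain cast does the same thing.
    return CFSetContainsValue((__bridge CFSetRef)symbols,
                              (__bridge const void *)symbol) ? YES : NO;
}
```

The message-send version of the same test would be
[symbols containsObject: symbol]; the result should be identical, and only
the way the call is dispatched differs.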

Given those numbers, my preference would be to stick with NSSet and use the
CF function.  Since NSSets and BCSymbolSets are very easy to interchange and
we already have methods in place to return either, there's really no
difference externally, and internally, we're talking about a difference of
more than 30% in performance.  Unless somebody disagrees, I'll commit those
changes later today.

JT

_______________________________________________
This mind intentionally left blank