[Biococoa-dev] Optimizations

Koen van der Drift kvddrift at earthlink.net
Sat Mar 26 20:38:35 EST 2005


Hi,

Looks very nice, John. Would there be any problem with using BCSymbolSet 
instead of NSSet? Basically BCSymbolSet is a wrapper around NSSet, but 
it could give us some additional advantages over using an NSSet 
directly. Or will that slow down your code?
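
For illustration only, here is a minimal sketch of the wrapper idea. It is written in Python rather than Objective-C and is not BioCocoa's actual BCSymbolSet API: the class name SymbolSet, its methods, and the default alphabet are all hypothetical. It shows the kind of advantage a wrapper can add (here, validation against an alphabet) while membership tests still go straight to the underlying set.

```python
class SymbolSet:
    """Hypothetical thin wrapper around a built-in set of symbol characters."""

    def __init__(self, symbols, alphabet="ACGTRYSWKMBDHVN"):
        # Assumption: symbols are single characters checked against an alphabet.
        self._alphabet = frozenset(alphabet)
        unknown = set(symbols) - self._alphabet
        if unknown:
            raise ValueError(f"not in alphabet: {sorted(unknown)}")
        self._symbols = frozenset(symbols)

    def __contains__(self, symbol):
        # Delegates to the underlying set: one extra call, same hash lookup.
        return symbol in self._symbols

    def __len__(self):
        return len(self._symbols)

    def union(self, other):
        # The result is validated again, so it stays within the alphabet.
        return SymbolSet(self._symbols | other._symbols, self._alphabet)
```

The cost of the wrapper in this sketch is one extra level of indirection per lookup; whether that is noticeable in the real code is something only a timing run would show.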


cheers,

- Koen.



On Mar 26, 2005, at 4:56 PM, John Timmer wrote:

> Okay, I changed the represents/representedBy collections from arrays to
> sets and fixed all the related calls to adjust to that.  It definitely
> makes a difference - cuts the time about in half.  On a dual 1.8 G5, I
> got the following results using a 6-mer searching a 1.2KB DNA sequence
> 50 times:
>
> 2005-03-26 16:38:51.902 Translation[8375] ambiguous finding took -0.821547 seconds
> 2005-03-26 16:38:52.775 Translation[8375] ambiguous old finding took -0.873676 seconds
> 2005-03-26 16:38:53.034 Translation[8375] strict finding took -0.258846 seconds
> 2005-03-26 16:38:53.466 Translation[8375] strict old finding took -0.431471 seconds
>
> This is after catching two bugs, one in the old and one in the new
> method.  It was pretty funny - the old version kept coming in faster, so
> I knew there had to be something wrong ;).
>
> For the curious, extrapolating from this single data point indicates
> that the ambiguous search is faster than searching for each of its
> possible strict sequences as soon as the ambiguity can't be resolved
> into <4 strict sequences.
>
> Given the big boosts, I'm going to do the same for complements now - I
> expect that will significantly boost translation speeds.
>
> JT
> _______________________________________________
> This mind intentionally left blank
>
>
> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>
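
The set-based matching and the "number of strict sequences" comparison discussed in the quoted message can be sketched as follows. This is a Python illustration under stated assumptions, not the BioCocoa code: the REPRESENTS table uses the standard IUPAC nucleotide codes, the function names find_ambiguous and strict_count are hypothetical, and only an ambiguous pattern searched against a strict sequence is handled.

```python
from math import prod

# IUPAC nucleotide codes mapped to the set of strict bases each represents.
REPRESENTS = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def find_ambiguous(pattern, sequence):
    """Return start indices where each base is in the pattern symbol's set."""
    sets = [REPRESENTS[p] for p in pattern]
    width = len(sets)
    return [
        start
        for start in range(len(sequence) - width + 1)
        # Set membership is a hash lookup, not a scan through an array.
        if all(sequence[start + i] in s for i, s in enumerate(sets))
    ]


def strict_count(pattern):
    """Number of strict sequences the ambiguous pattern expands to."""
    return prod(len(REPRESENTS[p]) for p in pattern)


hits = find_ambiguous("RAY", "GACAATTAC")  # matches GAC at 0 and AAT at 3
expansions = strict_count("RAY")           # 2 * 1 * 2 = 4 strict sequences
```

One ambiguous pass replaces strict_count(pattern) separate strict searches, which is the trade-off the extrapolation in the message is about.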



