[Biococoa-dev] Demo App

John Timmer jtimmer at bellatlantic.net
Fri Aug 27 16:39:32 EDT 2004

>> Thanks - I found the mistakes in the .plist, and things should work
>> fine now. Incidentally, a 2.4 Mb BAC took about 46 seconds to reverse
>> complement.
> Great! That's pretty rapid! What system are you on John?
A 1.33GHz G4 laptop.  I'm not sure it stressed the disk at all, so I'd
imagine it was more a function of RAM access and processor speed, in which
case this is an above-average machine.

> I already did the very nice and exciting work (ahum) of creating such a
> plist for EnzymeX, so we have this one already ;-)
> BCCodons express their sequence in the BCTokens, right?
Could you send me a copy of the .plist?  I've been debating between a
flatfile with all possible combinations and a tree structure with keys that
are BCSymbols themselves, which should allow us to use ambiguous bases more

To explain the tree option in detail:  you simply enumerate the keys and
query each one as to whether it represents the first base.  If it does, you
grab the dictionary it keys for, and repeat the process with the second
base.  On the third base of the codon, the dictionary simply contains the
answer - in the case of a translation, the amino acid.  If it fails at any
point, it returns undefined.
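
As a rough illustration of that lookup, here is a minimal sketch in Python
rather than Objective-C.  Everything in it is an assumption made for the
example: the names IUPAC, CODON_TREE, represents and lookup are hypothetical,
plain strings stand in for BCSymbols, and the tree holds only a handful of
standard-code entries.  It is not BioCocoa code.

```python
# Ambiguity codes mapped to the concrete bases they stand for.
# Only a subset of the IUPAC codes, enough for this example.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any base
}

# Nested dictionaries keyed by symbol; the third level holds the answer.
# A tiny, illustrative slice of the standard genetic code.
CODON_TREE = {
    "G": {"C": {"N": "Ala"}, "G": {"N": "Gly"}},
    "A": {"T": {"G": "Met"}, "A": {"R": "Lys", "Y": "Asn"}},
    "T": {"A": {"R": "Stop"}, "G": {"A": "Stop"}},
}


def represents(key, base):
    """True if every base that `base` may stand for is covered by `key`."""
    if key not in IUPAC or base not in IUPAC:
        return False
    return set(IUPAC[base]) <= set(IUPAC[key])


def lookup(codon, tree=CODON_TREE):
    """Walk the tree one base at a time; None plays the role of 'undefined'."""
    if len(codon) != 3:
        return None
    node = tree
    for base in codon:
        # Enumerate the keys and ask each whether it represents this base.
        for key, child in node.items():
            if represents(key, base):
                node = child
                break
        else:
            return None  # no key matched, so the lookup fails here
    return node
```

With this tree, lookup("GCA") and lookup("GCN") both give "Ala", because the
single key N covers every third base, while lookup("AAN") fails: N is covered
by neither R nor Y, so the codon is genuinely ambiguous.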

This should cut down on the number of items we have to put in the dictionary
considerably, and provide a translation even if the sequence isn't high
quality.  Plus, I already know how to populate an object from text
references thanks to the nucleotide experience.

> We could have two different methods for translation to either RNA or
> protein. We should also take species specific translation into account,
> that's the reason for geneticcode objects. We can have a number of
> codes already predefined like [BCGeneticCode standardCode] as a
> classmethod.
I was thinking of making a single generic method that would handle all
translations, but I guess there's only going to be a few, so specialized
methods make more sense.

> I was thinking a bit about this as well yesterday, and came up with the
> following problem: how do we return multiple frames?
> If you do a translateDNASequence: usingCode: (BCGeneticCode *)code
> inFrame: you just return a BCSequenceProtein.
> But what if you want all frames, or all forward frames? Do we return a
> dictionary of BCSequenceProteins with the frame as key?
> Finally, let's define how we call each frame: -3, -2, -1, +1, +2, +3?
If a method can return more than one result, clearly it should return an
array.  As for frames, I think the non-zero integers are the way to go - we
should try to make usage familiar to biologists (unless it's too difficult
or annoying to do so ;).
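
To make the frame numbering concrete, here is a small Python sketch (again
not BioCocoa code; the function names are hypothetical).  It assumes that
frames +1 to +3 read the given strand from offsets 0 to 2, and that frames
-1 to -3 read the reverse complement from the same offsets.  That is one
common convention, not something we have settled on.  It only splits the
sequence into codons and leaves translation out.

```python
# Complement table for unambiguous DNA bases only.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]


def codons_in_frame(seq, frame):
    """Split `seq` into complete codons for one frame in -3..-1 or +1..+3."""
    if frame not in (-3, -2, -1, 1, 2, 3):
        raise ValueError("frame must be a non-zero integer from -3 to +3")
    # Negative frames read the opposite strand (assumed convention).
    strand = seq if frame > 0 else reverse_complement(seq)
    offset = abs(frame) - 1
    # Stop early enough that only full three-base codons are returned.
    return [strand[i:i + 3] for i in range(offset, len(strand) - 2, 3)]


def all_frames(seq, frames=(1, 2, 3, -1, -2, -3)):
    """Return one result per requested frame, as an array in request order."""
    return [codons_in_frame(seq, frame) for frame in frames]
```

For the sequence "ATGGCC", frame +1 yields ATG and GCC, and frame -1 yields
GGC and CAT.  Since the caller chooses the order of the frames it asks for,
an array is enough and no dictionary keyed by frame is needed.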



This mind intentionally left blank
