[Biococoa-dev] More on BCSymbolSets
Koen van der Drift
kvddrift at earthlink.net
Sun Feb 27 20:05:42 EST 2005
Hi,
I was looking at the BCSymbolSet code again, to make more use of it in
the BCSequence code. However, with the new BCSequence class structure
in place, I am not yet sure how to do this. For instance, we have the
following method in each subclass:
- (id) initWithString:(NSString *)entry
skippingUnknownSymbols:(BOOL)skipFlag;
I guess these are intended to be the designated initializers, although
they have not been labeled as such in all classes. Now in BCSymbolSet
we have the following (e.g. for DNA):
dnaStrictSymbolSet (for C G T A) and dnaSymbolSet (for all possible
nucleotides, including the ambiguous ones).
Similar symbol sets are available for the other sequence types. Either
symbol set is possible in the method above; the skipFlag is not related
to either one. So what I can do is test for ambiguous symbols
immediately when creating the sequence (using
containsAmbiguousSymbols), and set the appropriate symbol set based on
that. Or even, to avoid a double iteration, test isCompoundSymbol on
each symbol as it is added.
I think this code should only go in the designated initializer, because
that should be called by all other initializers. Would this be a
reasonable approach?
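
To make the single-pass idea concrete, here is a rough sketch in plain
C (which also compiles as Objective-C). It is not BioCocoa code: the
names dna_symbolset, classify_dna and in_set are made up for
illustration, and it assumes that the ambiguous set corresponds to the
IUPAC nucleotide ambiguity codes. The enum values only stand in for
dnaStrictSymbolSet and dnaSymbolSet.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for dnaStrictSymbolSet and dnaSymbolSet. */
typedef enum { SYMBOLSET_DNA_STRICT, SYMBOLSET_DNA_AMBIGUOUS } dna_symbolset;

/* Unambiguous bases. */
static const char *kStrictBases = "ACGT";
/* Assumption: the ambiguous symbols are the IUPAC ambiguity codes. */
static const char *kAmbiguousBases = "RYSWKMBDHVN";

/* Returns nonzero if c is one of the characters in set. */
static int in_set(const char *set, char c) {
    return c != '\0' && strchr(set, c) != NULL;
}

/* Walks the string once. Counts the recognized symbols and switches to
   the ambiguous symbol set as soon as one ambiguous symbol is seen, so
   no second pass over the sequence is needed. */
dna_symbolset classify_dna(const char *entry, size_t *symbol_count) {
    dna_symbolset result = SYMBOLSET_DNA_STRICT;
    size_t count = 0;
    for (const char *p = entry; *p != '\0'; ++p) {
        char c = (char)toupper((unsigned char)*p);
        if (in_set(kStrictBases, c)) {
            ++count;
        } else if (in_set(kAmbiguousBases, c)) {
            ++count;
            result = SYMBOLSET_DNA_AMBIGUOUS;
        }
        /* Anything else is an unknown character and is ignored here. */
    }
    if (symbol_count != NULL)
        *symbol_count = count;
    return result;
}
```

In the real classes the same check would presumably sit inside the loop
of the designated initializer, using isCompoundSymbol on each symbol
instead of the character tables above.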
Then of course we have the 'unknown symbols' flag. I am still not sure
what its purpose is. Is it to prevent illegal characters from being
converted to a symbol? This could happen if the string contains
numbers or other characters not defined as symbols. I noticed that
the implementation of the skip flag is slightly different in the code
for proteins versus that for DNA/RNA.
For proteins it looks like:
if ( (skipFlag==NO) || (aminoAcid!=[BCAminoAcid undefined]) )
[tempSequence addObject: aminoAcid];
For DNA/RNA it looks like:
if ( aBase != [BCNucleotideDNA undefined] )
[tempSequence addObject: aBase];
else {
if ( !skipFlag )
[tempSequence addObject: [BCNucleotideDNA undefined]];
}
The protein code adds aminoAcid itself when skipFlag is NO, while the
DNA/RNA code explicitly adds an undefined symbol. I guess we should
settle on one form. Does anyone have a preference?
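
For comparison, the two forms can be put side by side in a small plain
C sketch (again valid Objective-C, and again not BioCocoa code). The
names lookup_symbol, build_protein_style, build_dna_style and
UNDEFINED_SYMBOL are hypothetical. The sketch assumes that the lookup
returns the undefined symbol for any unrecognized character. Under that
assumption both forms give the same result. If the protein lookup could
return something else for an unknown character (nil, for instance), the
first form would try to add that instead, and the two would differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for [BCNucleotideDNA undefined]. */
#define UNDEFINED_SYMBOL '?'

/* Hypothetical lookup. Assumption: unknown characters map to the
   undefined symbol rather than to a null value. */
static char lookup_symbol(char c) {
    return (c != '\0' && strchr("ACGT", c) != NULL) ? c : UNDEFINED_SYMBOL;
}

/* Same test as the protein code: add whatever the lookup returned
   unless it is undefined and skipFlag is set.
   out must have room for strlen(entry) + 1 characters. */
size_t build_protein_style(const char *entry, bool skipFlag, char *out) {
    size_t n = 0;
    for (; *entry != '\0'; ++entry) {
        char s = lookup_symbol(*entry);
        if (skipFlag == false || s != UNDEFINED_SYMBOL)
            out[n++] = s;
    }
    out[n] = '\0';
    return n;
}

/* Same test as the DNA/RNA code: add known symbols directly, and add
   an explicit undefined symbol only when skipFlag is not set. */
size_t build_dna_style(const char *entry, bool skipFlag, char *out) {
    size_t n = 0;
    for (; *entry != '\0'; ++entry) {
        char s = lookup_symbol(*entry);
        if (s != UNDEFINED_SYMBOL) {
            out[n++] = s;
        } else {
            if (!skipFlag)
                out[n++] = UNDEFINED_SYMBOL;
        }
    }
    out[n] = '\0';
    return n;
}
```

With the input "AC1GT", both functions keep four symbols when skipFlag
is set and five (with the undefined marker in the third position) when
it is not.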
thanks,
- Koen.