[Biococoa-dev] rangeOfSubsequence fix
Koen van der Drift
kvddrift at earthlink.net
Wed Sep 1 20:33:34 EDT 2004
Hi,
I posted this a few days ago, but got no answer yet - maybe everyone's
busy ;)
Anyway, if you get the time to look into this, I'd appreciate it.
----------
>> What I am trying to get at is to see if it is possible to have a
>> separate method to test the entrySymbol and selfSymbol that goes in
>> BCNucleotideDNA and BCAminoAcid (or BCSequenceDNA and
>> BCSequenceProtein). Then we can keep all the rangeOfSubsequence in
>> BCSequence.
> Perhaps it's indeed a good plan to let BCSymbol have the
> isRepresentedBySymbol method that BCNucleotideDNA overrides to check
> for ambiguous bases as well, that way BCSequence could have the
> general methods.
>
Here is my suggestion, but I don't know if this will work. Make the
following method in BCSymbol:
- (BOOL) isRepresentedBySymbol: (BCSymbol *)aSymbol
{
    return (self == aSymbol);
}
Then override this for BCNucleotideDNA to do all the ambiguity testing.
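To make the override idea concrete, here is a rough, untested sketch. The SketchSymbol and SketchBase classes are made-up stand-ins for BCSymbol and BCNucleotideDNA, and the bitmask encoding is only an assumption for illustration; the real BCNucleotideDNA would do its ambiguity testing however it already does.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSymbol: a symbol is only represented by itself.
@interface SketchSymbol : NSObject
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol;
@end

@implementation SketchSymbol
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol
{
    return (self == aSymbol);
}
@end

// Hypothetical stand-in for BCNucleotideDNA. The bitmask is an assumption
// made for this sketch: A = 1, C = 2, G = 4, T = 8, so N (any base) = 15.
@interface SketchBase : SketchSymbol
{
    unsigned int mask;
}
- (id)initWithMask:(unsigned int)aMask;
- (unsigned int)mask;
@end

@implementation SketchBase
- (id)initWithMask:(unsigned int)aMask
{
    self = [super init];
    if (self != nil)
        mask = aMask;
    return self;
}

- (unsigned int)mask
{
    return mask;
}

// Overrides the identity test: a base is represented by any symbol whose
// mask covers every bit of its own mask (so A is represented by N).
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol
{
    if ([super isRepresentedBySymbol:aSymbol])
        return YES;
    if (![aSymbol isKindOfClass:[SketchBase class]])
        return NO;
    unsigned int other = [(SketchBase *)aSymbol mask];
    return mask != 0 && (mask & other) == mask;
}
@end
```

With this, A is represented by N but N is not represented by A, so code in the sequence class that wants a match in either direction has to ask both ways.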
Now the method - (NSRange) rangeOfSubsequence: (BCSequence *)entry
withinRange: (NSRange)theLimit can be much simplified and only needs to
be in BCSequence:
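- (NSRange) rangeOfSubsequence: (BCSequence *)entry withinRange: (NSRange)theLimit
{
    // do bounds checking: the limit may end at, but not beyond, the last symbol
    if ( NSMaxRange( theLimit ) > [sequenceArray count] )
        return NSMakeRange( NSNotFound, 0 );
    // an empty entry, or one longer than the region, can never match
    if ( [entry length] == 0 || [entry length] > theLimit.length )
        return NSMakeRange( NSNotFound, 0 );
    // get the region to check
    NSArray *subSequence = [sequenceArray subarrayWithRange: theLimit];
    unsigned int loopCounter;
    // last position at which the whole entry still fits in the region
    unsigned int lastStart = [subSequence count] - [entry length];
    BCSymbol *entrySymbol, *selfSymbol;
    BOOL haveMatch = NO;
    for ( loopCounter = 0 ; loopCounter <= lastStart ; loopCounter++ ) {
        selfSymbol = [subSequence objectAtIndex: loopCounter];
        entrySymbol = [entry symbolAtIndex: 0];
        // NOTE: only the first symbol is compared here; the remaining
        // symbols of entry still have to be checked before returning
        haveMatch = [selfSymbol isRepresentedBySymbol: entrySymbol];
        if ( haveMatch )
            // report the position relative to the whole sequence
            return NSMakeRange( theLimit.location + loopCounter, [entry length] );
    }
    // went through the whole sequence without finding anything
    return NSMakeRange( NSNotFound, 0 );
}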
The same can be done for - (NSArray *) rangesOfSubsequence: (BCSequence *)entry.
There is no need for having the same code in a super and derived class,
so it should all be removed from BCSequenceDNA.
However, just copy-pasting the following into BCNucleotideDNA won't work,
because 'entry' and 'subSequence' belong to the rangesOfSubsequence
method and are not visible from the symbol class.
if ( [selfBase isRepresentedByBase: entryBase] || [entryBase isRepresentedByBase: selfBase] ) {
    haveMatch = YES;
    innerCounter = 1;
    // go through and compare each base
    while ( innerCounter < [entry length] ) {
        selfBase = [subSequence objectAtIndex: loopCounter + innerCounter];
        entryBase = [entry symbolAtIndex: innerCounter];
        if ( ![selfBase isRepresentedByBase: entryBase] && ![entryBase isRepresentedByBase: selfBase] ) {
            // exit without a match if we fail that comparison
            innerCounter = [entry length];
            haveMatch = NO;
        }
        innerCounter ++;
    }
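One possibility, as a rough and untested sketch: leave the inner loop in the sequence class and let it ask the symbols through isRepresentedBySymbol:, so that only the per-symbol test differs between DNA and protein. Everything named Demo... below is made up for this sketch (a free function working on plain NSArrays instead of the real BCSequence method, and a bitmask symbol instead of the real BCSymbol classes).

```objc
#import <Foundation/Foundation.h>

// Hypothetical minimal symbol; stands in for BCSymbol and its subclasses.
@interface DemoSymbol : NSObject
{
    unsigned int mask;   // assumed encoding: A = 1, C = 2, G = 4, T = 8
}
- (id)initWithMask:(unsigned int)aMask;
- (unsigned int)mask;
- (BOOL)isRepresentedBySymbol:(DemoSymbol *)aSymbol;
@end

@implementation DemoSymbol
- (id)initWithMask:(unsigned int)aMask
{
    self = [super init];
    if (self != nil)
        mask = aMask;
    return self;
}

- (unsigned int)mask
{
    return mask;
}

// YES when aSymbol covers every base this symbol stands for.
- (BOOL)isRepresentedBySymbol:(DemoSymbol *)aSymbol
{
    return mask != 0 && (mask & [aSymbol mask]) == mask;
}
@end

// YES if either symbol is represented by the other, as in the fragment above.
static BOOL DemoSymbolsMatch(DemoSymbol *a, DemoSymbol *b)
{
    return [a isRepresentedBySymbol:b] || [b isRepresentedBySymbol:a];
}

// Hypothetical free function; in BCSequence this would be the method body.
// Returns the first match inside limit, relative to the whole sequence.
static NSRange DemoRangeOfSubsequence(NSArray *sequence, NSArray *entry, NSRange limit)
{
    NSRange notFound = NSMakeRange(NSNotFound, 0);
    NSUInteger entryLength = [entry count];

    if (NSMaxRange(limit) > [sequence count])
        return notFound;
    if (entryLength == 0 || entryLength > limit.length)
        return notFound;

    // last start position at which the whole entry still fits in limit
    NSUInteger lastStart = NSMaxRange(limit) - entryLength;
    NSUInteger start, offset;
    for (start = limit.location; start <= lastStart; start++) {
        BOOL haveMatch = YES;
        // compare symbol by symbol, stopping at the first mismatch
        for (offset = 0; offset < entryLength && haveMatch; offset++) {
            haveMatch = DemoSymbolsMatch([sequence objectAtIndex:start + offset],
                                         [entry objectAtIndex:offset]);
        }
        if (haveMatch)
            return NSMakeRange(start, entryLength);
    }
    return notFound;
}
```

The search itself never needs to know whether it is looking at nucleotides or amino acids; a protein symbol would simply keep the plain identity test.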
I don't want to break the algorithm, so if anyone has a suggestion how
to approach this, could you commit it or post it here?
thanks,
- Koen.