[Biococoa-dev] rangeOfSubsequence fix

Koen van der Drift kvddrift at earthlink.net
Wed Sep 1 20:33:34 EDT 2004


Hi,

I posted this a few days ago, but haven't had an answer yet - maybe 
everyone's busy ;)

Anyway, if you get the time to look into this, I'd appreciate it.


----------

>> What I am trying to get at is to see if it is possible to have a 
>> separate method to test the entrySymbol and selfSymbol that goes in 
>> BCNucleotideDNA and BCAminoAcid  (or BCSequenceDNA and 
>> BCSequenceProtein). Then we can keep all the rangeOfSubsequence in 
>> BCSequence.
> Perhaps it's indeed a good plan to let BCSymbol have the 
> isRepresentedBySymbol method that BCNucleotideDNA overrides to check 
> for ambiguous bases as well, that way BCSequence could have the 
> general methods.
>

Here is my suggestion, but I don't know if this will work. Make the 
following method in BCSymbol:


- (BOOL) isRepresentedBySymbol: (BCSymbol *)aSymbol
{
	// default test: a symbol is only represented by itself
	return (self == aSymbol);
}


Then override this for BCNucleotideDNA to do all the ambiguity testing. 
Now the method - (NSRange) rangeOfSubsequence: (BCSequence *)entry 
withinRange: (NSRange)theLimit can be much simplified and only needs to 
be in BCSequence:

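- (NSRange) rangeOfSubsequence: (BCSequence *)entry withinRange: (NSRange)theLimit {
     // do bounds checking: the limit has to lie inside the sequence,
     // and the entry has to fit inside the limit
     if ( theLimit.location + theLimit.length > [sequenceArray count]
          || [entry length] == 0 || [entry length] > theLimit.length )
         return NSMakeRange( NSNotFound, 0 );

     // get the region to check
     NSArray *subSequence = [sequenceArray subarrayWithRange: theLimit];

     int loopCounter;
     // last position where the entry still fits (cannot be negative
     // because of the bounds check above)
     int lastStart = [subSequence count] - [entry length];
     BCSymbol *entrySymbol, *selfSymbol;
     BOOL haveMatch = NO;

     for ( loopCounter = 0 ; loopCounter <= lastStart ; loopCounter++ ) {
         selfSymbol = [subSequence objectAtIndex: loopCounter];
         entrySymbol = [entry symbolAtIndex: 0];

         // NOTE: this only tests the first symbol of entry; the remaining
         // symbols still have to be compared before it is a real match
         haveMatch = [selfSymbol isRepresentedBySymbol: entrySymbol];

         // the returned location is relative to the start of theLimit
         if ( haveMatch )
             return NSMakeRange( loopCounter, [entry length] );
     }

     // went through the whole sequence without finding anything
     return NSMakeRange( NSNotFound, 0 );
}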

The same can be done for 
- (NSArray *) rangesOfSubsequence: (BCSequence *)entry.

There is no need for having the same code in a super and derived class, 
so it should all be removed from BCSequenceDNA.
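
To make the override idea a bit more concrete, here is a minimal sketch. SketchSymbol and SketchBase are made-up stand-ins for BCSymbol and BCNucleotideDNA, the representedBy set is only my assumption about how the ambiguity information could be stored, and it assumes compiling with ARC. It is not the actual BioCocoa code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for BCSymbol: the default test is plain identity.
@interface SketchSymbol : NSObject
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol;
@end

@implementation SketchSymbol
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol {
    return self == aSymbol;
}
@end

// Hypothetical stand-in for BCNucleotideDNA: each base keeps the set of
// (ambiguous) bases that may represent it. Assumes ARC.
@interface SketchBase : SketchSymbol
@property (nonatomic, strong) NSMutableSet *representedBy;
@end

@implementation SketchBase
- (instancetype)init {
    if ((self = [super init])) {
        _representedBy = [NSMutableSet set];
    }
    return self;
}

// Override: identity still counts, and so does any base in representedBy.
- (BOOL)isRepresentedBySymbol:(SketchSymbol *)aSymbol {
    return self == aSymbol || [self.representedBy containsObject:aSymbol];
}
@end
```

If an ambiguous base R is added to the representedBy set of A, then [a isRepresentedBySymbol: r] is YES but [r isRepresentedBySymbol: a] stays NO. The test is one-directional in this sketch, which is presumably why the existing code checks both directions.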

However, just copy-pasting the following into BCNucleotideDNA won't 
work, because 'entry' and 'subSequence' are local variables of the 
rangesOfSubsequence method.


         if ( [selfBase isRepresentedByBase: entryBase] || [entryBase isRepresentedByBase: selfBase] ) {
             haveMatch = YES;
             innerCounter = 1;
             // go through and compare each base
             while ( innerCounter < [entry length] ) {
                 selfBase = [subSequence objectAtIndex: loopCounter + innerCounter];
                 entryBase = [entry symbolAtIndex: innerCounter];

                 if ( ![selfBase isRepresentedByBase: entryBase] && ![entryBase isRepresentedByBase: selfBase] ) {
                     // exit without a match if we fail that comparison
                     innerCounter = [entry length];
                     haveMatch = NO;
                 }
                 innerCounter++;
             }
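
One possibility might be to leave this whole loop in the general search and only hand the per-symbol test to the symbols. Below is a rough, untested-against-BioCocoa sketch of that idea. SketchRangeOfSubsequence, SketchMatchFunction and SketchLettersMatch are made-up names, plain NSArrays of one-letter strings stand in for the sequences, and the function pointer stands in for the symmetric isRepresentedBySymbol: test. Unlike the method above, the sketch returns a location relative to the whole sequence.

```objc
#import <Foundation/Foundation.h>

// Hypothetical per-symbol test supplied by the caller; it stands in for
// checking isRepresentedBySymbol: in both directions.
typedef BOOL (*SketchMatchFunction)(id selfSymbol, id entrySymbol);

// Returns the first range inside 'limit' where every symbol of 'entry'
// matches the symbol of 'sequence' at the same offset, or
// {NSNotFound, 0} if there is none. The location is relative to 'sequence'.
static NSRange SketchRangeOfSubsequence(NSArray *sequence, NSArray *entry,
                                        NSRange limit,
                                        SketchMatchFunction matches)
{
    NSUInteger entryLength = [entry count];

    // bounds checking: the limit has to lie inside the sequence,
    // and the entry has to fit inside the limit
    if (limit.location > [sequence count]
        || limit.length > [sequence count] - limit.location
        || entryLength == 0 || entryLength > limit.length)
        return NSMakeRange(NSNotFound, 0);

    NSUInteger lastStart = limit.location + limit.length - entryLength;
    NSUInteger start, offset;

    for (start = limit.location; start <= lastStart; start++) {
        BOOL haveMatch = YES;
        // compare every symbol of the entry, stopping at the first failure
        for (offset = 0; offset < entryLength && haveMatch; offset++) {
            haveMatch = matches([sequence objectAtIndex:start + offset],
                                [entry objectAtIndex:offset]);
        }
        if (haveMatch)
            return NSMakeRange(start, entryLength);
    }

    // went through the whole limit without finding anything
    return NSMakeRange(NSNotFound, 0);
}

// Example test for one-letter NSString symbols: equal letters match, and
// the letter N matches anything (an assumption made only for this sketch).
static BOOL SketchLettersMatch(id selfSymbol, id entrySymbol)
{
    return [selfSymbol isEqual:entrySymbol]
        || [selfSymbol isEqual:@"N"] || [entrySymbol isEqual:@"N"];
}
```

Searching A C G T A C for G N A over the whole sequence with SketchLettersMatch gives the range {2, 3}. The point of the design is that the loop never needs to know what kind of symbols it is comparing, so it would only have to exist once.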


I don't want to break the algorithm, so if anyone has a suggestion for 
how to approach this, could you commit it or post it here?


thanks,


- Koen.



