[Biococoa-dev] Design question

Alexander Griekspoor mek at mekentosj.com
Thu Aug 5 14:10:29 EDT 2004


Hi John and others,

My thoughts on this issue, starting with replies to the points you
brought up.

> I could see three options:
> The sites could be handled as an NSIndexSet, but that won't work for
> the ORFs and is 10.3 only.

As we discussed, 10.2 will be the target OS, so I guess that rules this
option out.

> Another option would be to store Ranges as NSValues and return an
> array of them.  This would be very convenient internally, but wouldn't
> allow convenient saving of the information, since NSValues would need
> to be encoded before saving.
True, but wait to see my answer on that later.

> The final thing would be to add a category to NSDictionary that would
> add the methods "storeRangeInDictionary" and
> "retrieveRangeFromDictionary" that would just make length and location
> keys in a dictionary.
I think it is handy to add a number of these categories anyway, but
then name them according to the general Cocoa scheme, which would be
-rangeForKey: and -setRange:forKey:. Similarly, I often have to add
-rectForKey: and -colorForKey: myself; it would be nice to add all of
these to the framework.
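
A minimal sketch of what such a category pair could look like, assuming
the range is stored as a nested dictionary with "location" and "length"
keys as John describes. The category name BCRangeStorage is made up for
illustration, and nothing here is existing framework code.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category name; only the selector names come from the mail.
@interface NSDictionary (BCRangeStorage)
- (NSRange)rangeForKey:(id)key;
@end

@interface NSMutableDictionary (BCRangeStorage)
- (void)setRange:(NSRange)range forKey:(id)key;
@end

@implementation NSDictionary (BCRangeStorage)

- (NSRange)rangeForKey:(id)key
{
    NSDictionary *entry = [self objectForKey:key];
    // Assumption: a missing key is reported as {NSNotFound, 0}.
    if (entry == nil) return NSMakeRange(NSNotFound, 0);
    return NSMakeRange([[entry objectForKey:@"location"] unsignedLongValue],
                       [[entry objectForKey:@"length"] unsignedLongValue]);
}

@end

@implementation NSMutableDictionary (BCRangeStorage)

- (void)setRange:(NSRange)range forKey:(id)key
{
    // Numbers in a dictionary are property-list types, so the result
    // can be saved without any extra encoding step.
    NSDictionary *entry = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithUnsignedLong:(unsigned long)range.location], @"location",
        [NSNumber numberWithUnsignedLong:(unsigned long)range.length], @"length",
        nil];
    [self setObject:entry forKey:key];
}

@end
```

Because the stored entry consists only of dictionaries and numbers, a
dictionary filled this way can be written out as a property list as is.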

> Does anyone have a preference about how to handle it?
My option is not there ;-) Option 4 is what I tend to end up with every
time I tackle this kind of problem. It is a solution that is both very
Cocoa-like and very BioPerl/BioJava-like: model everything as classes.
My opinion is that we should write a class for each "module" that
exists in the real world. In this case: a "restriction enzyme" class, a
"digestion" class, a "cut" class, a "DNA fragment" class (which can be
just the general "DNA sequence" class), etc. I have used this approach
in the upcoming version of EnzymeX and have attached the header files of
two classes to give you an idea. The nice thing is that it is now easy
to extend and model things. For instance, when drawing all cuts in a
plasmid I ran into the problem that too many enzymes were drawn on top
of each other in multiple cloning sites. My multiple cloning site class
was the solution, as I could separate the "cuts" into those inside and
those outside multiple cloning sites and draw the two categories
differently. The attached digest class (which represents a cut
fragment) is also very elegant, as it allows sorting on length, cut
position, etc. The nice thing is that you can add these objects to
arrays and dictionaries as well.

To comment further on the attached classes: of course NSCoding support
should be added, which is simple and adds direct reading and writing of
arrays and dictionaries that contain these objects. Also, I should
stick more closely to key-value coding (it's at 90% now), as it is the
basis for bindings. Don't look too closely at the class nomenclature
here; we should clearly pick better names. Finally, see how easy it is
to add things like sorting: [myArrayWithFragments
sortedArrayUsingSelector: @selector(sortResultsOnLengthDescending:)];
returns an array with all fragments sorted by size.
Also, I have added the - (NSString *) description; method, which makes
debugging so much easier. The implementation:
- (NSString *)description
{
    if ([self nrOfCuts] > 0) {
        return [NSString stringWithFormat:
            @"EXMapMCS: %d %@ ---- %d %@, %d cuts",
            [[self firstCut] position], [[self firstCut] enzyme],
            [[self lastCut] position], [[self lastCut] enzyme],
            [self nrOfCuts]];
    }
    return @"EXMapMCS: empty";
}
Now you can just log an element of an array:
NSLog(@"%@", [myArrayWithFragments objectAtIndex: 0]);
and it shows you all details of the first object in the array.
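
To make the NSCoding and sorting points concrete, here is a small
sketch. BCFragmentSketch is a hypothetical stand-in for EXMapDigest:
only the selector sortResultsOnLengthDescending: comes from the
attached header, and the exclusive "end" position and the archive key
names are assumptions of mine.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for EXMapDigest, reduced to two fields.
@interface BCFragmentSketch : NSObject <NSCoding> {
    int start;  // first position of the fragment
    int end;    // position just past the fragment (assumed exclusive)
}
- (id)initWithStart:(int)aStart end:(int)anEnd;
- (int)length;
- (NSComparisonResult)sortResultsOnLengthDescending:(BCFragmentSketch *)other;
@end

@implementation BCFragmentSketch

- (id)initWithStart:(int)aStart end:(int)anEnd
{
    self = [super init];
    if (self != nil) {
        start = aStart;
        end = anEnd;
    }
    return self;
}

- (int)length
{
    return end - start;
}

// NSCoding with keyed archiving: once these two methods exist, arrays
// and dictionaries holding fragments can be archived as a whole.
- (void)encodeWithCoder:(NSCoder *)coder
{
    [coder encodeInt:start forKey:@"start"];
    [coder encodeInt:end forKey:@"end"];
}

- (id)initWithCoder:(NSCoder *)coder
{
    return [self initWithStart:[coder decodeIntForKey:@"start"]
                           end:[coder decodeIntForKey:@"end"]];
}

// Comparison method for -sortedArrayUsingSelector:, longest first.
- (NSComparisonResult)sortResultsOnLengthDescending:(BCFragmentSketch *)other
{
    if ([self length] > [other length]) return NSOrderedAscending;
    if ([self length] < [other length]) return NSOrderedDescending;
    return NSOrderedSame;
}

- (NSString *)description
{
    return [NSString stringWithFormat:@"BCFragmentSketch: %d-%d (%d bp)",
        start, end, [self length]];
}

@end
```

With this in place, [NSKeyedArchiver archivedDataWithRootObject:
myArrayWithFragments] would archive the whole array in one call, and
the sorting line above works unchanged.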

One could think of implementing many methods in these objects, like
-stringRepresentation for a DNA sequence class, or -complement.

Finally, the really nice thing, which saved a lot of time in EnzymeX,
was adding the
- (BOOL)hitTest: (int)pos;
method. I wanted users to be able to select a fragment or cut by
clicking. Now it's simply a matter of asking each fragment in the
array, in a loop, whether it is "hit", and there you go. Of course, we
should give the method a different, more general name, like
-containsPosition: or -startsAt:.
These are the so-called convenience methods, which make life, well,
convenient ;-)
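
A rough sketch of that loop, using the more general name
-containsPosition: suggested above. The class BCSpanSketch, the helper
function and the half-open interval convention are all assumptions for
illustration; the EnzymeX implementation of -hitTest: is not shown in
this mail.

```objc
#import <Foundation/Foundation.h>

// Hypothetical class holding just a start and an end position.
@interface BCSpanSketch : NSObject {
    int start;
    int end;
}
- (id)initWithStart:(int)aStart end:(int)anEnd;
- (BOOL)containsPosition:(int)pos;
@end

@implementation BCSpanSketch

- (id)initWithStart:(int)aStart end:(int)anEnd
{
    self = [super init];
    if (self != nil) {
        start = aStart;
        end = anEnd;
    }
    return self;
}

// Assumption: start is inclusive and end is exclusive.
- (BOOL)containsPosition:(int)pos
{
    return (pos >= start && pos < end);
}

@end

// Ask each span in turn whether it is "hit"; return the first one
// that says yes, or nil if the position falls outside all of them.
static BCSpanSketch *BCFirstSpanContainingPosition(NSArray *spans, int pos)
{
    NSEnumerator *enumerator = [spans objectEnumerator];
    BCSpanSketch *span;
    while ((span = [enumerator nextObject]) != nil) {
        if ([span containsPosition:pos]) return span;
    }
    return nil;
}
```

The view code then only has to translate a mouse click into a sequence
position and hand it to the helper.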

All in all, I think this option will give us a versatile and super
flexible/extensible framework. Everyone can add small but handy
methods without worrying about breaking stuff. Moreover, it allows easy
passing of objects throughout the framework. For instance, say we
wanted to add alignments: as input we would use our own objects. The
seqio controller knows how to write and convert these, etc. This also
leaves plenty of room for attaching bindings, exporting to index sets,
etc.

This is what, in my opinion, BCFoundation should look like, and I
think it is something the other Bio* frameworks have chosen as well:
many classes representing all the kinds of entities we use. This is
also the way Cocoa's Foundation works (NSString, NSRect, NSDictionary,
NSArray, NSRange, etc.).

The only question left is whether we need a controller object for
certain tasks, for instance an "Alignment controller" or a "Digestion
controller". I think that wouldn't be such a bad idea. Normally this
logic goes into your application's code, the way EnzymeX does all the
drawing and supervises the hit testing, digestion, etc. But as we are
writing a framework, we can't rely on other people writing this; it
would simply be too much work for them to figure out how we, the
BioCocoa developers, intended it to be done. Therefore we have to do
that ourselves as well, so that to use the framework one only needs to
instantiate a controller, which handles all the logic behind the
scenes. The user would just have to tell the "Digestion controller" the
sequence, enzymes, etc., and would get back an array of fragments. That
is what my ideal world would look like.
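
To give an idea of how little the caller would have to do, here is a
deliberately simplified sketch. The class and method names are invented
for illustration. It treats the sequence as linear, cuts at the start
of each recognition site (real enzymes cut at an offset within or near
the site), and returns plain NSStrings where the design above would
return fragment objects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical controller; not part of EnzymeX or BioCocoa.
@interface BCDigestControllerSketch : NSObject
- (NSArray *)fragmentsOfSequence:(NSString *)sequence
                    cutWithSites:(NSArray *)sites;
@end

@implementation BCDigestControllerSketch

- (NSArray *)fragmentsOfSequence:(NSString *)sequence
                    cutWithSites:(NSArray *)sites
{
    NSMutableArray *cuts = [NSMutableArray array];
    NSMutableArray *fragments = [NSMutableArray array];
    NSEnumerator *siteEnum = [sites objectEnumerator];
    NSEnumerator *cutEnum;
    NSString *site;
    NSNumber *cut;
    unsigned long previous = 0;

    // Collect the start position of every occurrence of every site.
    while ((site = [siteEnum nextObject]) != nil) {
        NSRange search = NSMakeRange(0, [sequence length]);
        while (search.length > 0) {
            NSRange hit = [sequence rangeOfString:site
                                          options:NSLiteralSearch
                                            range:search];
            if (hit.location == NSNotFound) break;
            [cuts addObject:[NSNumber numberWithUnsignedLong:
                (unsigned long)hit.location]];
            // Continue one position further to catch overlapping sites.
            search.location = hit.location + 1;
            search.length = [sequence length] - search.location;
        }
    }
    [cuts sortUsingSelector:@selector(compare:)];

    // Slice the sequence at each cut, skipping duplicates and a cut at 0.
    cutEnum = [cuts objectEnumerator];
    while ((cut = [cutEnum nextObject]) != nil) {
        unsigned long pos = [cut unsignedLongValue];
        if (pos > previous) {
            [fragments addObject:[sequence substringWithRange:
                NSMakeRange(previous, pos - previous)]];
            previous = pos;
        }
    }
    [fragments addObject:[sequence substringFromIndex:previous]];
    return fragments;
}

@end
```

Under these assumptions, digesting @"AAGAATTCTTGGATCCAA" with the sites
@"GAATTC" and @"GGATCC" gives the three fragments "AA", "GAATTCTT" and
"GGATCCAA".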

That leaves the question of how to implement all this and where to
start. I think we should try to get a basic foundation in place first:
the simple objects. We should also decide what to call them; perhaps we
could fill in the scheme John made in further detail. For compatibility
reasons the names could largely follow what BioJava/BioPerl have done,
of course only where that makes sense, and we should keep the Cocoa
names as templates. I am willing to prototype a few of these classes
based on the attached headers below, if you guys think that's a proper
basis of course. Once that's done, we can start adding methods like the
ones John was describing, which should be rewritten to use our
BCFoundation objects. The nice thing is that others can safely add
methods to the objects without breaking the code John adds. Also very
nice: John can add convenience methods to those classes himself when he
finds that they help simplify his ORF finding method.

Well, enough writing; I'm sure this is enough for everyone to comment
on ;-) I'm curious what you guys think of this; let me know if you
think differently or have something to add. After that we can talk
about the practical implementation...
Cheers,
Alex

//
//  EXMapDigest.h
//  EnzymeX
//
//  Created by Alexander Griekspoor on Fri Nov 07 2003.
//  Copyright (c) 2003 __MyCompanyName__. All rights reserved.
//

#import <Foundation/Foundation.h>


@interface EXMapDigest : NSObject {

    // ===========================================================================
    #pragma mark --- VARIABLES & PROPERTIES
    // ===========================================================================

    int start;
    int end;
    int startcut5;
    int endcut5;
    int constructlength;

    NSString *startEnzyme;
    NSString *endEnzyme;

    int startCutPosition;
    int endCutPosition;

    BOOL isSelected;
}


// ===========================================================================
#pragma mark --- INIT & DEALLOC
// ===========================================================================

- (id)init;
- (void)dealloc;


// ===========================================================================
#pragma mark --- ACCESSOR METHODS
// ===========================================================================

- (int)start;
- (void)setStart:(int)newStart;

- (int)end;
- (void)setEnd:(int)newEnd;

- (int)startcut5;
- (void)setStartcut5:(int)newStartcut5;

- (int)endcut5;
- (void)setEndcut5:(int)newEndcut5;

- (int)constructlength;
- (void)setConstructlength:(int)newConstructlength;

- (NSString *)startEnzyme;
- (void)setStartEnzyme:(NSString *)newStartEnzyme;

- (NSString *)endEnzyme;
- (void)setEndEnzyme:(NSString *)newEndEnzyme;

- (BOOL)isSelected;
- (void)setIsSelected:(BOOL)newIsSelected;


// ===========================================================================
#pragma mark --- GENERAL METHODS
// ===========================================================================

- (NSString *)description;

- (int)startCutPosition;
- (int)endCutPosition;

- (int)length;
- (float)percentage;

- (BOOL)hitTest:(int)pos;


// ===========================================================================
#pragma mark --- UTILITY & CONVERTER METHODS
// ===========================================================================

- (NSComparisonResult)sortResultsOnPositionDescending:(EXMapDigest *)dig;
- (NSComparisonResult)sortResultsOnPositionAscending:(EXMapDigest *)dig;

- (NSComparisonResult)sortResultsOnLengthDescending:(EXMapDigest *)dig;
- (NSComparisonResult)sortResultsOnLengthAscending:(EXMapDigest *)dig;


@end

//
//  EXMapMCS.h
//  EnzymeX
//
//  Created by Alexander Griekspoor on Fri Nov 07 2003.
//  Copyright (c) 2003 __MyCompanyName__. All rights reserved.
//

#import <Foundation/Foundation.h>

@class EXMapCut;

@interface EXMapMCS : NSObject {

    // ===========================================================================
    #pragma mark --- VARIABLES & PROPERTIES
    // ===========================================================================

    NSMutableArray *cuts;
    NSRect rect;
}


// ===========================================================================
#pragma mark --- INIT & DEALLOC
// ===========================================================================

- (id)init;
- (void)dealloc;


// ===========================================================================
#pragma mark --- ACCESSOR METHODS
// ===========================================================================

- (NSMutableArray *)cuts;
- (void)setCuts:(NSMutableArray *)newCuts;

- (NSRect)rect;
- (void)setRect:(NSRect)newRect;

- (int)nrOfCuts;
- (EXMapCut *)firstCut;
- (EXMapCut *)lastCut;

- (void)addCut:(EXMapCut *)cut;
- (void)removeAllCuts;


// ===========================================================================
#pragma mark --- GENERAL METHODS
// ===========================================================================

- (NSString *)description;


// ===========================================================================
#pragma mark --- UTILITY & CONVERTER METHODS
// ===========================================================================

- (NSComparisonResult)sortMCSOnPositionDescending:(EXMapMCS *)mcs;
- (NSComparisonResult)sortMCSOnPositionAscending:(EXMapMCS *)mcs;

- (NSComparisonResult)sortMCSOnCountDescending:(EXMapMCS *)mcs;
- (NSComparisonResult)sortMCSOnCountAscending:(EXMapMCS *)mcs;


@end



On 5 Aug 2004, at 16:45, John Timmer wrote:

> I'm going to be writing two methods:  One gives a list of all ORFs
> over a certain size given the size and a DNA sequence, the second will
> list all sites in a sequence, given a site and the sequence.
>
> The question is:  how to return the list?
>
> I could see three options:
> The sites could be handled as an NSIndexSet, but that won't work for
> the ORFs and is 10.3 only.
> Another option would be to store Ranges as NSValues and return an
> array of them.  This would be very convenient internally, but wouldn't
> allow convenient saving of the information, since NSValues would need
> to be encoded before saving.
> The final thing would be to add a category to NSDictionary that would
> add the methods "storeRangeInDictionary" and
> "retrieveRangeFromDictionary" that would just make length and location
> keys in a dictionary.
>
> Does anyone have a preference about how to handle it?
>
> Cheers,
>
> John
*********************************************************
                       ** Alexander Griekspoor **
*********************************************************
                 The Netherlands Cancer Institute
                 Department of Tumorbiology (H4)
           Plesmanlaan 121, 1066 CX, Amsterdam
                     Tel:  + 31 20 - 512 2023
                     Fax:  + 31 20 - 512 2029
                    AIM: mekentosj at mac.com
                     E-mail: a.griekspoor at nki.nl
                 Web: http://www.mekentosj.com

           LabAssistant - Get your life organized!
           http://www.mekentosj.com/labassistant

*********************************************************

