[Biococoa-dev] Design question
Alexander Griekspoor
mek at mekentosj.com
Thu Aug 5 14:10:29 EDT 2004
Hi John and others,
Here are my thoughts on this issue, starting with replies to the points
you brought up.
> I could see three options:
> The sites could be handled as an NSIndexSet, but that won't work for the
> ORFs and is 10.3 only.
As we discussed, 10.2 will be the target OS, so I guess that rules this
option out.
> Another option would be to store Ranges as NSValues and return an array
> of them. This would be very convenient internally, but wouldn't allow
> convenient saving of the information, since NSValues would need to be
> encoded before saving.
True, but see my answer on that further down.
> The final thing would be to add a category to NSDictionary that would add
> the methods "storeRangeInDictionary" and "retrieveRangeFromDictionary"
> that would just make length and location keys in a dictionary.
I think it is handy to add a number of these categories anyway, but
then named according to the general Cocoa scheme, which would be
-rangeForKey: and -setRange:forKey:. Similarly, I often have to add
-rectForKey: and -colorForKey: myself; it would be nice to add all of
these to the framework.
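
A minimal sketch of what such a category pair could look like. The
category name BCRangeStorage is an assumption, and storing the range as
a string is just one possible choice (John's location/length keys would
work equally well); NSStringFromRange and NSRangeFromString are existing
Foundation functions.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category; the name BCRangeStorage is an assumption.
@interface NSDictionary (BCRangeStorage)
- (NSRange)rangeForKey:(id)key;
@end

@interface NSMutableDictionary (BCRangeStorage)
- (void)setRange:(NSRange)range forKey:(id)key;
@end

@implementation NSDictionary (BCRangeStorage)
// Returns the stored range, or {NSNotFound, 0} when the key is missing
// or does not hold a string.
- (NSRange)rangeForKey:(id)key
{
    id value = [self objectForKey:key];
    if (![value isKindOfClass:[NSString class]]) {
        return NSMakeRange(NSNotFound, 0);
    }
    return NSRangeFromString(value);
}
@end

@implementation NSMutableDictionary (BCRangeStorage)
// Stores the range as a string such as "{5, 12}". A string is a plain
// property-list type, so the dictionary can be saved without extra encoding.
- (void)setRange:(NSRange)range forKey:(id)key
{
    [self setObject:NSStringFromRange(range) forKey:key];
}
@end
```

With this in place, [dict setRange:NSMakeRange(5, 12) forKey:@"orf"]
followed by [dict rangeForKey:@"orf"] should give back the same range.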
> Does anyone have a preference about how to handle it?
My option is not there ;-) Option 4 is the one I tend to end up with
every time I tackle this kind of problem, and it is a solution that is
both very Cocoa-like and very BioPerl/BioJava-like: model everything as
classes. My opinion is that we should write a class for each "module"
that exists in the real world. In this case, a "restriction enzyme"
class, a "digestion" class, a "cut" class, a "DNA fragment" class (which
can be just the general "DNA sequence" class), etc. I have used this
approach in the upcoming version of EnzymeX and have attached the header
files of two classes to give you an idea. The nice thing is that it is
now easy to extend and model things. For instance, when drawing all
cuts in a plasmid I ran into the problem that I had too many enzymes
drawn on top of each other in multiple cloning sites. My multiple
cloning site class was the solution, as I could separate the cuts into
those inside and those outside multiple cloning sites and draw both
categories differently. The attached digest class (which represents a
cut fragment) is also very elegant, as it allows sorting on length, cut
position, etc. Another nice thing is that you can add these objects to
arrays and dictionaries as well.
Some further comments on the attached classes. Of course NSCoding
support should be added, which is simple and allows direct reading and
writing of arrays and dictionaries that contain these objects. Also, I
should stick more closely to key-value coding (it's 90% there now), as
it is the basis for bindings. Don't look too closely at the class names
here; we should clearly pick better ones. Finally, see how easy it is
to add things like sorting, for instance:
[myArrayWithFragments sortedArrayUsingSelector:
@selector(sortResultsOnLengthDescending:)]; returns an array with all
fragments sorted on size.
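
To illustrate the sorting idea, here is a small sketch of a comparison
method that -sortedArrayUsingSelector: can call. BCFragment and
-compareLengthDescending: are made-up names for illustration only; this
is not the EnzymeX implementation.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for a digest fragment; the class and method
// names are illustrative and not taken from EnzymeX.
@interface BCFragment : NSObject {
    int start;
    int end;
}
- (id)initWithStart:(int)newStart end:(int)newEnd;
- (int)length;
- (NSComparisonResult)compareLengthDescending:(BCFragment *)other;
@end

@implementation BCFragment
- (id)initWithStart:(int)newStart end:(int)newEnd
{
    self = [super init];
    if (self) {
        start = newStart;
        end = newEnd;
    }
    return self;
}

- (int)length
{
    return end - start;
}

// Longer fragments sort first: returning NSOrderedAscending means the
// receiver should come before the argument.
- (NSComparisonResult)compareLengthDescending:(BCFragment *)other
{
    int mine = [self length];
    int theirs = [other length];
    if (mine > theirs) return NSOrderedAscending;
    if (mine < theirs) return NSOrderedDescending;
    return NSOrderedSame;
}
@end
```

An array of such objects can then be sorted with
[fragments sortedArrayUsingSelector:@selector(compareLengthDescending:)].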
Also, I have added the - (NSString *) description; method which makes
debugging so much easier. The implementation:
- (NSString *)description {
    if ([self nrOfCuts] > 0)
        return [NSString stringWithFormat:
            @"EXMapMCS: %d %@ ---- %d %@, %d cuts",
            [[self firstCut] position], [[self firstCut] enzyme],
            [[self lastCut] position], [[self lastCut] enzyme],
            [self nrOfCuts]];
    else
        return @"EXMapMCS: empty";
}
Now you can just call on an array: NSLog(@"%@", [myArrayWithFragments
objectAtIndex: 0]);
and it shows you all details of the first object in the array.
One could think of implementing many more methods in these objects, like
-stringRepresentation for a DNA sequence class, or -complement.
Finally, the really nice thing, which saved a lot of time in EnzymeX,
was adding the
- (BOOL)hitTest: (int)pos;
method. I wanted users to be able to select a fragment or cut by
clicking. Now it's simply a matter of asking each fragment in the array
in a loop whether it is "hit", and there you go. Of course, we should
give the method a different, more general name, like -containsPosition:
or -startsAt:.
These are the so-called convenience methods, which make life, well,
convenient ;-)
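
As a rough sketch of that loop, assuming a half-open [start, end)
convention for positions: BCRegion, -containsPosition: and
BCRegionAtPosition() are hypothetical names, not existing EnzymeX or
BioCocoa API.

```objc
#import <Foundation/Foundation.h>

// Hypothetical region object (a fragment or a cut); names are illustrative.
@interface BCRegion : NSObject {
    int start;   // first position covered (inclusive)
    int end;     // position just past the region (exclusive)
}
- (id)initWithStart:(int)newStart end:(int)newEnd;
- (BOOL)containsPosition:(int)pos;
@end

@implementation BCRegion
- (id)initWithStart:(int)newStart end:(int)newEnd
{
    self = [super init];
    if (self) {
        start = newStart;
        end = newEnd;
    }
    return self;
}

// The more general replacement for -hitTest:.
- (BOOL)containsPosition:(int)pos
{
    return pos >= start && pos < end;
}
@end

// Asks each region in turn whether it is "hit" and returns the first
// match, or nil when no region contains the position.
static BCRegion *BCRegionAtPosition(NSArray *regions, int pos)
{
    NSUInteger i, count = [regions count];
    for (i = 0; i < count; i++) {
        BCRegion *region = [regions objectAtIndex:i];
        if ([region containsPosition:pos]) {
            return region;
        }
    }
    return nil;
}
```

The view code then only has to translate a click into a sequence
position and pass it to the loop.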
All in all, I think this option will give us a versatile and very
flexible, extensible framework. Everyone can add small but handy
methods without worrying about breaking things. Moreover, it allows easy
passing of objects throughout the framework. For instance, say we
wanted to add alignments: as input we would use our own objects. The
seqio controller knows how to write and convert these, etc. This also
leaves plenty of room for attaching bindings, exporting to index sets,
etc.
This is what, in my opinion, BCFoundation should look like, and I think
it is something the other Bio... frameworks have chosen as well: many
classes representing all the kinds of things we work with. This is also
the way Cocoa's Foundation works (NSString, NSRect, NSDictionary,
NSArray, NSRange, etc.).
The only question left is whether we need a controller object for
certain tasks, for instance an "Alignment controller" or a "Digestion
controller". I think that wouldn't be such a bad idea. Normally this
logic goes into your application's code, just as EnzymeX does all the
drawing and supervises the hit test, digestion, etc. As we are writing a
framework, we can't rely on other people writing this; it would simply
be too much work for them to figure out how we, the BioCocoa developers,
would like them to do it. Therefore, we have to do that ourselves as
well. To use the framework, one would only need to instantiate a
controller, which does all the logic behind the scenes. The user would
just have to tell the "Digestion controller" the sequence, enzymes,
etc., and would get back a fragment array. That is what my ideal world
would look like.
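
A very rough sketch of the core of such a digestion step, written as a
plain function rather than a full controller class. Everything here is
an assumption made for illustration: the name BCDigestRanges, a linear
sequence, a cut placed at the start of every recognition site (real
enzymes cut at an enzyme-specific offset), and fragments returned as
NSValue-wrapped NSRanges instead of proper fragment objects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: cuts a linear sequence at the start of every
// occurrence of `site` and returns the fragments as an array of
// NSValue objects wrapping NSRanges. `site` must not be nil.
static NSArray *BCDigestRanges(NSString *sequence, NSString *site)
{
    NSMutableArray *fragments = [NSMutableArray array];
    NSUInteger length = [sequence length];
    NSUInteger fragmentStart = 0;
    NSRange search = NSMakeRange(0, length);

    while (search.length > 0) {
        NSRange hit = [sequence rangeOfString:site
                                      options:NSLiteralSearch
                                        range:search];
        if (hit.location == NSNotFound) {
            break;
        }
        // Close the current fragment at the cut, unless it would be empty.
        if (hit.location > fragmentStart) {
            NSRange fragment = NSMakeRange(fragmentStart,
                                           hit.location - fragmentStart);
            [fragments addObject:[NSValue valueWithRange:fragment]];
            fragmentStart = hit.location;
        }
        // Continue one position past the hit so overlapping sites are found.
        search.location = hit.location + 1;
        search.length = length - search.location;
    }

    // Whatever remains after the last cut is the final fragment.
    if (fragmentStart < length) {
        NSRange last = NSMakeRange(fragmentStart, length - fragmentStart);
        [fragments addObject:[NSValue valueWithRange:last]];
    }
    return fragments;
}
```

A real controller would take enzyme objects instead of a bare site
string and return fragment objects, but the calling pattern stays the
same: sequence and enzymes in, fragment array out.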
That leaves the question of how to implement all this and where to
start. I think we should try to get a basic foundation in place first:
the simple objects. We should also decide what to call them; perhaps we
could fill in the scheme John made in further detail. For compatibility
reasons the names could be largely based on what BioJava/BioPerl have
done (only where that makes sense, of course), while keeping the Cocoa
names as templates. I am willing to prototype a few of these classes
based on the attached headers below, if you guys think that's a proper
basis of course. Once that's done, we can start adding methods like the
ones John was describing, which should be rewritten to use our
BCFoundation objects. The nice thing is that others can safely add
methods to the objects without breaking the code John adds. Also very
nice: John can add convenience methods to those classes as well, as he
finds out that they can help simplify his ORF-finding method.
Well, enough writing, I'm sure this is enough for everyone to comment
on ;-) I'm curious what you guys think of this; let me know if you think
differently or have something to add. After that we can talk about the
practical implementation...
Cheers,
Alex
//
// EXMapDigest.h
// EnzymeX
//
// Created by Alexander Griekspoor on Fri Nov 07 2003.
// Copyright (c) 2003 __MyCompanyName__. All rights reserved.
//
#import <Foundation/Foundation.h>
@interface EXMapDigest : NSObject {
// ===========================================================================
#pragma mark --- VARIABLES & PROPERTIES
// ===========================================================================
    int start;
    int end;
    int startcut5;
    int endcut5;
    int constructlength;
    NSString *startEnzyme;
    NSString *endEnzyme;
    int startCutPosition;
    int endCutPosition;
    BOOL isSelected;
}
// ===========================================================================
#pragma mark --- INIT & DEALLOC
// ===========================================================================
- (id)init;
- (void)dealloc;
// ===========================================================================
#pragma mark --- ACCESSOR METHODS
// ===========================================================================
- (int)start;
- (void)setStart:(int)newStart;
- (int)end;
- (void)setEnd:(int)newEnd;
- (int)startcut5;
- (void)setStartcut5:(int)newStartcut5;
- (int)endcut5;
- (void)setEndcut5:(int)newEndcut5;
- (int)constructlength;
- (void)setConstructlength:(int)newConstructlength;
- (NSString *)startEnzyme;
- (void)setStartEnzyme:(NSString *)newStartEnzyme;
- (NSString *)endEnzyme;
- (void)setEndEnzyme:(NSString *)newEndEnzyme;
- (BOOL)isSelected;
- (void)setIsSelected:(BOOL)newIsSelected;
// ===========================================================================
#pragma mark --- GENERAL METHODS
// ===========================================================================
- (NSString *)description;
- (int)startCutPosition;
- (int)endCutPosition;
- (int)length;
- (float)percentage;
- (BOOL)hitTest:(int)pos;
// ===========================================================================
#pragma mark --- UTILITY & CONVERTER METHODS
// ===========================================================================
- (NSComparisonResult)sortResultsOnPositionDescending:(EXMapDigest *)dig;
- (NSComparisonResult)sortResultsOnPositionAscending:(EXMapDigest *)dig;
- (NSComparisonResult)sortResultsOnLengthDescending:(EXMapDigest *)dig;
- (NSComparisonResult)sortResultsOnLengthAscending:(EXMapDigest *)dig;
@end
//
// EXMapMCS.h
// EnzymeX
//
// Created by Alexander Griekspoor on Fri Nov 07 2003.
// Copyright (c) 2003 __MyCompanyName__. All rights reserved.
//
#import <Foundation/Foundation.h>
@class EXMapCut;
@interface EXMapMCS : NSObject {
// ===========================================================================
#pragma mark --- VARIABLES & PROPERTIES
// ===========================================================================
    NSMutableArray *cuts;
    NSRect rect;
}
// ===========================================================================
#pragma mark --- INIT & DEALLOC
// ===========================================================================
- (id)init;
- (void)dealloc;
// ===========================================================================
#pragma mark --- ACCESSOR METHODS
// ===========================================================================
- (NSMutableArray *)cuts;
- (void)setCuts:(NSMutableArray *)newCuts;
- (NSRect)rect;
- (void)setRect:(NSRect)newRect;
- (int)nrOfCuts;
- (EXMapCut *)firstCut;
- (EXMapCut *)lastCut;
- (void)addCut:(EXMapCut *)cut;
- (void)removeAllCuts;
// ===========================================================================
#pragma mark --- GENERAL METHODS
// ===========================================================================
- (NSString *)description;
// ===========================================================================
#pragma mark --- UTILITY & CONVERTER METHODS
// ===========================================================================
- (NSComparisonResult)sortMCSOnPositionDescending:(EXMapMCS *)mcs;
- (NSComparisonResult)sortMCSOnPositionAscending:(EXMapMCS *)mcs;
- (NSComparisonResult)sortMCSOnCountDescending:(EXMapMCS *)mcs;
- (NSComparisonResult)sortMCSOnCountAscending:(EXMapMCS *)mcs;
@end
On 5 Aug 2004, at 16:45, John Timmer wrote:
> I'm going to be writing two methods: One gives a list of all ORFs over a
> certain size given the size and a DNA sequence, the second will list all
> sites in a sequence, given a site and the sequence.
>
> The question is: how to return the list?
>
> I could see three options:
> The sites could be handled as an NSIndexSet, but that won't work for the
> ORFs and is 10.3 only.
> Another option would be to store Ranges as NSValues and return an array
> of them. This would be very convenient internally, but wouldn't allow
> convenient saving of the information, since NSValues would need to be
> encoded before saving.
> The final thing would be to add a category to NSDictionary that would add
> the methods "storeRangeInDictionary" and "retrieveRangeFromDictionary"
> that would just make length and location keys in a dictionary.
>
> Does anyone have a preference about how to handle it?
>
> Cheers,
>
> John
>
> _______________________________________________
> This mind intentionally left blank
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
LabAssistant - Get your life organized!
http://www.mekentosj.com/labassistant
*********************************************************