[Biophp-dev] Genephp and Biophp

Nico Stuurman biophp-dev@bioinformatics.org
Mon, 28 Apr 2003 18:02:10 -0700


> I'm certain a "middle ground" can be worked out, but there isn't
> much point in immediately "throwing together" both sets of code as they
> are now - the two sets of code are CURRENTLY approaching from different
> (somewhat incompatible) philosophies.
>

> GenePHP is "a library", and currently seems to be tightly connected to
> itself - for example, the way the GenePHP::Parsers all return only
> GenePHP::seq objects.
>
> BioPHP is "modules", designed to be as independent as possible but
> interoperable (the parsers stand alone and can be used outside the
> context of other BioPHP modules, for example, and they return strings
> or arrays that can either be imported into a sequence object or used
> independently).
>

I have to confess that I have not yet looked too much at Sean's code, in 
part because I couldn't find it until a few days ago.  However, I don't 
completely understand the idea of modules that return arrays and 
strings.  I am talking from the point of view of an application 
programmer (and I do think that both you and Serge target application 
programmers).  For parsers, for instance, I would want them to return 
the same data structure regardless of the format I put into them.  
Also, the parsers should have the same API regardless of the data they 
are parsing.  Otherwise, it will become a pretty confusing bunch of 
scripts.  To me, it seems quite logical to use a class as the output of 
the parsers, but the only thing that is important is that every parser 
returns the same data structure and that they all have the same 
behavior.  The parse class that is now in the biophp cvs (under 
genephp) does just that (and you can easily use that class, together 
with the seq class, without the rest of the genephp classes).
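
To make this concrete, here is a minimal sketch in PHP of two parsers 
that share one entry point and return the same structure.  The function 
names (parse_fasta, parse_tabular, parse_sequences), the array keys and 
the tab-delimited format are made up for illustration; this is not the 
actual API of the parse class in cvs.

```php
<?php
// Hypothetical sketch: names and array keys are illustrative only,
// not the real GenePHP or BioPHP API.

// Every parser builds its records through this one function, so the
// output always has the keys 'id', 'description' and 'sequence'.
function make_record($id, $description, $sequence) {
    return array(
        'id'          => $id,
        'description' => $description,
        // Strip whitespace and normalise case so formats agree.
        'sequence'    => strtoupper(preg_replace('/\s+/', '', $sequence)),
    );
}

// Parse FASTA text: a ">id description" header, then sequence lines.
function parse_fasta($text) {
    $records = array();
    $chunks = preg_split('/^>/m', $text, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($chunks as $chunk) {
        if (trim($chunk) === '') {
            continue;
        }
        $lines  = explode("\n", trim($chunk));
        $header = array_shift($lines);
        $parts  = preg_split('/\s+/', trim($header), 2);
        $records[] = make_record(
            $parts[0],
            isset($parts[1]) ? $parts[1] : '',
            implode('', $lines)
        );
    }
    return $records;
}

// Parse an assumed tab-delimited format: id, description, sequence.
function parse_tabular($text) {
    $records = array();
    foreach (explode("\n", trim($text)) as $line) {
        if (trim($line) === '') {
            continue;
        }
        $fields = array_pad(explode("\t", $line), 3, '');
        $records[] = make_record($fields[0], $fields[1], $fields[2]);
    }
    return $records;
}

// One entry point: same call and same result shape for every format.
function parse_sequences($text, $format) {
    switch ($format) {
        case 'fasta':
            return parse_fasta($text);
        case 'tabular':
            return parse_tabular($text);
        default:
            throw new InvalidArgumentException("Unknown format: $format");
    }
}

$fasta   = ">seq1 test sequence\nACGT\nacgt\n";
$tabular = "seq1\ttest sequence\tACGTACGT\n";

$a = parse_sequences($fasta, 'fasta');
$b = parse_sequences($tabular, 'tabular');
var_dump($a == $b); // compares the two parsed results
```

The application code only ever sees the record structure, so adding a 
new input format does not change anything downstream of the parser.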


> So, if we throw everything together as it is, pretty much everything
> I've done disappears anyway, since it doesn't tightly couple into the
> GenePHP layout.
>

I don't think so, your esearch modules should be immediately useful 
(now that cvs works, I'll have a look at them soon).  We are all here 
talking about setting something up that will be as useful as possible 
to ourselves and everyone else.  We will all sometimes write code that 
becomes superfluous, but the fun thing here is that we can learn from 
each other and help each other and make something that is much better 
than when we were doing it alone.  Sometimes it can hurt a little, but 
if we all try to work together it will be much more rewarding in the 
long run.


> The solution/"middle ground" would mean GenePHP being somewhat
> redesigned to allow for more "independent" modules to be a part of it
> (e.g. to pass data more often between modules/classes as standard
> arrays rather than GenePHP objects, so as to make it easier to "drop
> in" new components) but I'm reluctant to ask that.  Not because I
> don't think it'd be a good idea (obviously I do) but because I hate
> the idea of making demands that other independent volunteer
> programmers rewrite their code to accommodate me.  Maybe I'm just
> insecure that way :-)


I think that independent programmers will be very thankful for simple, 
well designed data-structures that they can use without fuss in their 
programs.  I completely agree with you that we should work towards 
small self-contained modules that can also be used as much as possible 
without needing everything and the kitchen sink.  However, if you don't 
structure the data that are coming in, then why would an application 
programmer even bother to use your code?
As an example, you can now throw 4 different types of sequence data 
file formats at the parse class (as a file or as a string), and it will 
return the same kind of data structure for each of them.  The only 
thing I have to do in my phplabware project is to write the class to 
the structure I have in my SQL database.

> I DO think a less-tight coupling would make it easier to contribute
> modules to the framework, but that's me.

It will be easiest if we all decide on the underlying data structures we 
are going to use (and Serge is doing a great job there, even though I 
don't agree with everything he suggests).  Once the data structures are 
in place (and provided they are good), programming will be easy 
(actually, the better the data structures, the easier the programming).


By the way, I just read the term 'standard arrays' in your previous 
paragraph.  Aren't 'standard arrays' and Genephp objects the same?  I 
mean, the classes that Serge proposes are open to discussion (I hope), 
and classes are nothing more than arrays to which you can add 
functions.  So, aren't we talking about the same thing?
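
A small sketch of what I mean, in PHP.  The Seq class, its fields and 
its methods are hypothetical and only serve to show that the same data 
can move between an array and an object without loss; they are not the 
classes Serge has proposed.

```php
<?php
// Illustrative only: the Seq class and its fields are hypothetical.

// A sequence record as a plain associative array.
$record = array('id' => 'seq1', 'sequence' => 'ACGT');

// The same data as a class: the fields plus functions acting on them.
class Seq {
    public $id;
    public $sequence;

    public function __construct($id, $sequence) {
        $this->id = $id;
        $this->sequence = $sequence;
    }

    // Build an object from the array form.
    public static function fromArray(array $data) {
        return new Seq($data['id'], $data['sequence']);
    }

    // Hand the data back as a plain array of property => value.
    public function toArray() {
        return get_object_vars($this);
    }

    // An example of a function attached to the data.
    public function length() {
        return strlen($this->sequence);
    }
}

$seq = Seq::fromArray($record);
echo $seq->length(), "\n";
var_dump($seq->toArray() == $record); // round trip through the object
```

A module that prefers arrays can call toArray() on whatever it receives, 
and a module that prefers objects can call fromArray(), so the choice 
need not split the code base.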


Again, I hope we can do this all together.



Best,


Nico