[Biophp-dev] Genephp and Biophp
Nico Stuurman
biophp-dev@bioinformatics.org
Mon, 28 Apr 2003 18:02:10 -0700
> I'm certain a "middle ground" can be worked out, but there isn't
> much point in immediately "throwing together" both sets of code as they
> are now - the two sets of code are CURRENTLY approaching from different
> (somewhat incompatible) philosophies.
>
> GenePHP is "a library", and currently seems to be tightly connected to
> itself - for example, the way the GenePHP::Parsers all return only
> GenePHP::seq objects.
>
> BioPHP is "modules", designed to be as independent as possible but
> interoperable (the parsers stand alone and can be used outside the
> context of other BioPHP modules, for example, and they return strings
> or arrays that can either be imported into a sequence object or used
> independently).
>
I have to confess that I have not yet looked too much at Sean's code, in
part because I couldn't find it until a few days ago. However, I don't
completely understand the idea of modules that return arrays and
strings. I am talking from the point of view of an application
programmer (and I do think that both you and Serge target application
programmers). For parsers, for instance, I would want them to return
the same data structure regardless of the format I put into them.
Also, the parsers should have the same API regardless of the data they
are parsing. Otherwise, it will become a pretty confusing bunch of
scripts. To me, it seems quite logical to use a class as the output of
the parsers, but the only thing that is important is that every parser
returns the same data structure and that they all have the same
behavior. The parse class that is now in the biophp cvs (under
genephp) does just that (and you can easily use that class, together
with the seq class, without the rest of the genephp classes).
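
To make the point about a uniform parser output concrete, here is a rough
sketch. It is written in Python only to keep it short, and every name in it
(SeqRecord, parse_fasta, parse_tab, and the tab-separated toy format) is made
up for illustration; it is not the actual genephp parse or seq class.

```python
from dataclasses import dataclass


@dataclass
class SeqRecord:
    """Hypothetical common record that every parser returns."""
    id: str
    description: str
    sequence: str


def parse_fasta(text: str) -> SeqRecord:
    """Parse a single FASTA entry given as a string."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = lines[0]
    if not header.startswith(">"):
        raise ValueError("input does not look like FASTA")
    ident, _, desc = header[1:].partition(" ")
    return SeqRecord(ident, desc, "".join(lines[1:]).upper())


def parse_tab(text: str) -> SeqRecord:
    """Parse a toy 'id<TAB>description<TAB>sequence' line (invented format)."""
    ident, desc, seq = text.strip().split("\t")
    return SeqRecord(ident, desc, seq.upper())


# Two different input formats, one and the same output structure.
from_fasta = parse_fasta(">seq1 test gene\nacgt\nttga\n")
from_tab = parse_tab("seq1\ttest gene\tacgtttga")
```

The application code only ever sees SeqRecord, so it does not need to know
which format the data arrived in.
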
> So, if we throw everything together as it is, pretty much everything
> I've done disappears anyway, since it doesn't tightly couple into the
> GenePHP layout.
>
I don't think so; your esearch modules should be immediately useful
(now that cvs works, I'll have a look at them soon). We are all here
talking about setting something up that will be as useful as possible to
ourselves and everyone else. We will all sometimes write code that
becomes superfluous, but the fun thing here is that we can learn from
each other and help each other and make something that is much better
than what any of us would produce alone. Sometimes it can hurt a little,
but if we all try to work together it will be much more rewarding in the
long run.
> The solution/"middle ground" would mean GenePHP being somewhat
> redesigned to allow for more "independent" modules to be a part of it
> (e.g. to pass data more often between modules/classes as standard
> arrays rather than GenePHP objects, so as to make it easier to "drop
> in" new components) but I'm reluctant to ask that. Not because I don't
> think it'd be a good idea (obviously I do) but because I hate the idea
> of making demands that other independent volunteer programmers rewrite
> their code to accommodate me. Maybe I'm just insecure that way :-)
I think that independent programmers will be very thankful for simple,
well-designed data structures that they can use without fuss in their
programs. I completely agree with you that we should work towards
small self-contained modules that can also be used as much as possible
without needing everything and the kitchen sink. However, if you don't
structure the data that are coming in, then why would an application
programmer even bother to use your code?
As an example, you can now throw four different sequence file formats
at the parse class (as a file or as a string), and it will return the
same kind of data structure for each of them. The only thing I have to
do in my phplabware project is to map that class onto the structure I
have in my SQL database.
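
Roughly, the database side then looks like the sketch below. Again this is
Python for brevity, and the table layout, the store() function, and the record
fields are assumptions made up for this example; it is not the phplabware
schema or code.

```python
import sqlite3


def store(conn, record):
    """Write one parsed record to the (hypothetical) sequences table."""
    conn.execute(
        "INSERT INTO sequences (id, description, sequence) "
        "VALUES (:id, :description, :sequence)",
        record,
    )


# In-memory database with an invented table layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sequences (id TEXT PRIMARY KEY, description TEXT, sequence TEXT)"
)

# Stand-ins for what a parser would hand back, whatever the input format was.
records = [
    {"id": "seq1", "description": "first gene", "sequence": "ACGTTTGA"},
    {"id": "seq2", "description": "second gene", "sequence": "GGCC"},
]
for record in records:
    store(conn, record)

rows = conn.execute("SELECT id, sequence FROM sequences ORDER BY id").fetchall()
```

Because every record has the same shape, one store() function is enough; there
is no per-format code on the database side.
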
> I DO think a less-tight coupling would make it easier to contribute
> modules to the framework, but that's me.
It will be easiest if we all decide on the underlying data structures we
are going to use (and Serge is doing a great job there, even though I
don't agree with everything he suggests). Once the data structures are
in place (and they are good), programming will be easy (actually, the
better the data structures, the easier the programming).
B.t.w., I just read the term 'standard arrays' in your previous
paragraph. Aren't 'standard arrays' and Genephp objects the same? I
mean, the classes that Serge proposes are open to discussion (I hope),
and classes are nothing more than arrays to which you can add
functions. So, aren't we talking about the same thing?
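
What I mean by that, as a small sketch (Python used as a stand-in, with a
made-up Seq class that is not one of Serge's proposed classes): the object
carries exactly the same fields as the array, plus functions, and you can go
from one to the other without losing anything.

```python
from dataclasses import dataclass, asdict


@dataclass
class Seq:
    """Hypothetical sequence class: the same fields as the array, plus methods."""
    id: str
    sequence: str

    def length(self) -> int:
        return len(self.sequence)


# The "standard array" form of a sequence.
as_array = {"id": "seq1", "sequence": "ACGT"}

# The same data as an object, and back again as an array.
as_object = Seq(**as_array)
round_trip = asdict(as_object)
```

So a module that prefers plain arrays and one that prefers objects can still
exchange data, as long as we agree on the field names.
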
Again, I hope we can do this all together.
Best,
Nico