[Biophp-dev] Seeking comments on CVS, XML, and other TLA's

biophp-dev@bioinformatics.org
Thu, 10 Apr 2003 11:56:22 -0600

On Wednesday 09 April 2003 10:37 pm, Greg Tyrelle wrote:
> I agree that for simple XML documents regular expressions will
> generally suffice. My only argument for using SAX for the ESearch XML
> output is consistency.
> However using the SAX API as an entry point for either a regex based
> parser or an XML parser might be useful. The point here is that event
> based parsing is good and it's not only XML parsers that can trigger
> SAX events :)

Well, true, and I actually agree completely.  Plus, dealing with everything
via SAX will be "good for me".
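
To make the "not only XML parsers can trigger SAX events" point concrete,
here is a minimal sketch.  It is written in Python rather than PHP purely for
illustration, and it is not BioPHP code.  The sample XML is made up and only
assumed to resemble ESearch output, and IdCollector, regex_parse,
ids_via_regex and ids_via_sax are hypothetical names.  The same handler
receives events from either a regex-based "parser" or a real SAX parser:

```python
import re
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

# Made-up sample; only ASSUMED to resemble ESearch XML output.
SAMPLE = (
    "<eSearchResult><Count>2</Count>"
    "<IdList><Id>111</Id><Id>222</Id></IdList>"
    "</eSearchResult>"
)


class IdCollector(ContentHandler):
    """Collects the text of every <Id> element, whoever fires the events."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self._in_id = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "Id":
            self._in_id = True
            self._buf = []

    def characters(self, content):
        # A SAX parser may deliver text in several chunks, so buffer it.
        if self._in_id:
            self._buf.append(content)

    def endElement(self, name):
        if name == "Id":
            self.ids.append("".join(self._buf))
            self._in_id = False


def regex_parse(text, handler):
    """Hypothetical regex 'parser': fires SAX events for <Id> elements only."""
    handler.startDocument()
    for match in re.finditer(r"<Id>(\d+)</Id>", text):
        handler.startElement("Id", AttributesImpl({}))
        handler.characters(match.group(1))
        handler.endElement("Id")
    handler.endDocument()


def ids_via_regex(text):
    handler = IdCollector()
    regex_parse(text, handler)
    return handler.ids


def ids_via_sax(text):
    handler = IdCollector()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.ids
```

Both entry points hand back the same list of IDs, so the code consuming the
events does not need to know which kind of parser produced them.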

> I'm not sure I understand what you mean by "task oriented" in the
> above example ?

My continued chanting of "task oriented" just refers to my thought that
when deciding what to write and how to design it, one should ALWAYS 
(at least for this project) start with a "real-world" task, RATHER THAN
a "capability".

For example, one shouldn't start with "we should have an ESearch interface"
(a "capability"), but rather "I want to be able to search [for example]
PubMed for articles." (a "task" - for which an ESearch interface seems to
be the best approach, along with an ESummary and/or EFetch module(s)).
The thinking behind this is that as soon as a module can "do" ANYTHING,
it can do something immediately useful for a "real-world" task.  This also
guides which "parts" of a module to start with (i.e. in this example,
anything special for dealing with other-than-PubMed can be left out of the
0.0.01 release, and support for the other data types gets added as "tasks"
come up that use them).  In short, modules then get designed and released
in smaller sets of changes, but with features that are always useful for
"real" tasks being added...
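
As a rough illustration of what a "task first" 0.0.01 release might contain,
here is a sketch (again in Python for illustration only, not actual BioPHP
code).  The function name pubmed_search_url and its max_results argument are
hypothetical; the base URL and the db, term and retmax parameters are those
of NCBI's ESearch utility as currently documented.  It covers only the
"search PubMed for articles" task and deliberately ignores every other
database:

```python
from urllib.parse import urlencode

# NCBI E-utilities ESearch endpoint (current address, not the 2003 one).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term, max_results=20):
    """Build the ESearch request URL for one task: find PubMed articles.

    Other databases are intentionally unsupported until a real task
    needs them.
    """
    if not term.strip():
        raise ValueError("search term must not be empty")
    query = urlencode({"db": "pubmed", "term": term, "retmax": max_results})
    return f"{ESEARCH_URL}?{query}"
```

Fetching the URL and turning the returned IDs into summaries (the ESummary
and/or EFetch side) would be the next small, task-driven additions.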

Does that make any sense?...

> Personally I would find a "BioPHP proposal" document a better starting
> point to work from if you want to discuss design issues ? For example
> a comparison of how the various Bio* projects are structuring their
> modules/interfaces might be useful ?

I have thus far not concerned myself with looking TOO deeply at the other
BioXXXX projects' capabilities (other than to get a general "feel" for what
they can do/are designed to do), because I don't consider 'do the same thing
as BioPerl/Python/Lisp/Ruby/Java' to be a valid "task". :-)

Where *I* was hoping to start with a design document is with "what kinds of
things do we want to be able to DO with BioPHP", and to structure the layout
according to how the "tasks" involved can be categorized (to the maximum
REASONABLE extent).  In my case, at first thought, I would roughly break it
down into "data conversion" (import/export from different formats), "data
retrieval" (e.g. online queries), and "analyses" (both via frontends and
"native" processing).

On the other hand, as "learning better proper software design" is one of my
explicitly stated "personal" goals in this project, I figure other developers
will point out to me where my ideals are unfeasible or otherwise wrong-headed. 
Even if I can technically call this "my" project at the moment (at least until
there are "formally listed" other developers, which will be Real Soon Now, I
suspect), I'm really not a "rule with an iron fist" kind of guy :-) Therefore,
as people convince me by rational argument that I'm being an idiot, I will be
changing my opinions...

> In the mean time I'll try and find some time to look over your
> sequence list class.

Thanks.  I disclaim any responsibility, however, should reading it make your
brain coagulate (as I mentioned, I'm sure I've picked up a number of unusual
idiosyncrasies in my coding style...).