[Biophp-dev] Seeking comments on CVS, XML, and other TLA's

Greg Tyrelle biophp-dev@bioinformatics.org
Thu, 10 Apr 2003 14:37:58 +1000

*** S Clark wrote: 
  |On Wednesday 09 April 2003 12:03 am, Dong Gregorio wrote:
  |> Uhuh, and what is that code supposed to do?  Let's take a look
  |> at it.

It's supposed to interface with NCBI's EUtils. The code is just a simple
client interface with a couple of SAX parsers for the ESearch and
PubmedArticleSet XML output that EUtils produces.
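
For illustration, here is a minimal sketch of what a SAX handler for
ESearch output might look like. It is written in Python rather than PHP,
and it is not the code discussed in this thread. The sample document,
the ESearchHandler class, and the parse_esearch function are all
hypothetical; only the element names (eSearchResult, Count, IdList, Id)
follow the ESearch output format.

```python
import xml.sax

# Hypothetical sample shaped like an ESearch response; the values are made up.
SAMPLE = b"""<?xml version="1.0"?>
<eSearchResult>
  <Count>2</Count>
  <RetMax>2</RetMax>
  <RetStart>0</RetStart>
  <IdList>
    <Id>11111111</Id>
    <Id>22222222</Id>
  </IdList>
</eSearchResult>"""


class ESearchHandler(xml.sax.ContentHandler):
    """Collects the total hit count and the returned IDs from SAX events."""

    def __init__(self):
        super().__init__()
        self.count = None
        self.ids = []
        self._text = []

    def startElement(self, name, attrs):
        # Start a fresh text buffer for every element that opens.
        self._text = []

    def characters(self, content):
        # Text may arrive in several chunks, so buffer it.
        self._text.append(content)

    def endElement(self, name):
        value = "".join(self._text).strip()
        if name == "Count" and self.count is None:
            # Keep only the first Count seen (the overall hit count).
            self.count = int(value)
        elif name == "Id":
            self.ids.append(value)
        self._text = []


def parse_esearch(xml_bytes):
    """Parse ESearch-style XML and return (count, list_of_ids)."""
    handler = ESearchHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.count, handler.ids


if __name__ == "__main__":
    print(parse_esearch(SAMPLE))
```

The handler only reacts to the two elements it cares about, so any
other elements in the response pass through without being touched.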

My plan was to rewrite an interface I wrote some time ago for my
website (www.nodalpoint.org).

  |For really simple XML documents (which I would count ESearch results
  |as) regular expressions just seem so much easier, though it seems
  |pretty clear that as documents get more complex the overhead of a 
  |formal XML parser becomes more worthwhile.  Plus, for consistency, 
  |I still think a real XML parser is called for wherever possible.

I agree that for simple XML documents regular expressions will
generally suffice. My only argument for using SAX for the ESearch XML
output is consistency.

However, using the SAX API as an entry point for either a regex-based
parser or an XML parser might be useful. The point here is that
event-based parsing is good, and it's not only XML parsers that can
trigger SAX events :)
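
A minimal sketch of that idea, assuming Python rather than PHP for
illustration: a regular expression scans the text and reports each match
through the standard SAX ContentHandler callbacks, so the same handler
can sit behind either a regex scanner or a real XML parser. The names
FLAT_ELEMENT, regex_parse, and IdCollector are hypothetical, and the
regex only copes with flat elements that have no attributes or nesting.

```python
import re
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

# Matches a flat element such as <Id>123</Id>: no attributes, no children.
FLAT_ELEMENT = re.compile(r"<(\w+)>([^<]*)</\1>")


def regex_parse(text, handler):
    """Scan text with a regex and report each match as SAX events."""
    handler.startDocument()
    for match in FLAT_ELEMENT.finditer(text):
        name, value = match.group(1), match.group(2)
        handler.startElement(name, AttributesImpl({}))
        handler.characters(value)
        handler.endElement(name)
    handler.endDocument()


class IdCollector(ContentHandler):
    """Collects the text of every <Id> element, whatever fires the events."""

    def __init__(self):
        super().__init__()
        self.ids = []
        self._buffer = None

    def startElement(self, name, attrs):
        # Only buffer text while inside an <Id> element.
        self._buffer = [] if name == "Id" else None

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "Id" and self._buffer is not None:
            self.ids.append("".join(self._buffer))
        self._buffer = None
```

Because IdCollector only depends on the callback interface, it could be
passed unchanged to xml.sax.parseString if the documents later become
too complex for the regex.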

  |It's just that I'm still going through the "mental temper tantrum" 
  |of convincing myself to do it...

Understood :)

  |> Well, that's a whole debate in itself.  To rephrase it:
  |> Do we work on features that would differentiate BioPHP from BioXXX (and
  |> work backwards later on) or do we go through the basics first (data file
  |> parsing, sequence analysis, etc.)?
  |My personal (and admittedly inexperienced) opinion is that our
  |structure be more "task oriented", so the EQUIVALENT to the aforementioned
  |module might be categorized more like "Bio::Phylogeny::Frontends::PAML" 
  |(comments to correct any gross ignorance of good design that this opinion
  |may reveal are welcome - after all, one of my primary reasons for this
  |project is educating myself.  Thankfully, I like to think I'm a pretty
  |fast learner. )  Incidentally, if anyone's bored and wants to critique
  |my current presumably-grossly-idiosyncratic style, the sequence list object
  |should be a fairly representative example.  I'm particularly interested in
  |how easy other people find it to understand, and how and where I tend to
  |depart from what is generally considered good design practice...

I'm not sure I understand what you mean by "task oriented" in the
above example?

Personally, I would find a "BioPHP proposal" document a better starting
point to work from if you want to discuss design issues. For example,
a comparison of how the various Bio* projects structure their
modules/interfaces might be useful.

In the meantime I'll try to find some time to look over your
sequence list class.


Greg Tyrelle  (http://www.kinglab.unsw.edu.au/~greg)

"Logic only gives man what he needs, 
 magic gives man what he wants" - Tom Robbins