On Wednesday 09 April 2003 12:03 am, Dong Gregorio wrote:
[...]
> Uhuh, and what does that code supposed to do? Let's take a look
> at it.

Well, the part *I* was interested in was a SAX XML (or is it XML SAX?)
parser for ESearch - I wanted to see how it was implemented. (That was
in the set of code I mentioned in the previous message.)

I was right - it IS a pain. :-) At least, it LOOKS that way. Put
simply, a SAX parser appears to need an individual "function"
(/pattern) written for every single tag you want to get data out of,
just as regular expressions do. On top of that comes the overhead of
splitting things into "recognize the start tag", "recognize the end
tag", and "do stuff with the contents between the two tags" functions,
and of tracking the "depth"/parentage via some sort of ueber-variable.
(All the online example parser scripts seem to use global variables,
but I suspect "everybody" just uses top-level variables in parser
objects...)

For really simple XML documents (which I would count ESearch results
as), regular expressions just seem so much easier, though it seems
pretty clear that as documents get more complex, the overhead of a
formal XML parser becomes more worthwhile. Plus, for consistency, I
still think a real XML parser is called for wherever possible. It's
just that I'm still going through the "mental temper tantrum" of
convincing myself to do it...

> Well, that's a whole debate in itself. To rephrase it:
>
> Do we work on features that would differentiate BioPHP from BioXXX (and
> work backwards later on) or do we through the basics first (data file
> parsing, sequence analysis, etc.)?

The answer, of course, is "yes".
:-)

I hadn't really intended to EXPLICITLY differentiate BioPHP from other
BioXXX projects as such; it's just that I wasn't intending to simply
"re-implement" those other projects in PHP. At the same time, I am
hoping to have functionality for handling things that (as far as I
know) the other BioXXX projects haven't gotten around to, such as
working with phylogeny (or perhaps dealing with HPLC chromatograms of
cell metabolites and other such analyses?)

[actually - I just looked at the BioPerl docs and they do have some
support for parsing phylogenetic analyses from e.g. PAML]

I'm hoping to approach the "basics" as they become necessary to
accomplish specific tasks (which may or may not differentiate BioPHP
from BioXXX - some tasks will, some won't), rather than with a focus
of "BioPerl has a Bio::Tools::Phylo::PAML::Result object, so we need
to write one for BioPHP" (for example).

My personal (and admittedly inexperienced) opinion is that our
structure should be more "task oriented", so the EQUIVALENT to the
aforementioned module might be categorized more like
"Bio::Phylogeny::Frontends::PAML". (Comments to correct any gross
ignorance of good design that this opinion may reveal are welcome -
after all, one of my primary reasons for this project is educating
myself. Thankfully, I like to think I'm a pretty fast learner.)

Incidentally, if anyone's bored and wants to critique my current
presumably-grossly-idiosyncratic style, the sequence list object
should be a fairly representative example. I'm particularly interested
in how easy other people find it to understand, and how and where I
tend to depart from what is generally considered good design
practice...

> Taking the first approach would mean developing interfaces to other
> BioXXX (because we can't even do the simplest tasks ourselves using
> BioPHP). Taking the second approach means it would take some time
> before we earn "bragging rights" vis-a-vis BioXXX.
> Bragging rights being,
> "My dog can sing while standing on its head, yours cant!"

*Laugh out loud*

With the possible exception of optional interfaces to BioJava for
processor-intensive analyses, I don't foresee any need to have
interfaces to other BioXXX projects at the moment. Well, unless, of
course, it would be easier to just use BioPerl's
Bio::Dog::Activity::Singing::SongParsers::XML module instead of
writing our own. (I don't think PHP has been ported to the
"Chow-Mixed-Breed" platform that my dog runs on, anyway...)
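
To make the SAX-versus-regular-expression comparison above a bit more
concrete, here is a minimal sketch. It is in Python rather than PHP
purely for brevity, and it is not the ESearch parser discussed above.
The sample document is hand-written to look like a small ESearch
result, and the names IdHandler, ids_via_sax and ids_via_regex are
made up for illustration.

```python
import re
import xml.sax

# Hand-written sample shaped like a small ESearch result (illustrative only).
SAMPLE = """<?xml version="1.0"?>
<eSearchResult>
  <Count>2</Count>
  <IdList>
    <Id>12345</Id>
    <Id>67890</Id>
  </IdList>
</eSearchResult>"""


class IdHandler(xml.sax.ContentHandler):
    """Hypothetical handler: collects the text of every <Id> inside <IdList>."""

    def __init__(self):
        super().__init__()
        self.path = []    # stack of open tags; this is the "depth" tracker
        self.buffer = []  # text chunks of the <Id> currently being read
        self.ids = []     # finished ID strings

    def _in_id(self):
        # True only when the innermost open tags are <IdList><Id>.
        return self.path[-2:] == ["IdList", "Id"]

    def startElement(self, name, attrs):
        # "Recognize the start tag."
        self.path.append(name)
        if self._in_id():
            self.buffer = []

    def characters(self, content):
        # "Do stuff with the contents": text may arrive in several chunks.
        if self._in_id():
            self.buffer.append(content)

    def endElement(self, name):
        # "Recognize the end tag."
        if self._in_id():
            self.ids.append("".join(self.buffer).strip())
        self.path.pop()


def ids_via_sax(xml_text):
    handler = IdHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.ids


def ids_via_regex(xml_text):
    # The regular-expression route: one pattern per tag of interest.
    return re.findall(r"<Id>\s*(\d+)\s*</Id>", xml_text)


if __name__ == "__main__":
    print(ids_via_sax(SAMPLE))
    print(ids_via_regex(SAMPLE))
```

Both functions pull the same IDs out of the sample. The regex version
is one line, but it knows nothing about nesting; the handler keeps its
state in instance variables (the "top-level variables in parser
objects" guess above) instead of globals, and its tag stack is what
lets it tell an <Id> inside <IdList> from one anywhere else.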