On Sunday 06 April 2003 11:35 pm, Dong Gregorio wrote:
> Welcome, Greg! Nice to see another "Greg" on board. =)
>
> As for the build, I'm PHP (4.2 I think) on a Windows machine,
> and Apache 1.3. Haven't found the time yet to get the latest.
> How about you, Sean? What are you using?

A CVS snapshot from about a week ago (4.3.2 pre-release) built as CLI. I need to update to a more recent snapshot, as they've apparently fixed the problem installing the Java support when you're only building the CLI version...

On my server, it's 4.3.something with Apache 1.3.27 (as I recall).

> >Is it worth all the extra hassle ? In the mood for opening a "can of
> >worms" I see ;]
>
> Hehe, well, Sean, how about it? =)

As I once saw in a .sig on Slashdot: "The can's already open, the worms are EVERYWHERE..." :-)

I just keep having trouble really CONVINCING myself that the apparently complex framework of tag-handling functions you seem to have to build to parse XML with an "official" parser is really any better than easier-to-follow regular expressions (DESPITE the fact that on a purely "rational" level I can certainly see that, at a certain point, a set of regular expressions for parsing becomes more complex than the XML parser, but still...)

I intend to stick as much as possible with "default" capabilities in PHP (i.e. the SAX parser, which is included in PHP builds by default, rather than XML-DOM, which has to be specifically asked for at compile-time), which in this case is probably just as well - with DOM you have to load the entire XML structure into memory before you can start doing anything with it, whereas the SAX parser deals with it bit by bit as it comes in. Considering how large some of the datafiles we may end up dealing with can be...

I'm thinking I should set up a "utilities" directory for classes that aren't specifically bioinformatics-related, for dealing with basic things like POST'ing queries to web interfaces (and the "core" XML parser class).

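To make the "tag-handling functions" and "bit by bit" points above a little more concrete, here is a rough sketch of the handler-based (SAX-style) approach. It is written in Python using its standard expat binding purely for illustration, not PHP, though the callback model is much the same idea as PHP's default XML parser. The <seqset>/<seq id="..."> document layout and the parse_sequences name are made up for this example and are not any real bioinformatics format or BioPHP API.

```python
import xml.parsers.expat


def parse_sequences(chunks):
    """Collect {id: sequence} from hypothetical <seq id="..."> elements.

    `chunks` is any iterable of strings; the document is fed to the parser
    piece by piece, so the whole file never has to sit in memory at once.
    """
    records = {}
    state = {"id": None, "text": []}  # what we are currently inside of

    def start_element(name, attrs):
        # Called once for every opening tag.
        if name == "seq":
            state["id"] = attrs.get("id")
            state["text"] = []

    def char_data(data):
        # May be called several times for one run of text, so accumulate.
        if state["id"] is not None:
            state["text"].append(data)

    def end_element(name):
        # Called once for every closing tag; finish the current record.
        if name == "seq" and state["id"] is not None:
            # Join the pieces and drop any whitespace/line breaks.
            records[state["id"]] = "".join("".join(state["text"]).split())
            state["id"] = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.EndElementHandler = end_element
    parser.CharacterDataHandler = char_data

    for chunk in chunks:
        parser.Parse(chunk, False)  # more data still to come
    parser.Parse("", True)          # signal end of document
    return records


def in_chunks(text, size):
    """Yield `text` in pieces of at most `size` characters."""
    for i in range(0, len(text), size):
        yield text[i:i + size]


# Made-up sample data, split into 16-character chunks to mimic streaming.
SAMPLE = '<seqset><seq id="s1">ACGT</seq><seq id="s2">GGCC\nTTAA</seq></seqset>'
records = parse_sequences(in_chunks(SAMPLE, 16))
print(records)  # {'s1': 'ACGT', 's2': 'GGCCTTAA'}
```

The three handlers are the per-document-type code mentioned above: a different format would need different handlers, but the feeding loop stays the same, and the chunks can split tags anywhere without the handlers having to care.
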
> >- BioPHP consistency -> many "bio" formats are moving to xml

This much is true, and my gut feeling is that, at least in my case, once I FINALLY get to the point where I have a "feel" for how to actually use XML parsers, it won't be too bad. I have to confess my weekend has been spent "slacking off" (well, if you can count manual labor hauling boxes out of storage and such as "slacking off"), so I haven't yet gotten to looking at the example code that Greg was kind enough to forward - I'll try to get to that tomorrow. With "parsing philosophy" being the real holdup at this point for me, I really need to get on that. Once done, the rest ought to be comparatively easy...

> >- Differentiate BioPHP as fundamentally supporting XML

That thought HAD crossed my mind, but since it seems you have to code up tag/structure handling functions for each document type ANYWAY, and since ideally the "guts" of the BioPHP classes will be a "black box" to people using it, I'm not sure that's inherently worth more than mere "bragging rights" (not that I don't care about "bragging rights", but...)

(Of course, being open source, these "black boxes" have metaphorical easy-open latches on them so people can look at the guts if they really want to :-) )

> >- Why bother with flatfiles ? BioPerl/Python/Java probably do
> > these already

BioPerl/Python/Java/Ruby/Lisp also already deal with DNA sequences... should we ignore them as well, then? :-)

Besides, I'd consider formats such as Phylip, Clustal, ASN.1, etc. to be "flat files", and parsers for them will be handy.