> > OK. Are you going to set this up? Once there is an example and some
> > structure in place it will be possible for others to extend it.
>
> I will if you'd like, but only if you don't mind - it's bad enough
> that I've inadvertently dumped all over all the work you've just
> done without me editing it without asking first...

No problem. I simply want it to work, to be as good as possible, and to
be available ASAP.

>
> Presuming you don't mind, code-wise, here is what I'll do:
>
> 1. "wrap" the existing parser functions into classes
>    complete with file reading code
>

OK. So they will all have their own move_Next(), move_Previous(), eof(),
bof(), fetch() and probably also move_First(), move_Last() and move_To()
functions? I guess the parser class constructors should take either a
filename or a string as an argument. Preferably, there should be a way
to maintain and use an index in a file (as in Serge's seqdb class).

Hmmm, this is all straightforward to do in memory (as it is now), but
probably more difficult with a stream. (What streams other than files
are there in PHP? PHP treats URLs almost exactly like files, so...) How
important is it to deal with data structures larger than the available
memory?

B.t.w., since we now have this large list of methods that every class
should have, Jo Dough will still have to do quite some work to get his
class to work in Biophp. If possible, why not have them in the calling
class so that they don't have to be repeated over and over again?

> 2. add instantiation of the (filetype)_parser class
>

Will this mean that I instantiate a class Parser, and class Parser finds
the appropriate (filetype)_parser class for me? If so, it would be cool
to keep the current include scheme, where only the required parser is
actually included in the running script.

> 3. edit the fetch() function to reflect that it's calling a
>    method from the filetype parser instead of one of its own
>    methods

So fetch() will simply call (filetype)_parser->fetch()?
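
To make the question about points 1 to 3 concrete, here is a minimal
sketch of one possible arrangement, with the navigation methods kept in
the calling class and the format parser reduced to a single method. The
names fasta_parser and parse_all(), the record layout, and the include
path in the comment are all hypothetical, and the sketch only covers the
in-memory case.

```php
<?php
// Hypothetical format parser: it only knows how to split raw text into records.
class fasta_parser
{
    // Returns a list of records, each ['id' => ..., 'sequence' => ...].
    public function parse_all(string $text): array
    {
        $records = [];
        // Split on '>' at the start of a line; drop the empty leading piece.
        foreach (preg_split('/^>/m', $text, -1, PREG_SPLIT_NO_EMPTY) as $chunk) {
            $lines = preg_split('/\R/', trim($chunk));
            $id = trim(array_shift($lines));
            $records[] = [
                'id' => $id,
                'sequence' => implode('', array_map('trim', $lines)),
            ];
        }
        return $records;
    }
}

// Calling class: owns the cursor so individual parsers need not repeat it.
class Parser
{
    private $records;
    private $pos = 0;

    public function __construct(string $filetype, string $text)
    {
        $class = $filetype . '_parser';
        // In a multi-file layout, only the required parser would be included
        // here, e.g. include_once "parsers/{$class}.inc.php"; (assumed path).
        if (!class_exists($class)) {
            throw new InvalidArgumentException("No parser for filetype '$filetype'");
        }
        $this->records = (new $class())->parse_all($text);
    }

    public function eof(): bool
    {
        return $this->pos >= count($this->records);
    }

    public function move_First(): void
    {
        $this->pos = 0;
    }

    public function move_Next(): void
    {
        if (!$this->eof()) {
            $this->pos++;
        }
    }

    public function move_Previous(): void
    {
        if ($this->pos > 0) {
            $this->pos--;
        }
    }

    // Returns the current record, or null once the cursor is past the end.
    public function fetch(): ?array
    {
        return $this->eof() ? null : $this->records[$this->pos];
    }
}
```

With this split, a new format only has to supply one class with one
method; the cost is that the whole input is parsed up front, which does
not address files larger than memory.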
>
> 4. create a "seq_factory.inc.php" class to churn out the sequence
>    objects
>

So fetch() will get a data structure from the parser, feed this to the
seq_factory object, and get a Seq object back, which it returns to the
calling party?

> 5. move the creation of the seq object up to the "Parse" class, via
>    the "seq_factory".

Sounds OK to me. I would still think about keeping the user functions
(move_Next(), etc.) in the Parse class (provided it is possible; I don't
think interleaved data are a big problem with this scheme). That will
keep the individual parsers simpler. The seq_factory is fine with me (it
is a good idea to keep the translation from what is in the file to our
abstraction of the real world in one place).

The only issue is how to deal with data both in memory and in streams.
Small files and strings are currently kept in an array of strings, with
an index of the line numbers where sequence entries start; big files
would be read through a file pointer. Is there an easy way to deal with
both?

Nico

B.t.w., what a relief to write and read about stuff that matters!
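
One possible answer to the memory-versus-stream question, sketched under
assumptions: if the PHP version in use provides the php://memory stream
wrapper, a string can be written into an in-memory stream, and then the
same file-pointer code serves both cases, with the index holding byte
offsets instead of line numbers. The function names open_source(),
build_index() and read_entry() are hypothetical, and a FASTA-like layout
('>' starts an entry) is assumed purely for illustration.

```php
<?php
// Open either a file or a string as a stream, so callers see one interface.
function open_source(string $source, bool $is_filename)
{
    if ($is_filename) {
        return fopen($source, 'rb');
    }
    $fp = fopen('php://memory', 'w+b');
    fwrite($fp, $source);
    rewind($fp);
    return $fp;
}

// Record the byte offset of every line that starts an entry ('>').
function build_index($fp): array
{
    $offsets = [];
    rewind($fp);
    while (true) {
        $pos = ftell($fp);
        $line = fgets($fp);
        if ($line === false) {
            break;
        }
        if ($line[0] === '>') {
            $offsets[] = $pos;
        }
    }
    return $offsets;
}

// Seek to an indexed offset and read a single entry from there.
function read_entry($fp, int $offset): array
{
    fseek($fp, $offset);
    $id = trim(substr(fgets($fp), 1));
    $seq = '';
    // Stop at end of stream or at the header line of the next entry.
    while (($line = fgets($fp)) !== false && $line[0] !== '>') {
        $seq .= trim($line);
    }
    return ['id' => $id, 'sequence' => $seq];
}
```

Only the index is held in memory, so move_To() becomes a seek plus one
read_entry() call, whether the data came from a string or from a large
file.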