[Biophp-dev] Egad, this is getting long :-)

biophp-dev@bioinformatics.org
Tue, 29 Apr 2003 16:47:31 PST

-- snip about the usual amount of unnecessary information--

> I am advocating that the direct handling of the files (streams/strings) be
> abstracted out into dedicated classes, which concern themselves only with
> reading the data and separating it into sequence data.  Each would return a
> "standard" (by which I mean "agreed upon") ordinary array, containing at
> a minimum a "label" (short name) and the actual sequence as the first two
> elements.  (The returned data would NOT be different for every parser; it
> would simply not depend on externally-declared data structures.)
> The "Parser" class stays exactly as it is:
> It handles detection of the data type if it is not specified.
> It instantiates the appropriate parser object and tells it to go.
> It hands out the sequence objects created from the parsers' output.
> There would be only two minor changes:
> 1) The actual "moving back and forth in the file" (the move_Next() and
> move_Previous() methods), EOF handling, and so on end up down in the
> individual parsing classes.  In some file types this may require special
> handling (e.g. "streams" can't really be rewound; in those cases a call
> to the move_Previous() method would simply return false).
> 2) The CREATION of the sequence object would be abstracted out to a
> "seq_factory" class.  This isn't really necessary to the goal of just
> abstracting the data parsers, but I think it would be very handy (and
> useful outside of the Parser object, for example in the event that
> someone wants a set of seq objects for fragments from the resten class
> instead of only strings).  If the structure of the sequence object is
> ever changed, the abstraction provided by the "factory" object protects
> the objects further out from the "center" of the GenePHP structure from
> having to worry about it.  It also means that, for example, if we ever
> decide we want to add support for some OTHER sequence format (perhaps to
> let the BioJava guys make use of some of our PHP routines), all that
> would be needed is an additional "factory" object, and all of the classes
> that generate sequence data can immediately support it without any real
> changes.
> That's all.  The "end-usage" of the Parser object doesn't change at all (like

OK.  Are you going to set this up? Once there is an example and some
structure in place it will be possible for others to extend it.
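
To make the discussion a bit more concrete, here is a rough sketch of the
structure described in the quoted proposal.  It is written in Python purely
for brevity rather than PHP, and every name in it (FastaReader, SeqFactory,
Seq, move_next, move_previous, get_seq) is a hypothetical stand-in, not
existing GenePHP code.  Format detection is left out, and a FASTA-like
input is assumed only because it is the simplest case to show.

```python
class FastaReader:
    """Hypothetical dedicated reader: it only splits raw text into records.

    Each record is a plain list whose first two elements are the label and
    the sequence, as the proposal suggests. It also owns the cursor.
    """

    def __init__(self, text, rewindable=True):
        self._records = []
        for chunk in text.split(">")[1:]:
            lines = chunk.strip().splitlines()
            if not lines:
                continue
            header = lines[0].strip()
            label = header.split()[0] if header else ""
            sequence = "".join(line.strip() for line in lines[1:])
            self._records.append([label, sequence])
        self._pos = -1                 # cursor sits before the first record
        self._rewindable = rewindable  # False models a stream-like source

    def move_next(self):
        """Advance the cursor; return False at end of data."""
        if self._pos + 1 >= len(self._records):
            return False
        self._pos += 1
        return True

    def move_previous(self):
        """Step back; return False if impossible (e.g. a stream)."""
        if not self._rewindable or self._pos <= 0:
            return False
        self._pos -= 1
        return True

    def current(self):
        return self._records[self._pos]


class Seq:
    """Stand-in for the sequence object."""

    def __init__(self, label, sequence):
        self.label = label
        self.sequence = sequence


class SeqFactory:
    """Hypothetical factory: the only place that knows how a Seq is built."""

    def create(self, record):
        return Seq(record[0], record[1])


class Parser:
    """Front end: picks a reader (detection omitted) and hands out Seq objects."""

    def __init__(self, text, factory=None, rewindable=True):
        self._reader = FastaReader(text, rewindable)
        self._factory = factory or SeqFactory()

    def move_next(self):
        return self._reader.move_next()

    def move_previous(self):
        return self._reader.move_previous()

    def get_seq(self):
        return self._factory.create(self._reader.current())
```

With something like this, supporting a different sequence object would
mean passing a different factory to Parser, and the reader classes would
not need to change.  Calling code would still just loop on move_next() and
ask for the current sequence.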