[Biophp-dev] My brain hurts.

S Clark biophp-dev@bioinformatics.org
Sun, 4 May 2003 01:42:18 -0600


On Saturday 03 May 2003 08:33 pm, Nico Stuurman wrote:
> What about coding lazy and simply have the parser re-open the stream?
> Unless it costs much to open stream, there will not be much in terms of
> performance penalty and it will make the code look prettier.

That'll work fine for MOST things; the only place where that becomes
an issue is with parsers for results from online database queries.
I get the impression some of them would get annoyed at having the extra
workload on their server from doing the same lookup twice in a row.  In
some cases it might not work at all (e.g. NCBI Blast, perhaps?  Queries
that are given "special" session-based result URLs?).

I don't think that'll come up very often though.  On the other hand, I THINK
it won't take much to allow the filetype parser to accept "some text and
a 'continued' source" as one allowed set of messages to it, which will take
care of the situation if I can make it work correctly (i.e. the auto-detection
reads what it needs, then passes what it read AND the filepointer that
it's read from to the parser object, which parses the text and then continues
on with the stream).
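
Here is a rough sketch of that hand-off, written in Python just to keep it
short.  It is not the BioPHP code: detect_format(), parse_lines() and the
two format checks are made-up names and rules for illustration only.  The
detector reads only as many lines as it needs, then gives both those lines
and the still-open stream to the parser, so nothing has to be re-opened.

```python
import io
from itertools import chain

def detect_format(stream, max_lines=5):
    """Read up to max_lines lines, guess the format, return (format, lines_read)."""
    consumed = []
    fmt = "unknown"
    for _ in range(max_lines):
        line = stream.readline()
        if not line:                      # end of stream
            break
        consumed.append(line)
        if line.startswith(">"):          # assumed rule: FASTA header
            fmt = "fasta"
            break
        if line.startswith("LOCUS"):      # assumed rule: GenBank header
            fmt = "genbank"
            break
    return fmt, consumed

def parse_lines(consumed, stream):
    """Yield non-empty lines: first those already read, then the rest of the stream."""
    for line in chain(consumed, stream):
        line = line.rstrip("\n")
        if line:
            yield line

# The stream is opened once; detection and parsing share it.
source = io.StringIO(">seq1 demo\nACGT\nTTGA\n")
fmt, consumed = detect_format(source)
lines = list(parse_lines(consumed, source))
```

The parser never needs to know whether a line came from the detector's
buffer or from the stream itself, which is the point of passing both.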

> Something like a method readRecordinArray() which simply fills an array
> with lines until it finds the end of record mark (or whatever clues
> there are that it is at the end of a record).  The array is then passed to
> the 'real' parser that only works with arrays of lines.

Hmmm...yeah, I think that can be implemented without too much hassle.  Should
take care of the excess memory usage issue.
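
A minimal sketch of how that could look, again in Python for brevity and
not taken from the actual code.  read_record(), parse_record() and the
"//" end-of-record mark are assumptions for the example; a real format
would need its own end-of-record test.  Only one record's lines are held
in memory at a time.

```python
import io

def read_record(stream, end_mark="//"):
    """Collect lines up to and including end_mark; return None at end of stream."""
    lines = []
    for line in stream:
        line = line.rstrip("\n")
        lines.append(line)
        if line == end_mark:
            return lines
    # Stream ended: return a trailing unterminated record, or None if empty.
    return lines or None

def parse_record(lines):
    """Stand-in for the 'real' parser: it only ever sees one record's lines."""
    return {"id": lines[0].split()[1], "line_count": len(lines)}

source = io.StringIO("LOCUS A1\nACGT\n//\nLOCUS B2\nTTGA\nCCAA\n//\n")
records = []
while True:
    lines = read_record(source)
    if lines is None:
        break
    records.append(parse_record(lines))
```

Since the parser works on a plain array of lines, it does not care whether
they came from a file, a URL, or a string.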

> Right, who would ever want to go back.  It just seems good design to
> allow for it....

Darn, that's just what I was thinking, i.e. I can't think of why anyone
would...but that pesky "gut feeling" is telling me that SOMEONE will want it,
especially if we were to remove it :-)

Okay - I THINK I've correctly committed the changes and new files - if it
looks like I've screwed up the import let me know.  Hopefully I'll be able
to allocate a chunk of time tomorrow to get some of these coding issues worked
out.