[Biophp-dev] Pdraw filetype parser updated

S Clark biophp-dev@bioinformatics.org
Wed, 7 May 2003 23:52:04 -0600


On Wednesday 07 May 2003 10:31 pm, nicos@itsa.ucsf.edu wrote:
> I added the pdraw parser (there is a test file specific for this parser).
> Although I started with Sean's fasta parser, it changed quite a bit.  I
> guess I am simply more comfortable with this approach, and I think it
> will be easy to adapt the Genbank and Swissprot parsers using this
> approach.

What?  You deviated from my Sacred Style?  Blasphemy! :-)

Looks just fine to me, really.  Cool - I think we're onto an
overall winner for the general parsing system structure.

> Some questions/suggestions:
> - Can we rename the parser include file to delete the '_class' again?
> Those names get so long....

I've got no problem with that - appending "_class" is just a convention
I picked up on my own to distinguish files containing classes from those
containing e.g. random functions or regular scripts, but really I think
the existing .inc.php serves the same general purpose.  I say go for it.

One minor change we might think about making soon is renaming the classes
slightly - the PEAR standards say that class names should always start with a
capital letter.  Not that we necessarily have to stick to that, but it might
be worth doing while there are relatively few classes to deal with.

> - Can we pass the data to the parsers by reference?  That can save quite
> a bit of copying (need to remove that unset command in the parse
> constructor)

Hmmm, don't see why not.  That'd reduce peak memory usage a bit further, 
and slightly improve readability while it's at it.  Works for me.
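
Roughly what I have in mind, as a sketch only: the class name
SketchFastaParser and its fetch() method are made up for illustration and
are not the actual parser code.  It assumes the data arrives as an array of
lines.  Note that PHP copies arrays lazily (copy-on-write), so the saving
from a reference mostly shows up when the parser would otherwise modify or
unset the data.

```php
<?php
// Hypothetical parser for illustration; not the real BioPHP class.
class SketchFastaParser
{
    private $lines;    // bound by reference to the caller's array
    private $pos = 0;  // index of the next line to read

    public function __construct(array &$lines)
    {
        // Bind by reference instead of copying. Nothing is unset here,
        // so the caller's array stays intact.
        $this->lines = &$lines;
    }

    // Returns the next record as array('id' => ..., 'sequence' => ...),
    // or false when there are no more records.
    public function fetch()
    {
        $count = count($this->lines);

        // Skip anything before the next '>' header line.
        while ($this->pos < $count && substr($this->lines[$this->pos], 0, 1) !== '>') {
            $this->pos++;
        }
        if ($this->pos >= $count) {
            return false;
        }

        $id = trim(substr($this->lines[$this->pos++], 1));
        $sequence = '';

        // Collect sequence lines up to the next header or the end of data.
        while ($this->pos < $count && substr($this->lines[$this->pos], 0, 1) !== '>') {
            $sequence .= trim($this->lines[$this->pos++]);
        }

        return array('id' => $id, 'sequence' => $sequence);
    }
}

// Example data, invented for this sketch.
$data = array('>seq1 test', 'ACGT', 'TTGA', '>seq2', 'GGCC');
$parser = new SketchFastaParser($data);
$first  = $parser->fetch();
$second = $parser->fetch();
$third  = $parser->fetch();
```

The caller keeps ownership of $data, which is why the unset has to go: with
a reference, unsetting or emptying the array inside the parser would also
wipe the caller's copy.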

> - Do we need to think about closing files?

Also a good idea, I think.

I'll go back and implement get-by-reference and "close the file handle if
there is one when you hit the last record" in the Fasta and Clustal parsers
and commit them tomorrow morning.
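
For the file handle part, something along these lines, again only a sketch:
SketchLineReader, fetch() and isOpen() are hypothetical names, and a
"record" here is simply a non-empty line so the example stays short.  The
handle is closed as soon as a read runs off the end of the file, so the
caller never has to remember to do it.

```php
<?php
// Hypothetical reader for illustration; not the real Fasta or Clustal parser.
class SketchLineReader
{
    private $handle;

    public function __construct($path)
    {
        // fopen() returns false on failure; fetch() then just returns false.
        $this->handle = fopen($path, 'r');
    }

    // Returns the next non-empty line, or false once the input is exhausted.
    public function fetch()
    {
        if (!is_resource($this->handle)) {
            return false; // never opened, or already closed
        }

        while (($line = fgets($this->handle)) !== false) {
            $line = trim($line);
            if ($line !== '') {
                return $line;
            }
        }

        // Ran past the last record: close the handle if there is one.
        fclose($this->handle);
        $this->handle = null;
        return false;
    }

    public function isOpen()
    {
        return is_resource($this->handle);
    }
}

// Write a small temporary file so the sketch can be run as-is.
$path = tempnam(sys_get_temp_dir(), 'rec');
file_put_contents($path, "first\n\nsecond\n");

$reader  = new SketchLineReader($path);
$records = array();
while (($record = $reader->fetch()) !== false) {
    $records[] = $record;
}
$stillOpen = $reader->isOpen();

unlink($path);
```

One thing to watch: a caller that stops reading early never triggers the
close, so a separate explicit close method may still be worth having.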