Hi Serge (and others)

> Just a quick note... yes, I got your attachment. Will go
> through it this weekend. As for the seq and seqdb codes,
> yes, the parser will have to be somehow attached to seq,
> if we are to do a "quick or express parse" of any data
> file. As for the fate of seqdb, let's ask the opinion
> of others first before "axe-ing it".

I did not mean at all to get rid of it. Rather, some of the
functionality now in there could be moved to separate, smaller,
classes (like class parse and class sql) that can be used by seqdb
as well as other scripts. I do think there is a place for a flat
file database system like you have in seqdb, it just should not be
the only way to use the code.

> Where do you think we should put such code (that stores
> parsed data into a MySQL database)? I am not that comfy
> about making it an official part of GenePHP/BioPHP as it
> reflects a particular database structure/design. (See
> the sample MySQL database schema in the GenePHP site).

You could make a class sql and distribute it with a script that
generates the needed tables. It would also need to know the
database server, database name, username and password. These things
could live in a biophp configuration file.

>>> function parse_ANSI ()
>>
>> (Shouldn't that be "ASN.1"?)
>
> Yes, that should be ASN.1. Andres and I stand corrected. =)
> Sorry, a little bit of dyslexia here...

> Lately, I've been busy writing scripts that actually do
> something useful like translating proteins in all six
> reading frames, reverse translating a protein into its
> nucleic acid counterparts, etc. While it's admittedly
> time-consuming, I am LEARNING A LOT about what needs to
> be done with the existing code. I've posted those demo
> scripts at http://genephp.sourceforge.net/applist.html.
>

Sounds cool. Can this be made part of class seq?

> Kurt: Still haven't touched your code. Been busy lately
> (see above paragraph). I've been to the Vector NTI
> (Infomax?)
> site but I couldn't find any formal definition
> or specification of their molecule document format, which
> according to you, is supposed to be a superset of GenBank.
>
> My only other concern here is, given Nicos' suggestion of
> having a function that "AUTO-DETECTS" a file, how would
> we then distinguish a Vector NTI file from a GenBank file
> (given they have a lot of similarities)?

Don't know, that depends on the exact file format. It is probably
best to add the file format as an (optional) parameter to the
constructor of class parse. That way we can postpone the
autodetection until we have a bunch of parsers written and
documentation for the various file formats. If you all don't mind,
I will put some work into coding the framework for class parse.

>>
>> I have FASTA and Clustal (.aln) parsers in the module code section
>> already, if those are helpful at all.

Sean, what do your FASTA and Clustal parsers parse to? To Serge's
seq objects? If not, I am not sure how we can use them. B.t.w., now
is probably a good time to look carefully at the seq object (I did
not do that), since lots of future work will depend on it.

>>
>>> The SQL stuff could be made self-contained in a similar fashion. I
>>> would strongly advise though to stop using the direct MySQL calls
>>> but instead immediately start using a database abstraction layer
>>> like adodb (my favorite, I can help out with this one) or PEAR
>>> (might finally be usable).
>>
>> I would personally vote for PEAR, mainly to minimize dependencies on
>> "non-default" components. Not that I would MANDATE it, even if I
>> thought I could get away with it...

I agree with that idea; I simply have much more experience with
adodb (and I simply distribute it with the phplabware project;
people downloading it probably are not even aware they are using
it). The choice is up to the person writing the sql interface.

Best,

Nico

Nico Stuurman
Vale Lab
HHMI / Dept.
of Cellular and Molecular Pharmacology
University of California, San Francisco
Genentech Hall, Room N316
600 16th street
For mail: San Francisco, CA 94143-2200
For deliveries: San Francisco, CA 94107

email: nicos@itsa.ucsf.edu
phone: (415) 514-3927
fax: (415) 476-5233