Nico,

Just a quick note: yes, I got your attachment. I will go through it this weekend.

As for the seq and seqdb code: yes, the parser will have to be attached to seq somehow if we are to do a "quick" or "express" parse of any data file. As for the fate of seqdb, let's ask the others for their opinion before "axing" it. Admittedly, it's a lot easier to import the parsed data into a relational database like MySQL or PostgreSQL than to write our own flat-file database management system. Where do you think we should put such code (code that stores parsed data in a MySQL database)? I am not that comfy about making it an official part of GenePHP/BioPHP, as it reflects a particular database structure/design. (See the sample MySQL database schema on the GenePHP site.)

>> function parse_ANSI ()
>
>(Shouldn't that be "ASN.1"?)

Yes, that should be ASN.1. Andres and I stand corrected. =)

Lately, I've been busy writing scripts that actually do something useful, like translating a nucleotide sequence into protein in all six reading frames, reverse-translating a protein into its nucleic acid counterparts, etc. While it's admittedly time-consuming, I am LEARNING A LOT about what needs to be done with the existing code. I've posted those demo scripts at http://genephp.sourceforge.net/applist.html.

Kurt: I still haven't touched your code. I've been busy lately (see the paragraph above). I've been to the Vector NTI (InforMax?) site, but I couldn't find any formal definition or specification of their molecule document format, which, according to you, is supposed to be a superset of GenBank. My only other concern here is this: given Nicos's suggestion of having a function that "AUTO-DETECTS" a file's format, how would we distinguish a Vector NTI file from a GenBank file, given that they have a lot of similarities?

Sean: Where can I get or try out your eFetch code? I've visited your site, but it says "NO FILES AVAILABLE". Am I missing something?
Regards,
Serge

--

On Fri, 25 Apr 2003 09:59:52 mail-lists+biophpdev wrote:
>On Friday 25 April 2003 09:19 am, nicos@itsa.ucsf.edu wrote:
>> [...]I enclose a tar.gz file with the
>> code so that you can have a look (don't know if it makes it through the
>> mailing list, not a good idea to include a file, but....)
>
>Postings over 40k currently "pause" in the queue with a message to the list
>administrator, who can approve or reject it. I just approved it, naturally...
>
>I like this idea, though the individual format parsers might end up
>being classes themselves (and "enclosed" within the "wrapper" parser
>class).
>
>Of course, the really difficult part may be:
>> function autodetect () // figures out what seqfiletype this file is,
>
>Then again, that depends on how many different formats we want to be
>able to auto-detect. It may also be worthwhile to have "forced" format
>parsing enabled (e.g. the ability to directly call a particular parser without
>going through auto-detection, in case auto-detection proves problematic
>for some formats).
>
>> function parse_ANSI ()
>
>(Shouldn't that be "ASN.1"?)
>
>I have FASTA and Clustal (.aln) parsers in the module code section already, if
>those are helpful at all.
>
>> The SQL stuff could be made self-contained in a similar fashion. I would
>> strongly advise though to stop using the direct MySQL calls but instead
>> immediately start using a database abstraction layer like adodb (my
>> favorite, I can help out with this one) or PEAR (might finally be usable).
>
>I would personally vote for PEAR, mainly to minimize dependencies on
>"non-default" components. Not that I would MANDATE it, even if I thought
>I could get away with it...
>
>
>_______________________________________________
>Biophp-dev mailing list
>Biophp-dev@bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/biophp-dev
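For illustration only, the auto-detection idea discussed in this thread, together with the "forced" format option, might be sketched roughly as below. This is not code from GenePHP/BioPHP. It is written in Python as a neutral sketch, and every name in it (`DETECTORS`, `PARSERS`, `autodetect`, `parse`, `parse_fasta`) is hypothetical. It assumes only three well-known first-line signatures: FASTA headers start with `>`, GenBank records start with `LOCUS`, and Clustal alignments start with `CLUSTAL`. Only a FASTA parser is filled in.

```python
from typing import Callable, Dict, List, Optional, Tuple


def _first_line(text: str) -> str:
    """Return the first non-blank line, stripped, or '' if there is none."""
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header: Optional[str] = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical detector table: tried in order, each one looks only at the
# first non-blank line of the file.
DETECTORS: List[Tuple[str, Callable[[str], bool]]] = [
    ("fasta", lambda first: first.startswith(">")),
    ("genbank", lambda first: first.startswith("LOCUS")),
    ("clustal", lambda first: first.startswith("CLUSTAL")),
]

# Hypothetical parser registry; only FASTA is implemented in this sketch.
PARSERS: Dict[str, Callable[[str], List[Tuple[str, str]]]] = {
    "fasta": parse_fasta,
}


def autodetect(text: str) -> Optional[str]:
    """Guess the format name from the first line, or return None."""
    first = _first_line(text)
    for name, looks_like in DETECTORS:
        if looks_like(first):
            return name
    return None


def parse(text: str, fmt: Optional[str] = None) -> List[Tuple[str, str]]:
    """Parse text; passing fmt "forces" a parser and skips auto-detection."""
    chosen = fmt if fmt is not None else autodetect(text)
    if chosen is None:
        raise ValueError("could not auto-detect the sequence file format")
    if chosen not in PARSERS:
        raise ValueError(f"no parser registered for format {chosen!r}")
    return PARSERS[chosen](text)
```

A first-line check like this cannot separate two formats that open the same way, which is the Vector NTI versus GenBank concern raised above. In this sketch the caller would sidestep that by passing `fmt` explicitly, for example `parse(data, fmt="fasta")`.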