-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Depending on how you count, there are either 2 or 3 'modules' that all
go together to make up the import capabilities: seqIOimport itself, the
'specific file type' parser module, and (depending on how you count)
seqFactory. There's very little that needs to be done to seqIOimport
(and nothing for seqFactory) to add a new import module.

There are only a few requirements for a 'specific file type' module
(such as the locuslink parser you are working on):

1.) The class needs to be able to accept a file name, a file handle, or
    "text" (which, I suppose, could actually be binary data) as an input
    source. (This is so that we can handle data from a network socket
    connected to a server, http:// or ftp:// URLs, files on the local
    hard drive, or data already read into memory from other sources.)

2.) The class needs to accept the input source on instantiation
    (i.e. $parser = new locuslink_import($input_source) ).

3.) The class SHOULD have a "setSource()" interface, which sets or
    changes the input source. (seqIOimport doesn't currently use this,
    but it could in the future, e.g. for parsing multiple files in one
    shot.)

4.) The class MUST have a fetchNext() interface, which returns an
    associative array with the next parsed sequence data
    (e.g. 'id'=>'(name of sequence)','sequence'=>'ACGTACGTACGT...').
    We're using this type of 'generic' associative array as a format
    for exchanging sequence information between modules so as to make
    the individual modules usable by themselves (i.e. you can use the
    fasta parser module all by itself [outside of seqIOimport] without
    knowing anything about the seq class format...).

5.) When imported into the BioPHP framework, it goes into the 'parsers'
    section, named (filetype).inc.php (e.g. "swissprot.inc.php").

That last requirement is just so that it can be found and auto-loaded
by the seqIOimport module.
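
To make requirements 1 to 4 a bit more concrete, here is a rough sketch
of what such a parser class could look like. It is NOT actual BioPHP
code. Only the names locuslink_import, setSource() and fetchNext() come
from the description above; everything else is an assumption made for
illustration. In particular, the '>>' record marker and the 'KEY: value'
lines are a placeholder format rather than the real LocusLink layout,
returning false when the data runs out is my own choice, and the syntax
(__construct, visibility keywords) is that of current PHP.

```php
<?php
// Hypothetical skeleton of a 'specific file type' parser (not BioPHP code).
class locuslink_import
{
    private $lines = array(); // source data, one line per element
    private $pos = 0;         // index of the next unread line

    // Requirement 2: accept the input source on instantiation.
    public function __construct($source = null)
    {
        if ($source !== null) {
            $this->setSource($source);
        }
    }

    // Requirements 1 and 3: accept a file handle, a file name / URL, or raw text.
    public function setSource($source)
    {
        if (is_resource($source)) {
            $text = stream_get_contents($source);          // open handle or socket
        } elseif (is_string($source) && strpos($source, "\n") === false
            && (is_file($source) || preg_match('#^(https?|ftp)://#', $source))) {
            $text = file_get_contents($source);            // file name or URL
        } else {
            $text = (string) $source;                      // data already in memory
        }
        $this->lines = preg_split('/\r\n|\r|\n/', (string) $text);
        $this->pos = 0;
    }

    // Requirement 4: return the next record as an associative array.
    // Assumption: false is returned once no records are left.
    public function fetchNext()
    {
        $record = null;
        while ($this->pos < count($this->lines)) {
            $line = rtrim($this->lines[$this->pos]);
            if (strpos($line, '>>') === 0) {               // placeholder record marker
                if ($record !== null) {
                    break;                                 // next record starts here
                }
                $record = array('id' => trim(substr($line, 2)));
            } elseif ($record !== null && preg_match('/^(\w+):\s*(.*)$/', $line, $m)) {
                $record[strtolower($m[1])] = $m[2];        // 'KEY: value' field
            }
            $this->pos++;
        }
        return $record === null ? false : $record;
    }
}

// Usage: the parser works on its own, without seqIOimport or seqFactory.
$parser = new locuslink_import(">>1\nSEQUENCE: ACGT\n>>2\nSEQUENCE: TTGA\n");
while (($seq = $parser->fetchNext()) !== false) {
    echo $seq['id'], ' => ', $seq['sequence'], "\n";
}
```

Because the class only ever hands back plain associative arrays, it can
be exercised directly in a script like this, which is the point of
requirement 4.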

seqIOimport is only a 'go-between' - it handles (where possible)
auto-detection of filetypes, calls the appropriate parser, and acts as
a frontend to the parsed sequence data (it can either return the 'raw'
associative array results from the 'filetype' parsers, or it can pass
the data to 'seqFactory', which is in charge of generating seq objects
from the data).

Adding a new filetype parser to seqIOimport takes only one to three
additional steps:

1.) REQUIRED - add the name of the filetype (e.g. 'locuslink') to the
    list of recognized filetypes.
    ( $this->seqfiletypes=array('fasta','clustal','lasergene','pdraw','genbank','swissprot'); )

2.) OPTIONAL (but desirable) - add the 'file extension' to the 'detect
    filetype by filename' feature, if applicable (the typeByName($name)
    method).

3.) OPTIONAL (but desirable) - add the pattern of the first line of
    data by which seqIOimport can recognize the type of data (the
    autodetect() method).

Everything's been designed as much as possible so far such that each
individual component needs to know only the barest minimum about the
other components. seqIOimport only needs to know 'call the filetype
parser with the data source' and 'call fetchNext() to get the next
sequence' (and to call seqFactory to generate sequence objects), and
that's it. The filetype parser only needs to know that it's getting a
data source on instantiation, and that it needs to respond to
'fetchNext()' with the next parsed sequence's information. seqFactory
only needs to know that it's getting an associative array (and what
common terms will be in the array) and how to feed that info to the seq
object.

It's hoped that this will make it very easy for people to pop in and
contribute (in this case) import modules, since you don't need to
'learn' the rest of the modules to do so.

Does any of this help?....

P.S.
to answer your SPECIFIC question - $flines is the data read from the
source passed to the swissprot parser. The swissprot parser has no
knowledge at all of the existence of the seqIOimport module that loads
it (and, indeed, might conceivably be called directly in a script
rather than through the seqIOimport 'wrapper').

(I note that the version of the parser that I'm looking at reads:

    while ( list($no, $linestr) = each($sourcelines) ) {

so you probably do have a slightly older version.)

I've probably mangled this whole explanation, so please feel free to
ask me what the heck I mean :-)

Sean

On Friday 19 March 2004 03:24 am, Frederic.Fleche@aventis.com wrote:
> Hello all,
>
> I am planning to do a locuslink-file parser.
> So I read the swissprot parser in order to get some good ideas.
> Since my knowledge in php is not as good as yours I have a newbie question
> concerning the following line of the function parse_swissprot
>
> while (list($no, $linestr) = each($flines))
>
> if $flines is from $seqIOimport->flines, I understand cause it is an array
>
> if $flines is from $seqIOimport->fp, I don't understand cause it is a file
> handle or does it work in the same way ?

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.9.5 (GNU/Linux)

iD8DBQFAXToPJ6yQLhNTzSkRAnkUAKCvpA7cqQDaMnm0sJFZ4RX1lQ42ZACdFtE6
Kv1WWSpIElN2YxreLYT5avc=
=CT1a
-----END PGP SIGNATURE-----