[Biophp-dev] Re: New question from the Newbie

biophp-dev@bioinformatics.org biophp-dev@bioinformatics.org
Tue, 23 Mar 2004 11:47:03 +0100


Thanks SEAN,

I apologize once again for another question, this time about the piece of
code at the beginning of the file /parsers/swissprot.inc.php:
what is the meaning of the & in &$source? I have never used it before.
Thanks in advance.

Fred

//################class constructor###################

    function parse_swissprot(&$source)
    {
        if($source != "") {
            $this->setSource($source);
        }
    }

-----Original Message-----
From: biophp-dev-admin@bioinformatics.org
[mailto:biophp-dev-admin@bioinformatics.org] On Behalf Of S Clark
Sent: Monday, 22 March 2004 18:58
To: biophp-dev@bioinformatics.org
Subject: Re: [Biophp-dev] Re: New question from the Newbie


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm not Nico, but...

The swissprot.inc.php class handles the 'loading the data into the array'
internally.  If you're reading from a file, just pass the filename
(or the file handle, if you've already opened the file at this point)
to the class, like:

$importer = new parse_swissprot("/home/frederic/swissprot/swissprot-data.swp");

See the setSource() and readfromFile() methods for the actual steps
that this parser takes to pull the lines of data into the internal
array...

(See also the readRecord() 'wrapper' method that decides what methods to
call based on whether the data given to swissprot parser was a pre-read
string of text or a file)
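
To make the 'wrapper' idea concrete, here is a minimal sketch of that kind
of decision. It is not the actual readRecord() code: the function name
describe_source() and the rules it uses to tell a file from text are
assumptions made up for illustration.

```php
<?php
// Hypothetical sketch, NOT the BioPHP implementation: classify a source so
// that a wrapper can choose between file reading and text reading.
function describe_source($source)
{
    if (is_resource($source)) {
        return "handle";            // an already-opened file handle
    }
    // Assumption: a single-line string naming an existing file is a filename.
    if (is_string($source) && strpos($source, "\n") === false && is_file($source)) {
        return "file";
    }
    return "text";                  // anything else is treated as pre-read data
}

// Usage: the wrapper would branch on the result.
$kind = describe_source("ID   TEST1\n//\n");   // multi-line string of record text
```

A real parser may well use different rules (for instance an explicit flag
set by setSource()), so treat the checks above as one possible approach.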

I think most of the parsers work this way: if given a file rather than
text data, the parser will read and process one record at a time from the
file rather than loading the entire file into memory, hence the need for
a branch in the code between file reading and text reading.  This is just
to make it possible to read really big files.  Think of the extreme example
of someone wanting to download the entire Genbank database and read through
it, saving data from only certain types of records that match.  Who'd want
to load 3GB of text into their system's memory before they could start
parsing?

But doing this isn't absolutely necessary, and in some cases it is
impossible: clustal files have the sequence data interleaved, so you HAVE
to read the entire file into memory before you can get any of the complete
sequences parsed from them.
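
As a rough illustration of the one-record-at-a-time approach, here is a
minimal sketch. The helper name read_next_record() is hypothetical and not
part of BioPHP; it assumes each record ends with a line containing only
"//", as in Swiss-Prot flat files.

```php
<?php
// Hypothetical helper, NOT part of BioPHP: return the next record from an
// open file handle as an array of lines, or null once the file is exhausted.
// Assumes every record is terminated by a line containing only "//".
function read_next_record($handle)
{
    $lines = array();
    while (($line = fgets($handle)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === "//") {
            return $lines;          // end of the current record
        }
        $lines[] = $line;
    }
    // End of file: hand back a trailing unterminated record if there is one.
    return count($lines) > 0 ? $lines : null;
}

// Usage: an in-memory stream stands in for a real file here.
$handle = fopen("php://memory", "w+");
fwrite($handle, "ID   TEST1\nAC   P00001;\n//\nID   TEST2\nAC   P00002;\n//\n");
rewind($handle);

$ids = array();
while (($record = read_next_record($handle)) !== null) {
    $ids[] = $record[0];            // keep only the ID line of each record
}
fclose($handle);
```

Only the lines of the current record are held at any moment, so memory use
depends on the largest record rather than on the size of the file.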

If locuslink files are never going to be very large, it's probably easiest
just to go ahead and 'internally' have the locuslink parser read the entire
file into memory and just work directly from the text.  (See the clustal
parser for an example of this).
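
For comparison, a minimal sketch of the read-everything-first approach,
using interleaved clustal-style text as the example. This is not the BioPHP
clustal parser: join_interleaved() is a hypothetical name, and the sketch
assumes a "CLUSTAL" header line, blank lines between blocks, and
conservation lines that start with whitespace.

```php
<?php
// Hypothetical sketch, NOT the BioPHP clustal parser: take the whole
// alignment text and join each sequence's chunks across interleaved blocks.
function join_interleaved($text)
{
    $sequences = array();
    foreach (preg_split('/\r?\n/', $text) as $line) {
        // Skip blank lines, the header, and conservation lines (leading space).
        if ($line === "" || strncmp($line, "CLUSTAL", 7) === 0 || preg_match('/^\s/', $line)) {
            continue;
        }
        $parts = preg_split('/\s+/', trim($line));
        if (count($parts) < 2) {
            continue;               // not a "name chunk" line
        }
        list($name, $chunk) = $parts;
        if (!isset($sequences[$name])) {
            $sequences[$name] = "";
        }
        $sequences[$name] .= $chunk; // a sequence is complete only after the last block
    }
    return $sequences;
}

// Usage: the text would normally come from file_get_contents($filename).
$text = "CLUSTAL W (1.82) multiple sequence alignment\n\n"
      . "seqA   ACGT\nseqB   AC-T\n       ** *\n\n"
      . "seqA   GGCC\nseqB   GGCA\n       *** \n";
$alignment = join_interleaved($text);
```

No sequence is finished until the final block has been seen, which is why
this format cannot be processed one record at a time.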

Sean

On Monday 22 March 2004 09:53 am, Frederic.Fleche@aventis.com wrote:
> Hi Nico,
>
> The line was #166 of the file parsers/swissprot_inc.php that you talked
> about in your 19 March e-mail.
> So if $sourcelines is an array, I understand how that loop works, but I
> don't understand where the step is that puts all my swissprot file (because
> I use a file) into this array. So if you could tell me the file and the
> line of this step it would be great.
>
> Thanks a lot.
>
> Fred
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.9.5 (GNU/Linux)

iD8DBQFAXykYJ6yQLhNTzSkRAu2uAJ9pcUq7NJYuBv5HiptdgEz0jvTNsACeJQwB
im16tKct0x72kL6hKASD5II=
=l+4z
-----END PGP SIGNATURE-----
_______________________________________________
Biophp-dev mailing list
Biophp-dev@bioinformatics.org
https://bioinformatics.org/mailman/listinfo/biophp-dev