[Bioclusters] ack. I'm getting bitten by the 2gb filesize problem on a linux cluster...

Jeffrey B. Layton bioclusters@bioinformatics.org
Thu, 30 Jan 2003 16:10:55 -0500


Chris Dagdigian wrote:

> Hi folks,
>
> I thought these problems were long past me with modern kernels and 
> filesystems --
>
> We as a community have learned to deal with uncompressed sequence 
> databases that are greater than 2gb -- it's pretty simple to gzcat the 
> file and pipe it through formatdb via STDIN to avoid having to 
> uncompress the database file at all.
>
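
For the archives, that recipe looks roughly like the line below -- the
flags are from memory and the nt.gz filename is just an example, so
check it against README.formatdb before trusting it:

    # stream a compressed FASTA database straight into formatdb,
    # so an uncompressed copy never hits the disk
    # -p F = nucleotide, -o T = parse SeqIds, -n = basename for output
    gzcat nt.gz | formatdb -i stdin -p F -o T -n nt
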
> Now, however, I've got a problem: the compressed archive file that 
> someone is trying to download is itself greater than 2gb in size :)
>
> The database in question is:
>
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/FormattedDatabases/htgs.tar.gz
>
> The file is mirrored via 'wget' and a cron script, and the wget run has 
> recently started core dumping. An ftp session for this file also seemed 
> to bomb out, but I have not verified this fully.
>
> I did the usual things that one does: verified that the wget binary 
> core dumps regardless of which shell one is using (Joe Landman found 
> this issue a while ago...), and verified that the error occurs when 
> downloading to an NFS-mounted NetApp filesystem as well as to a local 
> ext3-formatted filesystem. The node is running Red Hat 7.2 with a 
> 2.4.18-18.7 kernel.
>
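
One more data point worth collecting: take wget out of the picture and
check whether the node can write a >2gb file at all. Something like the
following (file name and size are just placeholders), run in the same
directory wget writes to, should tell you whether the limit sits below
wget or inside the wget binary itself:

    # write ~3gb of zeros; if this errors out right around 2147483647
    # bytes the limit is in the kernel/glibc/filesystem stack, and if
    # it completes then wget itself is the likely culprit
    dd if=/dev/zero of=bigfile bs=1024k count=3000
    ls -l bigfile
    rm bigfile
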
> Next step was to recompile 'wget' from the source tarball with the 
> usual "-D_ENABLE_64_BIT_OFFSET" and "-D_LARGE_FILES" compiler 
> directives.
>
> Still no love. The wget binary still fails once the downloaded file 
> gets a little larger than 2gb in size.
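
For what it's worth, the defines I would expect to matter on a glibc
system are the LFS feature-test macros rather than those two. Something
along these lines is what I would try (untested against that wget tree):

    # rebuild with the transitional LFS macros so off_t and friends
    # become 64-bit and open() maps to open64()
    CPPFLAGS="-D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" ./configure
    make

That said, if wget keeps its byte counters in a plain long internally (I
have not checked the source), no amount of -D flags will help and the
fix has to happen in the code.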


What shell are you running? What filesystem? (And was the wget binary 
built under a 2.2 kernel?)
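
You can grab all of that in one shot with something like:

    echo $SHELL          # login shell
    uname -r             # running kernel
    df -T .              # filesystem type under the download directory
    rpm -q wget glibc    # package versions, if these are the stock RPMs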

Jeff

>
>
> Anyone seen this before? What FTP or HTTP download clients are people 
> using to download large files?
>
> -Chris