chris dagdigian wrote:

> Hi folks,
>
> I thought these problems were long past me with modern kernels and
> filesystems -- but of course not!!!
>
> We as a community have learned to deal with uncompressed sequence
> databases that are greater than 2gb -- it's pretty simple to gzcat
> the file and pipe it through formatdb via STDIN to avoid having to
> uncompress the database file at all.

Sorta -- you are still writing the uncompressed data, just to a pipe
handle instead of a file handle, but you don't need the disk space for
the uncompressed file... in theory.

> Now however I've got a problem that the compressed archive file that
> someone is trying to download is greater than 2gb in size :)

... and ...

> The database in question is:
>
> ftp://ftp.ncbi.nlm.nih.gov/blast/db/FormattedDatabases/htgs.tar.gz
>
> The file is mirrored via 'wget' and a cron script and has recently
> started core dumping. An ftp session for this file also seemed to
> bomb out, but I have not verified this fully.
>
> I did the usual things that one does: verified that the wget binary
> core dumps regardless of what shell one is using (Joe Landman found
> this issue a while ago...).

Most shells are not compiled with the LFS options set by default (I
don't know if this has changed....). I have taken to (defensively)
recompiling the shell by hand.

> I also verified that the error occurs when downloading to an NFS
> mounted NetApp filesystem as well as a local ext3 formatted
> filesystem. The node is running Redhat 7.2 with a 2.4.18-18.7
> kernel.

RH7.2 had a few problems with large files. The shells needed
recompilation, as did a few tools.

> Next step was to recompile 'wget' from the source tarball with the
> usual "-D_ENABLE_64_BIT_OFFSET" and "-D_LARGE_FILES" compiler
> directives. Still no love. The wget binary still fails once the
> downloaded file gets a little larger than 2gb in size.
>
> Anyone seen this before? What FTP or HTTP download clients are
> people using to download large files?

OK, the usual suspects:

 . the shell
 . some library wget is using (do an ldd /usr/bin/wget)
 . wget itself (using an int or a long for the byte counter, or for a
   seek, or ...)

(see the P.S. below for a couple of quick ways to test each of these)

I'll try something here. I assume they are doing

  wget url

and not

  wget -O - url | some_other_command

or

  wget --output-document=- url | some_other_command

Joe

> -Chris

--
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615
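
P.S. A few concrete bits for the archives. The gzcat trick Chris
mentions looks roughly like this -- a sketch only; the filename and
the formatdb switches (database name, -p for protein vs. nucleotide)
are placeholders that depend on which database you are feeding it, and
if I remember right "stdin" as the -i argument is what tells formatdb
to read standard input:

  # stream the compressed FASTA through formatdb without ever
  # writing the uncompressed file to disk
  gzcat htgs.fa.gz | formatdb -i stdin -n htgs -p F -o T

The only disk you need is the compressed file plus formatdb's output,
not the uncompressed FASTA.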
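
A cheap way to find which link in the chain breaks, without pulling
2+ gb over the wire: make a sparse file just past the 2 gb mark with
dd and push it through the suspect tools (filenames here are just
placeholders):

  # seek 2048 MB into the file, then write one 1 MB block; the file
  # is sparse, so it costs almost no real disk space
  dd if=/dev/zero of=bigfile bs=1024k seek=2048 count=1

  # can cat (and the shell doing the redirect) get past 2 gb?
  cat bigfile > bigcopy
  ls -l bigfile bigcopy

If bigcopy stalls at 2147483647 bytes, whatever did the open()/write()
on it was built without large file support.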
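
On the compile flags: with glibc the spellings that actually matter
are -D_LARGEFILE_SOURCE and -D_FILE_OFFSET_BITS=64; as far as I know
glibc ignores _ENABLE_64_BIT_OFFSET and _LARGE_FILES. A sketch of the
library check plus a rebuild (the version number is just whatever
tarball you have unpacked):

  # which libraries is the current binary pulling in?
  ldd /usr/bin/wget

  # rebuild with the glibc LFS macros
  cd wget-1.8.2
  CFLAGS="-O2 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64" ./configure
  make

The caveat: if wget itself keeps its byte counter in a long (suspect
#3 above), no compile flag will save you, and you would have to patch
the source or switch to another client.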