Interesting idea, Gilbert. I tried the rsync commands first on our server:

time rsync rsync://bio-mirror.net/biomirror/blast/env_nr.tar.gz . -au

real 0m58.704s
user 0m0.735s
sys 0m1.009s

Then with --no-whole-file:

time rsync rsync://bio-mirror.net/biomirror/blast/env_nr.tar.gz . -au --no-whole-file

real 0m0.847s
user 0m0.002s
sys 0m0.004s

With wget:

real 0m49.270s
user 0m0.320s
sys 0m1.506s

Then wget with --mirror:

real 0m1.085s
user 0m0.004s
sys 0m0.001s

rsync seems to be faster with --no-whole-file than wget with --mirror.
I'm going to have to check exactly what --mirror does with wget: whether
it downloads only the changes or the entire file again.

Thanks for telling me about this!

Don Gilbert said:
>
> Jeremy,
>
> This may be a common misunderstanding of the value of rsync. rsync's
> 'delta transmission' of changed records comes at a high cost of disk
> file checksumming: basically your computer checksums all the blocks of
> each file, sends to rsync server, which does same file checksum, and
> then sends only changed blocks. This reduces network transport, but
> at cost of lots of disk access and CPU computation (on both server and
> client).
>
> For a busy data server, rsync costs much more time (in disk, cpu use)
> and actually gets the file to you more slowly than simple but robust
> FTP.
>
> Rsync typically uses a full CPU for minutes / file, while FTP is very
> lightweight on the server. IUBio/Bio-mirror can support multiple FTP
> processes from one client (I recommend no more than 15). Using
> multiple Rsync processes from the same client is a not-nice thing due
> to high cpu/disk cost to server.
>
> Try this test with a 100+ MB file:
> /usr/bin/time rsync rsync://bio-mirror.net/biomirror/blast/env_nr.tar.gz . -au
> 51.51 real
> touch -t 200109110825 env_nr.tar.gz
> /usr/bin/time rsync rsync://bio-mirror.net/biomirror/blast/env_nr.tar.gz . -au --no-whole-file
> 58.87 real 2.53 user 1.52 sys
>
> /usr/bin/time wget -nv -nH --mirror ftp://bio-mirror.net/biomirror/blast/env_nr.tar.gz
> 7.30 real 0.04 user 1.56 sys
> touch -t 200109110825 biomirror/blast/env_nr.tar.gz
> /usr/bin/time wget -nv -nH --mirror ftp://bio-mirror.net/biomirror/blast/env_nr.tar.gz
> 6.61 real 0.02 user 1.72 sys
>
> (my times are on local GB ethernet)
> -- Don
> -- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
> -- gilbertd at indiana.edu--http://marmot.bio.indiana.edu/
> _______________________________________________
> Bioclusters maillist - Bioclusters at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
>

--
Jeremy Mann
jeremy at biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672
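
To make the block checksumming Don describes a bit more concrete, here is a minimal local sketch. It splits two copies of a file into fixed-size blocks, checksums every block, and counts how many block positions differ. The function names block_sums and changed_blocks are hypothetical helpers made up for illustration, and this is a simplification: real rsync uses rolling checksums so it can also match data that has shifted position, which this fixed-offset comparison does not attempt.

```shell
#!/bin/sh
# Sketch only: fixed-offset block comparison of two local files.
# block_sums and changed_blocks are hypothetical names, not part of rsync.

# block_sums FILE BLOCK_BYTES: print one checksum line per block of FILE.
block_sums() {
    tmpdir=$(mktemp -d) || return 1
    split -b "$2" "$1" "$tmpdir/blk."
    # The glob expands in sorted order, which matches split's suffix order.
    for blk in "$tmpdir"/blk.*; do
        [ -f "$blk" ] && cksum < "$blk"
    done
    rm -rf "$tmpdir"
}

# changed_blocks OLD NEW BLOCK_BYTES: print the number of differing blocks.
changed_blocks() {
    old_sums=$(mktemp) || return 1
    new_sums=$(mktemp) || return 1
    block_sums "$1" "$3" > "$old_sums"
    block_sums "$2" "$3" > "$new_sums"
    # Pair up the checksum lines by position and count mismatches;
    # a block present in only one file also counts as changed.
    paste -d'|' "$old_sums" "$new_sums" |
        awk -F'|' '$1 != $2 { n++ } END { print n + 0 }'
    rm -f "$old_sums" "$new_sums"
}
```

For example, `changed_blocks old/env_nr.tar.gz env_nr.tar.gz 1048576` would report how many 1 MB blocks differ between two local copies (the old/ path is just a placeholder). Note that every byte of both files has to be read and checksummed before anything can be skipped, which is the disk and CPU cost Don is pointing at.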