[Bioclusters] Re: blast and nfs

Bruce O'Neel bioclusters@bioinformatics.org
Thu, 24 Apr 2003 16:55:24 +0200 (MEST)


Hi,

I thought that I'd emphasize a few things that Chris and Joseph have
already said.

Except for a few small subfields, scientific computing tends to be I/O
bound.  As already pointed out, feeding a lot of data through what is
basically a fast serial connection is a bad idea.  If you use 100
megabit ethernet you max out somewhere around 40 megabits or so
because you can't use the full channel bandwidth.  This is somewhere
around 4 to 5 megabytes per second, which most of you will recognize
is way below the low end of one hard disk.  Things only improve by a
factor of 10 or so if you use gigabit ethernet, so that doesn't really
save you either.
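
To put rough numbers on that, here is a small back-of-the-envelope
sketch.  The 40% usable fraction is the estimate from the paragraph
above; applying the same fraction to gigabit, and the 1000 MB database
size, are assumptions of mine for illustration only.

```python
def usable_mbytes_per_sec(link_mbits, efficiency):
    """Nominal link speed in megabits/s -> usable megabytes/s."""
    return link_mbits * efficiency / 8.0


def transfer_seconds(size_mbytes, rate_mbytes_per_sec):
    """Time to push size_mbytes through a link at the given rate."""
    return size_mbytes / rate_mbytes_per_sec


# Assumption: about 40% of the nominal bandwidth is usable (estimate above),
# and the same fraction holds for gigabit.
EFFICIENCY = 0.4
# Assumption: a hypothetical 1000 MB database, purely for illustration.
DB_MBYTES = 1000

for name, mbits in (("100 Mbit", 100), ("gigabit", 1000)):
    rate = usable_mbytes_per_sec(mbits, EFFICIENCY)
    secs = transfer_seconds(DB_MBYTES, rate)
    print("%s: %.1f MB/s, %.0f s per pass over the database"
          % (name, rate, secs))
```

Plug in the measured numbers from your own network and disks; the
point is only that every pass over the data pays this cost again when
the data is not local.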

That, combined with the hard work modern OSes do to cache disks well,
and with cheap IDE hard disks, means that it is almost always a win to
put your data locally.  Disk striping helps even more, but may not
always be necessary and should be tested.

NFS is good for things like login directories where you read small
files once or twice and for source code repositories where you don't
keep re-reading the files.

NFS is very bad for big files since (basically) every 8k bytes or so
requires the file to be reopened on the server, then you have to seek,
then 8k bytes are read, and then the file is closed again.
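
A rough way to see how this adds up, using the simplified
8k-per-request picture above.  The per-request overhead of 0.5 ms is a
made-up figure for illustration, not a measurement; substitute
whatever your own network and server show.

```python
CHUNK_BYTES = 8 * 1024  # the "8k bytes or so" per request from the text


def requests_needed(file_bytes, chunk_bytes=CHUNK_BYTES):
    """Number of chunk-sized requests needed to read the whole file."""
    return (file_bytes + chunk_bytes - 1) // chunk_bytes


def overhead_seconds(file_bytes, per_request_seconds):
    """Total per-request overhead for one sequential pass over the file."""
    return requests_needed(file_bytes) * per_request_seconds


one_gib = 1 << 30
# Assumption: 0.5 ms of overhead per request (hypothetical).
print(requests_needed(one_gib), "requests,",
      overhead_seconds(one_gib, 0.0005), "s of overhead per pass")
```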

To make things worse, some labs then take the incremental approach to
NFS, where, as you add each system, the spare disk space on that system
is dedicated to something and then mounted on all other systems.
This is very bad since, for most work to happen, ALL systems have
to be up and functioning.  Plus you end up with NFS traffic all over
your network.  It does keep your switch busy though :-)
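
The "ALL systems have to be up" problem gets worse quickly as the
cluster grows.  A small sketch, assuming each machine is up
independently with the same probability; the 99% per-machine figure
and the node counts are hypothetical.

```python
def all_up_probability(per_node_availability, node_count):
    """Chance that every node is up at once, assuming independent failures."""
    return per_node_availability ** node_count


# Assumption: each machine is up 99% of the time (hypothetical figure).
for nodes in (1, 10, 20, 50):
    print("%2d nodes: all up %.1f%% of the time"
          % (nodes, 100 * all_up_probability(0.99, nodes)))
```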

Far better is to have a central NFS server for all of your home
directories, and then have your central archives
mirrored/rsynced/whatever to your different compute nodes.
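
One way the mirroring step might look, as a sketch only.  The node
names and paths are hypothetical, and it assumes rsync over a remote
shell is already set up between the central server and the nodes.  It
defaults to a dry run that just prints the commands.

```python
import subprocess


def build_rsync_command(source_dir, node, dest_dir):
    """Build an rsync command that mirrors source_dir onto one node."""
    # Trailing slash: copy the directory's contents, not the directory itself.
    src = source_dir.rstrip("/") + "/"
    # -a preserves permissions and times; --delete removes files on the
    # node that no longer exist in the central copy.
    return ["rsync", "-a", "--delete", src, "%s:%s" % (node, dest_dir)]


def mirror(source_dir, nodes, dest_dir, dry_run=True):
    """Mirror source_dir to every node; only print commands if dry_run."""
    commands = [build_rsync_command(source_dir, n, dest_dir) for n in nodes]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical paths and node names, for illustration only.
    mirror("/data/blastdb", ["node01", "node02"], "/scratch/blastdb")
```

Run from cron on the central server after the archive is updated, this
keeps a local copy on each compute node without any cross-mounting.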

Of course, your mileage may vary since each lab is different.

cheers

bruce

-- 
.. there is no area or function that someone can't try to put together
with bubble gum and bailing wire. -- Strata Chalup

Bruce O'Neel                       phone:  +41 22 950 91 57
INTEGRAL Science Data Centre               +41 22 950 91 00 (switchb.)
Chemin d'Ecogia 16                 fax:    +41 22 950 91 35
CH-1290 VERSOIX                    e-mail: Bruce.Oneel@obs.unige.ch
Switzerland                        WWW:    http://isdc.unige.ch/