[Bioclusters] blast and nfs

Duzlevski, Ognen bioclusters@bioinformatics.org
Mon, 21 Apr 2003 13:58:58 -0500


Hi all,

We have a 40-node cluster (2 CPUs per node) and a cluster master with
storage attached over fibre, a pretty standard setup.

All of the nodes get their shared space from the cluster master over
NFS. I have a user who has set up an experiment that fragmented a
database into 200,000 files, which are then BLASTed against the
standard NCBI databases. Those databases reside on the same shared
space on the cluster master and are visible on the nodes (he basically
rsh-es into all the nodes in a loop and starts jobs). He could probably
go about his business in a better way, but for the sake of optimizing
the setup I am actually glad that the testing is being done this way.

I noticed that the cluster master itself (a 2-CPU machine) is under
heavy load, and most of the load comes from the nfsd threads (we use
the kernel-space NFS server).

Are there any usual tricks or setup models for clusters like this? For
example, all of my nodes mount the shared space with the
rw,async,rsize=8192,wsize=8192 options. How many nfsd threads usually
run on a master node? Any advice on where to put the NCBI databases
relative to the shared space? How would one go about measuring or
observing the bottlenecks?
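
On the nfsd thread question, here is a rough Python sketch of one way to
look at it, under assumptions I have not verified on this setup. The
Linux kernel NFS server exposes counters in /proc/net/rpc/nfsd, and its
"th" line is assumed to hold the thread count, the number of times all
threads were busy, and a busy-time histogram. The function name
parse_nfsd_th is made up for illustration, and the layout of that line
differs between kernel versions, so treat the parsed fields as a hint
rather than a measurement.

```python
def parse_nfsd_th(stats_text):
    """Parse the 'th' line of /proc/net/rpc/nfsd-style text.

    Assumed layout (may differ between kernel versions):
        th <threads> <times_all_busy> <histogram values...>
    Returns (threads, times_all_busy, histogram) or None if no 'th' line.
    """
    for line in stats_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "th" and len(fields) >= 3:
            threads = int(fields[1])
            times_all_busy = int(fields[2])
            histogram = [float(value) for value in fields[3:]]
            return threads, times_all_busy, histogram
    return None


if __name__ == "__main__":
    # On the cluster master; the file is absent if the kernel NFS
    # server is not running, so fall back to None in that case.
    try:
        with open("/proc/net/rpc/nfsd") as stats_file:
            print(parse_nfsd_th(stats_file.read()))
    except OSError:
        print(None)
```

If the second number keeps climbing while the jobs run, that would
suggest the master is short of nfsd threads; if it stays flat, the
bottleneck is more likely the disks or the network.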

Thank you,
Ognen