On 21 Jun 2005, at 11:18 pm, Glen Otero wrote:

> 1) What is the size of your Linux cluster? (e.g. 128 nodes)

1,237 nodes

> 2) What is the compute node architecture? (e.g. dual 3GHz Xeon, dual Opteron)

Various:

767 x 800 MHz Pentium III
168 x dual 2.8 GHz Xeon
280 x dual 3.2 GHz Xeon EM64T
20 x quad Alpha (various speeds)
1 x 4 CPU 1.5 GHz Altix
1 x 16 CPU 1.6 GHz Altix

A smattering of other oddballs. All of these machines run in a single LSF cluster.

> 3) Do your compute nodes have local hard drives or are they diskless?

All have local drives.

> 4) What is the cluster interconnect? (e.g. GigE)

GigE for the dual CPU blades.
Fast Ethernet for the Pentium III blades.
GigE and MemoryChannel for the Alphas.
SGI NUMAlink for the Itanium boxes.

> 5) What application(s) primarily run on your cluster (e.g. BLAST, HMMER)

BLAST, HMMER, Genewise, R, Exonerate, RepeatMasker, blat, SSAHA

> 6) If BLAST is running on your cluster, describe the type of BLAST jobs (e.g. blastn - genome vs. genome, blastn - genome annotation with ESTs)

Mostly mapping of ESTs and proteins to genomes for Ensembl. Genome-genome comparisons are done with BLAT.

> 7) What type of cluster filesystem is in use on your cluster? (e.g. NFS, GFS, proprietary)

The Pentium III machines all use NFS (indeed, all the machines have some NFS access).
The Alphas use Tru64 TruCluster's filesystem.
The dual CPU IBM blades currently use IBM GPFS, but we are also working on deploying Lustre.

> 8) If you use NFS, is your NFS server the same as the cluster head node or is it a separate server?

The same. The Alpha clusters are the head nodes, and the primary NFS servers.

> 9) If you use NFS, what type of machine is your NFS server? (e.g. dual Xeon Linux box w/ x number of yGB hard drives)

4-8 node quad CPU Alpha TruClusters, plus a couple of SGI Altixen.

> 10) What is the ratio of NFS servers to compute nodes in your cluster?
Depends on how you measure it, because the TruClusters appear to be single hosts, even though they all share the NFS load. There are 22 primary NFS servers, but if you don't count the TruCluster nodes as separate, this drops to 5.

> 11) How do you benchmark the throughput of your NFS server, i.e. what application(s) do you use to stress the NFS server and what tool(s) do you use to measure throughput?

We don't benchmark it. We advise users to avoid accessing data over NFS whenever possible. The compute nodes have quite limited access to NFS.

> 12) If you have a NAS/SAN, what type of machine(s) are they? (e.g. NetApp w/ x number of yGB hard drives)

We have several HP StorageWorks HSV110 arrays, and I have lost count of the number of spindles; they vary in capacity from 72GB up to 300GB per spindle. Currently we have about 20 TB of SAN storage coupled to this cluster. The rest of Sanger's SAN is also available over NFS to some of the nodes, and that's pushing half a petabyte total. The local disks on the dual CPU blades are donated to GPFS filesystems, totalling about 11 TB of GPFS storage.

> 13) What is the ratio of NAS servers to compute nodes in your cluster?

Again, depends on how you count it, but it's probably in the region of 60:1.

> 14) What is the throughput of your NAS server(s)/SAN?

What do you mean by throughput? How many MB/sec they can sustain? 40MB/sec is common.

> 15) How do you benchmark the throughput of your NAS server/SAN, i.e. what application(s) do you use to stress the servers and what tool(s) do you use to measure throughput?

bonnie++

> 16) Are you satisfied overall with your cluster filesystem solution?

Yes.

> 17) What are the two biggest problems you have with your cluster filesystem solution?

1. Putting it back together when it falls over.
2. Binary-only kernel modules are evil.

> 18) With regard to cluster filesystems, what would you do differently when building your next cluster? (e.g. increase NFS servers/compute node ratio, different filesystem)

Get rid of NFS altogether, if at all possible. It's the single biggest cause of cluster failure, bar none. We're also quite happy building GPFS filesystems using just the local drives on the machines. It's cheap, and the performance is quite satisfactory.

> 19) What NFS server/compute node or NAS-SAN/compute node ratio will you aim for with the next cluster you build or upgrade?

We don't have enough experience yet with our newest hardware to re-evaluate that.

> 20) If necessary, do you mind if Glen emails you offline to seek clarification on some of your answers? (Answering "Yes" is not necessary to be eligible for a gift certificate).

Feel free.

Tim

--
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233