[Bioclusters] Nomenclature (was Re: Call for information.)

Octavio Martinez de la Vega bioclusters@bioinformatics.org
Thu, 18 Apr 2002 13:43:39 -0500

Hi Ivo,

I am beginning to tune a cluster for blast (we will be blasting ESTs against
nt, nr and est_others, so all my tests are run against those three databases).

> - the hardware platform,

A cluster of 16 slave nodes, each a dual P3 at 1.13 GHz, plus a "master" dual
P3 at the same speed. All machines will have 4 GB of RAM, but at the moment we
are having problems with the memory (it is very hot and some nodes are going
down). The cluster is interconnected with a 1 Gb/s switch.

The software:
We are running Linux (Red Hat) and OSCAR, including of course PBS.

So far I have found that the optimum parameters for my system are:

1st. (and most important) Group the sequences in packets of 6 sequences
(we will be receiving 96 sequences from the sequencer each time, and 96/6 = 16
equals the number of nodes, so it divides nicely). Watching the cluster with
top and ganglia, I found that the "latency" times (times when the CPUs in each
node are idle) are very large, even though the switch is 1 Gb/s (I have no
idea why), so packing the sequences cut the wall time to less than half.

2nd. I instruct PBS to use one node with both processors:
(#PBS -l nodes=1:ppn=2) and give maximum priority to the process:
(#PBS -p +1023).

3rd. I launch blast with "nice -20" and tell blast to use two (virtual?)
processors: "-a 2". I found that giving it more "processors" makes it slower
(with -a > 2 blast just opens new processes on the node, but that does not
help with wall-clock speed).
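
The three steps above could be sketched roughly as follows. This is only an
illustration in Python, not the test script offered below. The function names,
the packet file name and the output file name are hypothetical; the PBS
directives, "nice -20" and "-a 2" are copied from the steps above, and the
-p/-d/-i/-o options are the usual blastall ones. The "blastn" default is an
assumption and would need changing for a protein database such as nr.

```python
from typing import List


def read_fasta(text: str) -> List[str]:
    """Split FASTA text into records; each record starts with a '>' line."""
    records: List[str] = []
    current: List[str] = []
    for line in text.splitlines():
        if line.startswith(">") and current:
            # A new header closes the record collected so far.
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def make_packets(records: List[str], size: int = 6) -> List[List[str]]:
    """Group records into consecutive packets of at most `size` sequences."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def pbs_script(packet_file: str, database: str,
               program: str = "blastn", threads: int = 2) -> str:
    """Build the text of one PBS job that blasts one packet on one node."""
    lines = [
        "#!/bin/sh",
        "#PBS -l nodes=1:ppn=2",   # one node, both processors (step 2)
        "#PBS -p +1023",           # maximum job priority (step 2)
        "cd $PBS_O_WORKDIR",       # directory the job was submitted from
        # "nice -20" and "-a 2" as quoted in step 3; the output name is made up
        f"nice -20 blastall -p {program} -d {database} "
        f"-i {packet_file} -o {packet_file}.out -a {threads}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # 96 dummy sequences stand in for one batch from the sequencer.
    batch = "".join(f">seq{i}\nACGT\n" for i in range(96))
    packets = make_packets(read_fasta(batch), size=6)
    print(len(packets), "packets")
    print(pbs_script("packet_00.fasta", "nt"))
```

With 96 sequences and packets of 6 this gives 16 job scripts, one per slave
node; each script would be submitted separately with qsub.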

We are still testing, and I am writing the scripts to go into production
soon; the time of the whole process seems very good, but any suggestions or
comments will be very welcome.

If anybody is interested I can send my test script (with comments in Spanish,
I am afraid).

If somebody has a "standard" set of sequences, it would be great to begin
comparing times...


PS. I apologise for the spelling mistakes (I do not have an English speller 
in this machine).

Dr. Octavio Martínez de la Vega
Profesor Titular
Departamento de Ingeniería Genética
CINVESTAV - Unidad Irapuato.
Km. 9.6 Libramiento Norte,
Carretera Irapuato-León
Apartado Postal 629
CP 36500 Irapuato, México.
Direct line: 462 39649
Switchboard: 462 39600

> Could -- in the meantime -- some other people from the list just
> publish their "small and not complete and not systematic" benchmark
> results?  All of those results together could give (all of) us a better
> picture than each individual test alone.  Just specify in detail
> - the hardware platform,
> - the compiler & compiler options, and
> - the Blast parameters (which set of query sequences?, which database?,
> repeat-masked or not?, which E-value?, etc.).
> That would be super.  Thanks!
> Ivo