[Bioclusters] parallel blast???

Chris Dagdigian bioclusters@bioinformatics.org
Mon, 16 Sep 2002 20:36:43 -0400

Be careful with your benchmarks, as they can be meaningless or
misleading. You will find that the speed of blast distributed within a
cluster or compute farm is directly related to two things: (a) the
amount of physical memory in the compute nodes and (b) the speed of
your storage or disk I/O system.
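
As a rough illustration of point (a), here is a minimal Python sketch
that compares the on-disk size of a formatted database with the
physical memory of a node. The function names, the idea of selecting
files by a name prefix, and the 0.75 "usable fraction" are assumptions
made for the example, not measured values or part of any blast tool.

```python
import os


def db_size_bytes(db_dir, prefix):
    """Total size of the regular files in db_dir whose names start with prefix."""
    total = 0
    for name in os.listdir(db_dir):
        path = os.path.join(db_dir, name)
        if name.startswith(prefix) and os.path.isfile(path):
            total += os.path.getsize(path)
    return total


def physical_ram_bytes():
    """Physical memory as reported by sysconf (available on Linux)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")


def fits_in_ram(db_bytes, ram_bytes, usable_fraction=0.75):
    """True if the database fits in the share of RAM assumed free for file caching.

    usable_fraction is a guess (assumption); tune it for your own nodes.
    """
    return db_bytes <= ram_bytes * usable_fraction
```

On a node you might call fits_in_ram(db_size_bytes("/data/blastdb",
"nt."), physical_ram_bytes()), where the path and prefix are
placeholders. If the answer is no, repeated searches will keep going
back to disk, and your benchmark is mostly measuring storage.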

You can have the fastest server on earth, but if you are searching
with blast against an NFS-mounted database and your network or
fileserver is slow, then your blast search speeds will be horrible.
Give me a small
number of speedy linux boxes and I can bring a $300,000 NFS/NAS system 
to its knees. Storage does matter.
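
One common way to keep repeated searches off the fileserver is to copy
the database files to local disk on each node and search that copy.
The sketch below is only an illustration of that idea in Python: the
function name stage_database, the prefix-based file selection and the
size/mtime freshness check are assumptions, and a real setup would
also need to handle several jobs staging on the same node at once.

```python
import os
import shutil


def stage_database(shared_dir, local_dir, prefix):
    """Copy database files from shared storage to local disk.

    Files are copied only if missing locally or if size/mtime differ.
    Returns the list of file names that were actually copied.
    """
    os.makedirs(local_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(shared_dir)):
        src = os.path.join(shared_dir, name)
        if not name.startswith(prefix) or not os.path.isfile(src):
            continue
        dst = os.path.join(local_dir, name)
        src_stat = os.stat(src)
        if os.path.exists(dst):
            dst_stat = os.stat(dst)
            same_size = dst_stat.st_size == src_stat.st_size
            not_older = int(dst_stat.st_mtime) >= int(src_stat.st_mtime)
            if same_size and not_older:
                continue  # local copy looks current, skip the network read
        tmp = dst + ".part"
        shutil.copy2(src, tmp)  # copy2 preserves the modification time
        os.replace(tmp, dst)    # rename so readers never see a partial file
        copied.append(name)
    return copied
```

Run at the start of a job, for example as stage_database("/nfs/blastdb",
"/scratch/blastdb", "nt.") with placeholder paths, this reads each file
over the network once per node rather than once per search.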

Blast performance also depends on how you tune your DRM (gridengine,
LSF, etc.) and how you adjust your workflow with respect to splitting
large databases, locally caching data on compute nodes, and so on.
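
To make the "splitting large databases" idea concrete, here is a
minimal Python sketch that streams a FASTA file into a fixed number of
pieces, always sending the next sequence to the piece that currently
holds the fewest residues. The function name split_fasta, the output
naming scheme and the greedy balancing rule are all assumptions for
the example; this is not how any particular parallel blast package
does it.

```python
def split_fasta(in_path, out_prefix, n_parts):
    """Split a FASTA file into n_parts files of roughly equal residue count.

    Streams the input, so the whole database never has to fit in memory.
    Returns (list of output paths, list of residue counts per piece).
    """
    paths = ["%s.%02d.fa" % (out_prefix, i) for i in range(n_parts)]
    outs = [open(p, "w") for p in paths]
    loads = [0] * n_parts
    current = None  # index of the piece receiving the current record
    try:
        with open(in_path) as src:
            for line in src:
                is_header = line.startswith(">")
                if is_header:
                    # New record: send it to the lightest piece so far.
                    current = loads.index(min(loads))
                if current is None:
                    continue  # ignore anything before the first header
                outs[current].write(line)
                if not is_header:
                    loads[current] += len(line.strip())
    finally:
        for out in outs:
            out.close()
    return paths, loads
```

Each piece would then be formatted with formatdb and searched as its
own job, with the results merged afterwards. Keep in mind that
E-values depend on database size, so hits from separate pieces are
only comparable if every job uses the same effective database size
(blastall has a -z option for this).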

What are you trying to benchmark for? Picking the right CPU? Some 
people on this list may have already done this. My personal preference 
is Intel Pentium III's right now because:

o P IV's are way too expensive
o P III's are dirt cheap
o There are a ton of dual-CPU motherboard options for the PIII allowing 
me flexible choices of system packaging and vendor
o Athlon / AMDs are super fast but your motherboard choices are
limited and you need to be really careful about cooling and ventilation


On Monday, September 16, 2002, at 07:24 PM, Romualdo Zayas Lagunas wrote:

> Hello everyone,
> I am part of a computational genomics team at CIFN-UNAM in Mexico.
> Currently, we are trying to purchase a cluster (Linux and 32 or 48
> dual nodes), but since we lack experience in the field we
> would like to perform some tests on some clusters first. Can you give
> me any pointers to URLs or any resources where I can download parallel
> blast or scripts that run blast in parallel (and its different command
> line options)?
> I will really appreciate any help you can give me.
>   Thanks a lot in advance
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> M.C. Romualdo Zayas Lagunas
>                   CIFN-UNAM
>         rzayas@cifn.unam.mx
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~