[Bioclusters] Re: Recommendations for building a cluster

Sat Oct 20 18:02:45 EDT 2007

On Oct 19, 2007, at 4:23 PM, Ahmed Moustafa wrote:

> Thank you so much Chris!
>
> By "genome-wide", it is more like performing the same task (e.g.  
> blast, clustalw and phyml) for every single gene in a genome. But  
> because a single task could include 200+ sequences to align or to  
> build a phylogeny, it takes a significant amount of time and memory  
> to process a single gene. So by using, for example, mpiblast and  
> raxmlmpi, it could be possible to distribute these tasks over a  
> cluster and finish an analysis in a reasonable time.
>
> I have googled and found these options: Dell PowerEdge SC1435
> (http://www.dell.com/content/products/productdetails.aspx/pedge_sc1435),
> Sun Fire X2100 M2 (http://www.sun.com/servers/entry/x2100/) and Mac
> Cluster (http://www.apple.com/science/solutions/workgroupcluster.html).
> Do you have any experience with any of these options?
>
> Thanks again!
>
> Ahmed

our cluster at ngdc (http://ngdc.noaa.gov/dmsp) runs on about 20 dell  
poweredge boxes.  we used rq

   http://codeforpeople.com/lib/ruby/rq/
   http://www.linuxjournal.com/article/7922

(which i am the author of) to run the cluster on redhat linux.  using  
stock redhat and rq you should be able to set up a 20-30 node cluster  
in under an hour.  using rq assumes you have an nfs box to store code  
and jobs on and that you can break your task up into discrete  
processes, but this is often the case.
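
as a rough illustration of "discrete processes" for a per-gene workload, here is a minimal python sketch that turns a directory of per-gene input files into one independent shell command per gene. it is not part of rq and does not call rq. the function name build_jobs, the script name align_one_gene.sh and the directory names are all hypothetical placeholders, and it assumes one fasta file per gene on the shared nfs volume.

```python
import shlex
from pathlib import Path


def build_jobs(gene_dir, out_dir, command_template, pattern="*.fasta"):
    """Return one shell command string per gene file.

    Each command reads a single input file and writes a single output
    file, so the commands are independent and can run on any node that
    mounts the shared directories.
    """
    jobs = []
    for gene_file in sorted(Path(gene_dir).glob(pattern)):
        # One output file per gene, named after the input file.
        result = Path(out_dir) / (gene_file.stem + ".out")
        jobs.append(
            command_template.format(
                input=shlex.quote(str(gene_file)),    # quote paths for the shell
                output=shlex.quote(str(result)),
            )
        )
    return jobs


# Hypothetical wrapper script; substitute whatever runs your per-gene step.
TEMPLATE = "./align_one_gene.sh {input} {output}"
```

calling build_jobs("/nfs/genes", "/nfs/results", TEMPLATE) returns a list of command strings, one per gene, which can then be handed to whatever queue you use, one job per line.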

anyhow - we're going on about 3 years of 24x7 processing on our  
cluster with zero issues other than an occasional smoked disk.

kind regards.

a @ http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being  
better. simply reflect on that.
h.h. the 14th dalai lama