[Bioclusters] MPI clustalw

James Cuff bioclusters@bioinformatics.org
Thu, 6 May 2004 21:30:49 -0400


On May 7, 2004, at 11:36 AM, Michael Cariaso wrote:

> After seeing this abstract
>   http://bioinformatics.oupjournals.org/cgi/content/abstract/20/7/1193
> about a newly released MPI version of clustalw, coming out of the
> University of Western Australia,
> I was wondering if anyone had any experience to compare it to this
> version
>   http://bioinformatics.oupjournals.org/cgi/content/abstract/19/12/1585?ijkey=12ACOyqyrKpFz&keytype=ref
> coming from Singapore?

So here's my two pence on this.

I never really got the whole MPI/PVM speedup deal for compute farms.
There, I said it, I'm sorry.  :-)

IMHO it takes away the flexibility that comes from the embarrassingly
parallel nature of clusters.  With any of these approaches, you make
hardware failure a critical problem for your environment: lose one node
and the whole message-passing job can go with it, not just one task.

However, and it's a real big however - you can turn around results
really, really fast with these setups.  They are great for providing
rapid, sub-second turnaround where you need it for single jobs: for
example, web servers and services where you need a rapid response for
a single job or a small number of jobs.

Other than that, if you are pipelining huge numbers of jobs you are
still better off running 1,000 copies of clustal/blast/hmmer than
passing messages about the place.  See Amdahl for further details, keep
q close to 0.95... etc. etc.  Even on smaller clusters you still win:
I like 8 CPUs to work as 8, not 7.5...
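
To put rough numbers on the Amdahl remark, here is a minimal Python
sketch. It assumes that q above means the fraction of a job that can run
in parallel (that reading is an assumption), and the function name is
made up for illustration. It shows how many CPUs' worth of work one
message-passing job gets out of 8 CPUs for a few values of q.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Amdahl's law: speedup of a single job split across n_cpus."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


# q is assumed here to be the parallelisable fraction of the job.
for q in (0.90, 0.95, 0.99):
    s = amdahl_speedup(q, 8)
    print(f"q={q:.2f}: 8 CPUs behave like {s:.2f} (efficiency {s / 8:.0%})")
```

Eight independent serial jobs share no serial fraction between them, so
their throughput scales with the number of CPUs, ignoring I/O and
scheduling overhead.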

Anyway, both these articles are great, and I'm sure the code works just
fine.  I used an early version of the SGI parallel clustal back at the
EBI, and it was splendid.  Again, it was for a web server, so it was
ideal.

I'd like to see how this stuff runs on a gigabit Ethernet interconnect,
but again it is not the bandwidth, it is always the latency that kills
you...
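
The latency point can be sketched with the usual first-order cost model,
time = latency + size / bandwidth. The figures below are illustrative
assumptions, not measurements of any real interconnect, and the function
name is hypothetical.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """First-order cost of one message: fixed latency plus payload time."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


# Assumed, illustrative figures only (not measured):
LATENCY_S = 100e-6   # 100 microseconds per message
BANDWIDTH = 125e6    # 1 Gbit/s expressed in bytes per second

for size in (100, 10_000, 1_000_000):
    total = transfer_time(size, LATENCY_S, BANDWIDTH)
    share = LATENCY_S / total
    print(f"{size:>9} bytes: {total * 1e6:9.1f} us, latency is {share:.0%} of it")
```

Under this model, a job that exchanges many small messages pays the
fixed per-message cost every time, so a fatter pipe helps it very little.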

Guy/Tim - did you ever deploy that HMMer PVM cluster we talked about  
for the Pfam web site?

j.

--
James Cuff, D. Phil.
Group Leader, Applied Production Systems
The Broad Institute. 320 Charles Street, Cambridge,
MA. 02141-2023.  Tel: 617-252-1925  Fax: 617-258-0903