Hi,

I am new here, and a computer science person rather than a biologist. I am interested in why you are seeing this kind of horrible non-linear scale-out. Please excuse my ignorance, but is the problem that there are dependencies between jobs, so the work does not scale? What exactly is the relationship of this problem to MPI? I have to wonder whether finding a better way to divide and conquer has any merit.

I am interested in your feedback because my company is working on a Windows .NET based grid solution and we want to focus on the bioinformatics community. It seems to me that a lot of researchers (understandably) spend time worrying about getting faster results, yet there does not seem to be much cycle-stealing grid software that is flexible, secure, and easy to use. I want to know what is currently missing for getting these faster results reliably, despite hardware faults, etc.

I am aware of Condor (free), DataSynapse, Platform Computing, and others. I would like to know what, if anything, is lacking in these solutions.

Thanks in advance.

-----Original Message-----
From: bioclusters-admin@bioinformatics.org [mailto:bioclusters-admin@bioinformatics.org] On Behalf Of bioclusters-request@bioinformatics.org
Sent: Sunday, May 09, 2004 11:01 AM
To: bioclusters@bioinformatics.org
Subject: Bioclusters digest, Vol 1 #482 - 1 msg

Message: 1
Date: Sun, 9 May 2004 11:17:16 +0100 (BST)
From: Guy Coates <gmpc@sanger.ac.uk>
To: bioclusters@bioinformatics.org
Subject: [Bioclusters] Re: MPI clustalw
Reply-To: bioclusters@bioinformatics.org

> example web servers and services where you need rapid response for
> single, or small numbers of jobs.

We (well, the ensembl-ites) do run a small amount of mpi-clustalw.
The algorithm scales OK for small alignments (but they run quickly, so why bother?) but is horrible for large alignments. These are figures for an alignment of a set of 9658 sequences, running on dual 2.8GHz PIV machines with gigabit.

Ncpus  Runtime   Efficiency
-----  --------  ----------
    2  28:21:33  1
    4  19:49:05  0.72
    8  14:49:02  0.48
   10  14:09:41  0.4
   16  13:37:36  0.26
   24  13:00:30  0.18
   32  12:48:39  0.14
   48  12:48:39  0.09
   64  11:19:40  0.08
   96  11:30:09  0.05
  128  11:13:28  0.04

However, although the scaling is horrible, it does at least bring the runtime down to something more manageable. MPI clustalw only gets run for the alignments that the single CPU version chokes on. It may not be pretty, but at least you do get an answer, eventually. Horses for courses and all that.

>
> Guy/Tim - did you ever deploy that HMMer PVM cluster we talked about
> for the Pfam web site?
>

It's on the ever-expanding list of things to do. So, does anyone here have any opinions/experience on the PVM version of HMMer?

Guy

-- 
Guy Coates, Informatics System Group
The Wellcome Trust Sanger Institute, Hinxton, Cambridge, CB10 1SA, UK
Tel: +44 (0)1223 834244 ex 7199
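
For anyone who wants to check the Efficiency column in Guy's figures: the numbers look consistent with parallel efficiency measured against the 2-CPU run, i.e. (T_2 * 2) / (T_n * n). That formula is my assumption, not something stated in the message, but it reproduces the posted values. A minimal Python sketch (the names RUNTIMES, to_seconds and efficiency are just illustrative):

```python
# Runtimes (hh:mm:ss) copied from the figures in the message, keyed by CPU count.
RUNTIMES = {
    2: "28:21:33", 4: "19:49:05", 8: "14:49:02", 10: "14:09:41",
    16: "13:37:36", 24: "13:00:30", 32: "12:48:39", 48: "12:48:39",
    64: "11:19:40", 96: "11:30:09", 128: "11:13:28",
}


def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def efficiency(ncpus, baseline_cpus=2):
    """Parallel efficiency relative to the baseline run.

    Assumption: efficiency = (T_base * baseline_cpus) / (T_n * ncpus),
    with the 2-CPU run as the baseline (its efficiency is 1 by definition).
    """
    base = to_seconds(RUNTIMES[baseline_cpus])
    current = to_seconds(RUNTIMES[ncpus])
    return (base * baseline_cpus) / (current * ncpus)


if __name__ == "__main__":
    for n in sorted(RUNTIMES):
        print(f"{n:5d}  {RUNTIMES[n]}  {efficiency(n):.2f}")
```

Read this way, going from 2 to 128 CPUs cuts the wall-clock time by only about a factor of 2.5, which is why the efficiency drops to 0.04.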