[Bioclusters] RE: non linear scale-up issues?

Jeffrey B. Layton bioclusters@bioinformatics.org
Mon, 10 May 2004 21:01:15 -0400


David Gayler wrote:

>Thanks for the feedback,
>
>I am aware of United Devices. They seem to have a very good solution and it
>is multi-platform too. 
>

Is it? I thought the "client" that runs the actual application was
Windows only (Grid MP Workstation). I think the management
software can run on other OS's, but I'm not sure (it's been a year
or so since I last talked to them).

>I am interested in any pain points you have with
>their or similar technology? Are there features or functionality you would
>like to see that are missing, specifically geared towards bioinformatics
>research?
>
>If I understand you, you are saying that this is a good example of where
>grid technology is delivering on the promise. It is pretty clear that when
>the problem is 'embarrassingly parallel', enterprise grid (cycle stealing)
>solutions (like UD's) can be an intelligent way to use one's current IT
>infrastructure investment. I guess what I am looking for is what could be
>made better.
>
>Also, what do you mean by standards folks? Are you talking about Globus
>Toolkit and its ever-evolving architecture (now mainly web-service based)?
>What do you see as the threat with this?
> 
>IMHO, MPI is just fine for clusters with good links where security is of
>little or no concern; however, it really isn't made for loosely coupled
>networks and cycle-stealing scenarios, and it was never designed with
>security in mind. These are the kinds of things that, if not baked into the
>technology, can turn your Grid nodes into a security risk and ultimately a
>bunch of zombies waiting to be used for a DDoS attack or worse. Once that
>happens, trying to trust or even keep your Grid could be a tough political
>battle.
>

   Well, the idea of using "cycle stealing" with our MPI
codes works pretty well. We're behind a nice, nasty firewall
and all of the machines are inside it. Also, if someone
sniffs the network traffic they won't get much - at least
not enough to put everything together. The only thing I'm
interested in testing is the effect of router hops on the MPI
latency and the impact on the run time of the code. I'm
willing to accept some degradation in speed to get access to
the "free" machines on the network, and I'm willing to accept
some scaling problems. However, I'm hoping the impact isn't
too severe.
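   For the latency test I'm just thinking of a simple two-rank
ping-pong between a pair of machines on opposite sides of a
router hop. A minimal sketch (the file name, message size, and
iteration count below are just guesses - tune to taste):

/* pingpong.c - rough point-to-point latency check.
 * Build: mpicc pingpong.c -o pingpong
 * Run:   mpirun -np 2 ... with one rank placed on each host you
 * want to compare (same switch vs. across a router hop). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 10000;
    char buf[8] = {0};       /* tiny message: measures latency, not bandwidth */
    int rank, i;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    t0 = MPI_Wtime();
    for (i = 0; i < iters; i++) {
        if (rank == 0) {
            /* rank 0 sends, then waits for the echo */
            MPI_Send(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            /* rank 1 echoes everything straight back */
            MPI_Recv(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof buf, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("one-way latency ~ %.1f usec\n",
               (t1 - t0) / (2.0 * iters) * 1e6);

    MPI_Finalize();
    return 0;
}

Running it once within a switch and once across the router(s)
should give a rough idea of what each hop costs before I commit
a real job to the "free" machines.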
   Anyway, I think UD's software is pretty neat. There are some
competitors out there such as Cycles@Work from MPI Software
Technologies.

Good Luck!

Jeff

>
>
>
>
>On Sun, 9 May 2004, Rayson Ho wrote:
>
>  
>
>>UD: http://www.grid.org/stats/
>>
>>325,033 years of CPU time collected
>>    
>>
>
>he he :-)  Rayson knows his stuff, as do United Devices.  You will not see
>MPI anywhere near this 300k+ CPU years.  Good point.
>
>  
>
>>BOINC: http://boinc.berkeley.edu
>>    
>>
>
>This looks great.  Classic 'grid hype' this certainly is not.  Good stuff.
>Thanks for sending on the link; I really hope that the standards folk keep
>the hell away from this.  If they do, it may have a real chance...
>
>Best regards,
>
>J.
>
>--
>James Cuff, D. Phil.
>Group Leader, Applied Production Systems
>The Broad Institute. 320 Charles Street, Cambridge,
>MA. 02141-2023.  Tel: 617-252-1925  Fax: 617-258-0903
>
>-----Original Message-----
>From: bioclusters-admin@bioinformatics.org
>[mailto:bioclusters-admin@bioinformatics.org] On Behalf Of
>bioclusters-request@bioinformatics.org
>Sent: Monday, May 10, 2004 11:01 AM
>To: bioclusters@bioinformatics.org
>Subject: Bioclusters digest, Vol 1 #483 - 4 msgs
>
>
>Message: 1
>From: "David Gayler" <dag_project@sbcglobal.net>
>To: <bioclusters@bioinformatics.org>
>Date: Sun, 9 May 2004 12:22:59 -0500
>Subject: [Bioclusters] non linear scale-up issues?
>Reply-To: bioclusters@bioinformatics.org
>
>Hi,
>I am a newbie and not a bio guy, but rather a computer science guy.
>I am interested in why you are seeing these kinds of horrible non-linear
>scale-out numbers. Please excuse my ignorance, but is the problem that there
>are dependencies between jobs and this doesn't scale? What exactly is the
>relationship of this problem to MPI? I just have to wonder if figuring out a
>better way to divide and conquer has any merit. I am interested in y'all's
>feedback, as my company is working on a Windows .NET-based Grid solution and
>we want to focus on the bioinformatics community. It seems to me that a lot
>of researchers spend time worrying about getting faster results
>(understandably); however, it doesn't seem like there is much in the way of
>cycle-stealing grid software solutions that are flexible, secure, and easy
>to use. I want to know what is currently missing to get these faster results
>reliably, despite hardware faults, etc.
>
>I am aware of Condor (free), DataSynapse, Platform Computing, and others. I
>am interested in knowing what, if anything, is lacking in these solutions.
> 
>Thanks in advance.
>
>-----Original Message-----
>From: bioclusters-admin@bioinformatics.org
>[mailto:bioclusters-admin@bioinformatics.org] On Behalf Of
>bioclusters-request@bioinformatics.org
>Sent: Sunday, May 09, 2004 11:01 AM
>To: bioclusters@bioinformatics.org
>Subject: Bioclusters digest, Vol 1 #482 - 1 msg
>
>
>Message: 1
>Date: Sun, 9 May 2004 11:17:16 +0100 (BST)
>From: Guy Coates <gmpc@sanger.ac.uk>
>To: bioclusters@bioinformatics.org
>Subject: [Bioclusters] Re: MPI clustalw
>Reply-To: bioclusters@bioinformatics.org
>
>  
>
>>example web servers and services where you need rapid response for
>>single, or small numbers of jobs.
>>    
>>
>
>We (well, the ensembl-ites) do run a small amount of mpi-clustalw. The
>algorithm scales OK for small alignments (but those run quickly, so why
>bother?) but is horrible for large alignments.
>
>These are figures for an alignment of a set of 9658 sequences, running on
>dual 2.8 GHz PIV machines with gigabit Ethernet.
>
>NCPUs   Runtime (h:mm:ss)   Efficiency
>-----   -----------------   ----------
>    2        28:21:33          1.00
>    4        19:49:05          0.72
>    8        14:49:02          0.48
>   10        14:09:41          0.40
>   16        13:37:36          0.26
>   24        13:00:30          0.18
>   32        12:48:39          0.14
>   48        12:48:39          0.09
>   64        11:19:40          0.08
>   96        11:30:09          0.05
>  128        11:13:28          0.04
>
>Although the scaling is horrible, it does at least bring the runtime
>down to something more manageable. MPI clustalw only gets run for the
>alignments that the single-CPU version chokes on. It may not be pretty,
>but at least you do get an answer, eventually. Horses for courses and
>all that.
>
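   (Side note on the table above: the Efficiency column looks like
speedup relative to the 2-CPU run, i.e. efficiency = (2 x T_2) / (N x T_N).
For example, at 32 CPUs: (2 x 28.36 h) / (32 x 12.81 h) ~= 0.14, which
matches the table. By that reading, an efficiency of 0.04 on 128 CPUs
means paying roughly 25x the CPU time of the 2-way run to get the
answer about 2.5x sooner.)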
>
>  
>
>>Guy/Tim - did you ever deploy that HMMer PVM cluster we talked about
>>for the Pfam web site?
>>
>>    
>>
>
>It's on the ever-expanding list of things to do. So, does anyone here have
>any opinions/experience on the PVM version of HMMer?
>
>
>Guy
>  
>