-----Original Message-----
From: bioclusters-admin@bioinformatics.org [mailto:bioclusters-admin@bioinformatics.org] On Behalf Of bioclusters-request@bioinformatics.org
Sent: Friday, March 26, 2004 12:01 PM
To: bioclusters@bioinformatics.org
Subject: Bioclusters digest, Vol 1 #459 - 11 msgs

When replying, PLEASE edit your Subject line so it is more specific than "Re: Bioclusters digest, Vol..." And, PLEASE delete any unrelated text from the body.

Today's Topics:

  1. Re: Best ways to tackle migration to dedicated cluster/Farm (Malay)
  2. Re: Any issues porting applications to OS X? (Malay)
  3. Second International Conference "Genomics, Proteomics and Bioinformatics for Medicine" (Poroikov V.V.)
  4. Re: Best ways to tackle migration to dedicated cluster/Farm (Chris Dwan (CCGB))
  5. Re: Any issues porting applications to OS X? (Chris Dwan (CCGB))
  6. FW: [Globus-discuss] CSF project on SourceForge (Christopher Smith)
  7. Re: Any issues porting applications to OS X? (Dr Malay K Basu)
  8. Re: Best ways to tackle migration to dedicated cluster/Farm (Ross Crowhurst)
  9. Re: Re: Best ways to tackle migration to dedicated cluster/Farm (Chris Dwan (CCGB))

--__--__--

Message: 1
Date: Thu, 25 Mar 2004 12:16:17 -0500
From: Malay <mbasu@mail.nih.gov>
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] Best ways to tackle migration to dedicated cluster/Farm
Reply-To: bioclusters@bioinformatics.org

Farul Mohd. Ghazali wrote:
> On Wed, 24 Mar 2004, Chris Dwan (CCGB) wrote:
>
>>> Also, our existing farm uses "node pull"...
>>
>> This is very cool.
>>
>> I've heard stories about folks with big clusters using a "pull" based
>> system like Condor to backfill empty cycles on their scheduled cluster.
>
> Excuse my ignorance, I've never used Condor but what's a pull based
> system? What would be the advantages over PBS/SGE (which I assume are push
> type systems)?
> TIA

The present BLAST queuing system at NCBI uses node pulling, with MSSQL as the database repository and a custom-made daemon on each node that pulls jobs from the database. Node pulling appears to be faster than using NFS; most of the time NFS becomes a bottleneck for distributing jobs. Node pulling works really well because the database is highly efficient at serving requests to a large number of nodes.

Malay
mbasu(at)ncbi.nlm.nih.gov

--__--__--

Message: 2
Date: Thu, 25 Mar 2004 12:32:36 -0500
From: Malay <mbasu@mail.nih.gov>
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] Any issues porting applications to OS X?
Reply-To: bioclusters@bioinformatics.org

Susan Chacko wrote:
> On Mar 24, 2004, at 8:05 PM, LAI Loong Fong wrote:
>
>> On the topic of low latency interconnects, how many of you are running
>> your clusters with such interconnects? Especially in the area of
>> bioinformatics. We are using Quadrics on one IA32 cluster and Myrinet
>> on IA32, IA64 and G5. With occasional MPI jobs that use these
>> interconnects, I do not see other jobs requiring it.

From what I have seen so far, expensive interconnects are not really necessary for a large portion of bioinformatics work. Gigabit on copper is more than enough for most types of work. We use gigabit on copper to connect execution nodes to the switch and gigabit on fiber to connect the NAS to the switch. It saves a lot of money and works great.

Malay
mbasu(at)ncbi.nlm.nih.gov

--__--__--

Message: 3
Date: Thu, 25 Mar 2004 19:28:02 +0300
From: "Poroikov V.V." <vvp@ibmh.msk.su>
Organization: Institute of Biomedical Chemistry Rus. Acad. Med. Sci.
To: bioclusters@bioinformatics.org
Subject: [Bioclusters] Second International Conference "Genomics, Proteomics and Bioinformatics for Medicine"
Reply-To: bioclusters@bioinformatics.org

Dear Webmaster,

Could I ask you to send the information about GPBM-2004 via your Bioinformatics Mailing List? Thank you in advance.

Yours sincerely,
Vladimir Poroikov, Prof. Dr.

-------------------------------------------------------------------
Dear Colleagues,

The Second International Conference "Genomics, Proteomics and Bioinformatics for Medicine" (GPBM-2004) will be held on July 14-19, 2004. The Conference will take place on board a comfortable vessel travelling along the Russian rivers and canals from Moscow to Ples and back. This mode of scientific meeting provides several advantages to the participants:

(1) A high scientific level and a multidisciplinary programme, due to the participation of eminent scientists as invited speakers;
(2) Oral talks by all participants whose abstracts are accepted;
(3) Participants and accompanying persons can combine science and tourism, visiting beautiful places of Russian nature and history (3-4 hour excursions every day);
(4) Participants have the possibility of communication and discussion 24 hours per day.

The deadline for abstract submission to the Second International Conference "Genomics, Proteomics and Bioinformatics for Medicine" is April 1, 2004. If you are interested in taking part in the Conference, please send your abstract and completed registration form to the Organizing Committee as soon as possible. This will give us the opportunity to send you confirmation of acceptance no later than April 15, 2004. Since the number of rooms on the ship is limited, early birds will get an advantage (the deadline for the reduced fees is May 1, 2004).
More details are available on the web site: http://www.ibmh.msk.su/gpbm2004/english.htm

Feel free to forward this information to any colleagues who might be interested in taking part in the GPBM-2004 conference. If you have any questions, please do not hesitate to contact me.

Looking forward to hearing from you soon.

Yours sincerely,
Vladimir Poroikov, Prof. Dr.
Chairman of the Local Committee, GPBM-2004
Deputy Director (Research)
Institute of Biomedical Chemistry of Rus. Acad. Med. Sci.
-------------------------------------------------------------------

--__--__--

Message: 4
Date: Thu, 25 Mar 2004 11:37:03 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan@mail.ahc.umn.edu>
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] Best ways to tackle migration to dedicated cluster/Farm
Reply-To: bioclusters@bioinformatics.org

> Excuse my ignorance, I've never used Condor but what's a pull based
> system? What would be the advantages over PBS/SGE (which I assume are push
> type systems)?

In a "push" system, a central scheduler assigns work to the compute nodes. Nodes sit idle until work is assigned to them. In a "pull" system, each node is responsible for deciding when to make itself available for work, for requesting an appropriate job, and for keeping track of the administrative details associated with the job.

In my experience, push systems (including all centrally scheduled clusters using PBS, SGE, NQE, LSF, and the like) are easier to debug, since you only have to deal with one scheduler. They are sometimes less efficient, since if the scheduler is too busy to assign work to a node (or doesn't know that the node exists, or is dumb about planning use of resources, or any of a number of other cases) then the node in question may go unused.

Pull systems (including SETI@home and friends, as well as Platform's grid offering) are best suited for cycle stealing and ad-hoc clustering.
They can be really troublesome to debug, since what gets lost in the case of errors is usually job state, instead of cycles on nodes. Instead of the cluster operating at less than peak efficiency, it loses jobs. This can be frustrating for both users and admins.

Condor (at least when I used it a year or so ago) is really a hybrid system. There is a central "matchmaker" which is responsible for pairing ClassAds from compute resources with jobs needing to be done. Condor's biggest strength is the thinking that Miron Livny and his group put into the social side of distributed computing. Condor has great facilities for managing the "owner comfort" problem: grid computing is great, but only if I get first crack at machines that I own, my competitors never get to use them, and I get to have my machine back the instant I move my mouse.

-Chris Dwan

--__--__--

Message: 5
Date: Thu, 25 Mar 2004 11:44:49 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan@mail.ahc.umn.edu>
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] Any issues porting applications to OS X?
Reply-To: bioclusters@bioinformatics.org

>> On the topic of low latency interconnects, how many of you are running
>> your clusters with such interconnects? Especially in the area of
>> bioinformatics. We are using Quadrics on one IA32 cluster and Myrinet
>> on IA32, IA64 and G5. With occasional MPI jobs that use these
>> interconnects, I do not see other jobs requiring it.
>
> From what I have seen so far, expensive interconnects are not really
> necessary for a large portion of bioinformatics work. Gigabit on copper
> is more than enough for most types of work. We use gigabit on copper for
> connecting execution nodes to the switch and gigabit on fiber for
> connecting the NAS to the switch. It saves lots of money and works great.

The question is one of latency.
If your data motion tends to come in large bursts (get the BLAST target, write the output) then a little delay at the beginning of each transmission won't hurt too much. Gigabit is fine. Heck, some poor university folks get by mostly with 100 base-T. :)

If, on the other hand, your process needs to communicate small amounts of data, frequently (this particle moved out of my cell in the simulation, someone else better pick it up and send me an "ack"), those little hits for setting up a transmission will really add up.

For the sequence based work with which I usually help out, I would almost always rather have the additional nodes. That opinion will change as soon as I have to deal with any truly parallel, message passing jobs.

-Chris Dwan

--__--__--

Message: 6
Date: Thu, 25 Mar 2004 09:49:28 -0800
From: Christopher Smith <csmith@platform.com>
To: <beowulf@beowulf.org>, <bioclusters@bioinformatics.org>
Subject: [Bioclusters] FW: [Globus-discuss] CSF project on SourceForge
Reply-To: bioclusters@bioinformatics.org

FYI ... might be of interest to those tying multiple clusters together.

-- Chris

------ Forwarded Message
From: Christopher Smith <csmith@platform.com>
Date: Wed, 24 Mar 2004 11:42:42 -0800
To: <discuss@globus.org>
Subject: [Globus-discuss] CSF project on SourceForge

I'd like to announce a new SourceForge site for the Community Scheduler Framework. The project description:

"The Community Scheduler Framework (CSF) is a set of Grid Services, implemented using the Globus Toolkit 3.x, which provides an environment for the development of metaschedulers that can dispatch jobs to resource managers such as LSF, SGE or PBS."

The project site is at http://sourceforge.net/projects/gcsf

Currently the site is a little thin on documentation and web pages, etc, but most importantly there is a developers' mailing list (gcsf-devel) and the current source code for CSF in the project CVS repository.
The source code is the same as that released by Platform last December/January, with the following changes:

- RM adapter code (factory and base classes) is now available in source form. None of Platform's proprietary code is included.
- An RM adapter which can send jobs to GT2 GRAM.

I'd encourage anybody who is interested in writing grid or meta-schedulers to take a look and get involved.

-- Chris

-
To Unsubscribe: send mail to majordomo@globus.org with
"unsubscribe discuss" in the body of the message
------ End of Forwarded Message

--__--__--

Message: 7
Subject: Re: [Bioclusters] Any issues porting applications to OS X?
From: Dr Malay K Basu <mbasu@mail.nih.gov>
To: bioclusters@bioinformatics.org
Date: Thu, 25 Mar 2004 14:10:04 -0500
Reply-To: bioclusters@bioinformatics.org

On Thu, 2004-03-25 at 12:44, Chris Dwan (CCGB) wrote:
> The question is one of latency. If your data motion tends to come in
> large bursts (get the BLAST target, write the output) then a little delay
> at the beginning of each transmission won't hurt too much. Gigabit is
> fine. Heck, some poor university folks get by mostly with 100 base-T. :)
>
> If, on the other hand, your process needs to communicate small amounts of
> data, frequently (this particle moved out of my cell in the simulation,
> someone else better pick it up and send me an "ack"), those little hits for
> setting up a transmission will really add up.
>
> For the sequence based work with which I usually help out, I would almost
> always rather have the additional nodes. That opinion will change as soon
> as I have to deal with any truly parallel, message passing jobs.

The question is whether it is really necessary. Sure, it will give you a higher position in the Top 500 list, but the price/performance ratio IMHO is not worth going in that direction. As someone already pointed out, an SGI Altix is cheaper than a home-brewed low-latency interconnect.
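[Editorial sketch: the latency argument quoted above can be made concrete with a rough model in which each transfer costs one setup latency plus payload-over-bandwidth time. The figures below are assumed, order-of-magnitude numbers (tens of microseconds of latency for gigabit Ethernet, a few microseconds for a Myrinet-class interconnect), not measurements.]

```python
def transfer_time(payload_bytes, latency_s, bandwidth_bps):
    # Rough model: one setup latency plus the payload over the wire.
    return latency_s + payload_bytes * 8 / bandwidth_bps

GIGE = dict(latency_s=50e-6, bandwidth_bps=1e9)  # assumed gigabit Ethernet
MYRI = dict(latency_s=5e-6, bandwidth_bps=2e9)   # assumed Myrinet-class link

# Bursty BLAST-style I/O: one 10 MB result write. Bandwidth dominates,
# so the low-latency interconnect only wins by the bandwidth ratio.
burst_gige = transfer_time(10_000_000, **GIGE)
burst_myri = transfer_time(10_000_000, **MYRI)

# Chatty MPI-style traffic: a million 100-byte messages. Setup latency
# dominates, and the low-latency interconnect wins by nearly 10x.
chatty_gige = 1_000_000 * transfer_time(100, **GIGE)
chatty_myri = 1_000_000 * transfer_time(100, **MYRI)
```

Under these assumed numbers the bursty transfer is about 80 ms either way, while the chatty workload takes roughly 51 seconds on gigabit versus 5 on the low-latency link: exactly the "little hits add up" effect described in message 5.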
For bioinformatics, copper/fiber-based networking is surely more than enough (for sequence analysis it goes without saying). Real-time analysis, of course, will benefit from such low-latency interconnects. And copper/fiber-based networking has come a long way.

Malay

--__--__--

Message: 8
Date: Fri, 26 Mar 2004 10:15:32 +1200
From: "Ross Crowhurst" <RCrowhurst@hortresearch.co.nz>
To: <bioclusters@bioinformatics.org>
Subject: [Bioclusters] Re: Best ways to tackle migration to dedicated cluster/Farm
Reply-To: bioclusters@bioinformatics.org

> From: "Chris Dwan (CCGB)" <cdwan@mail.ahc.umn.edu>
>
> Do you have a feeling for where this anti-cycle-stealing attitude comes
> from? Like Chris Dagdigian said, it sounds like you've got benchmarking
> well in hand, and there are a wide variety of examples of folks using
> production workstations to augment their dedicated clusters. If the
> concerns are reliability, security, performance impact, or other technical
> things, those can be worked with numbers and tests.

Chris

Our system actually uses dual-OS desktops, so it's really an ad-hoc cluster/farm. The primary OS is Windows XP, which is used during normal working hours but is not actually used within our bioinformatics pipeline, so we are not actually cycle-stealing, much as I would like to do that. When staff leave at night they simply reboot their desktops (Linux - RH8.0 is the default OS on these machines), so the staff member happily goes home and their machine boots to Linux, synchronises itself and joins our pipeline. So it's not really cycle stealing, just utilisation outside normal hours. Apologies if I did not make this clear initially. The key issues are therefore OS management issues - TCO (the need for a second hard disk, time to install the second OS, how to roll out updates for security patches, etc).
> If, on the other hand, it's concern over trying something new, the Engine
> system recently implemented at Novartis is a decent example of a
> corporation gaining a great deal of horsepower this way.

Can you provide more details on this, or point me in the right direction to get more info, please?

> I will almost never get in the way of someone who wants to go to a
> dedicated cluster system if they need it to get their work done, and
> they have the money to spend. A well thought out, centralized resource
> will almost always be easier (and cheaper) to administer than a cycle
> stealing solution.

Agreed.

> Pull systems (including SETI@home and friends, as well as Platform's grid
> offering) are best suited for cycle stealing and ad-hoc clustering. They
> can be really troublesome to debug, since what gets lost in the case of
> errors is usually job state, instead of cycles on nodes. Instead
> of the cluster operating at less than peak efficiency, it loses jobs.
> This can be frustrating for both users and admins.

Lost jobs are not really an issue for us. We have processes running to reap jobs that have not completed and reset their availability status. Most of our runs are of durations such that jobs lost on nodes that "leave" the pipeline can be reaped and reassigned without holding up the overall pipeline unduly. I have reapers that check the online status of nodes hourly (this could be done more frequently) and reset their status where the node has left the farm. Additionally, there are reapers that run based on the expected completion times for different job types and on the number of jobs left to do in that batch (the time between checks decreases as the number of jobs decreases), so we tend to catch lost jobs. Efficiency in a pull system does, however, decrease at the end of a batch run, where chunks of jobs are grabbed by the nodes and you have no control over which nodes grab the jobs.
If the last few chunks are grabbed by the slowest nodes, the pipeline will have to wait for these to complete before it moves on. To get around this we run multiple pipelines, so the nodes are pretty much processing all the time. We only really have a "single user" in our pipeline, which is the automated pipeline control process itself. I appreciate your comments about the frustrations for users and admins with "pull" approaches where you have multiple users - they definitely could experience delays.

______________________________________________________
The contents of this e-mail are privileged and/or confidential to the named recipient and are not to be used by any other person and/or organisation. If you have received this e-mail in error, please notify the sender and delete all material pertaining to this e-mail.
______________________________________________________

--__--__--

Message: 9
Date: Thu, 25 Mar 2004 16:39:08 -0600 (CST)
From: "Chris Dwan (CCGB)" <cdwan@mail.ahc.umn.edu>
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] Re: Best ways to tackle migration to dedicated cluster/Farm
Reply-To: bioclusters@bioinformatics.org

I like the dual boot solution too. There are a lot of very hairy issues involved in cycle stealing... especially if the machines crash more frequently with your software than without. Desktop support people notice things like that (as well they should!).

The TCO argument seems silly to me when based on hardware costs. Seems to me that you need to buy the disk anyway; it's whether or not you have to buy a computer to go with it. :) The real savings from moving to a dedicated cluster are on the admin / support end.

In terms of OS coherency, there are folks on this list with great experience in that area. The short form is that you're going to need to address the issue of pushing out updates to compute nodes whether they're dedicated to a cluster or not. The answer is to do it at boot time, from a boot image server.
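[Editorial sketch: the reaper processes Ross describes in message 8 follow a simple pattern - reset any running job whose node has left the farm, or whose expected completion time has passed, so another node can pull it. The job records and field names below are hypothetical, for illustration only.]

```python
def reap_lost_jobs(jobs, live_nodes, now):
    """Return the ids of jobs reset to 'pending' because their node
    left the farm or their expected completion deadline passed."""
    reaped = []
    for job in jobs:
        if job["state"] != "running":
            continue  # only running jobs can be lost
        if job["owner"] not in live_nodes or now > job["deadline"]:
            job["state"] = "pending"  # another node can now claim it
            job["owner"] = None
            reaped.append(job["id"])
    return reaped
```

Run periodically (hourly, or more often as the batch drains, as Ross describes), this keeps a pull-based farm from stalling on jobs held by nodes that rebooted back to their desktop OS.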
> > If, on the other hand, it's concern over trying something new, the
> > Engine system recently implemented at Novartis is a decent example of
> > a corporation gaining a great deal of horsepower this way.
>
> Can you provide more details on this or point me in the right direction
> to get more info please?

United Devices (http://www.ud.com) has a commercial implementation of cycle stealing software a la SETI@home. Their page has a number of good links, including a PDF describing a major install at Novartis. Basically, Novartis has already plugged 2,700 workstations into their UD grid, and they plan to include 62,000 more as they finish standardizing every single workstation across all their sites. It's a cool project by any measure, but (of course) the devil is in the details.

-Chris Dwan

--__--__--

_______________________________________________
Bioclusters maillist - Bioclusters@bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters

End of Bioclusters Digest