Check out Rocks (http://www.rocksclusters.org). IMHO it is much better
than FAI and SIS. It also includes SGE.

On Jan 20, 2005, at 3:47 PM, Speakman, John H./Epidemiology-Biostatistics wrote:

> Hello
>
> If anyone can review the below and suggest a way to go, or even better
> point out something I have gotten completely wrong, it would be much
> appreciated!
>
> Thanks
> John
>
> Hardware:
>
> Ten HP Proliant nodes, one DL380 and nine DL140. Each node has two
> 3.2 GHz Xeon processors. They do not have a dedicated switch; the
> infrastructure folks say they want to implement this using a VLAN. We
> have some performance concerns here but have agreed to give it a try.
>
> User characteristics:
>
> The users are biostatisticians who typically program in R; they often
> use plug-in R modules like Bioconductor. They always want the newest
> version of R right away. They may also write programs in C or
> Fortran. Data files are usually small. Nothing fancy like BLAST,
> etc.
>
> User concerns:
>
> Users require a Linux clustering environment which enables them to
> interact with the cluster as though it were a single system (via ssh
> or X) but which will distribute compute-intensive jobs across nodes.
> As the code is by and large not multithreaded, it is expected that
> each job will be farmed out to an idle compute node and probably stay
> there until it is done. That’s fine. In other words, to use all
> twenty CPUs we will need twenty concurrent jobs.
>
> Administration concerns:
>
> The cluster must require the absolute minimum of configuration and
> maintenance, because I’ve got to do it and I’m hardly ever around
> these days.
>
> Other concerns:
>
> Users and administrators alike have a preference for Debian Linux over
> other distributions. Users also have an aversion to non-free
> software. Either or both of these considerations could be overridden
> if the reasons were pressing.
>
> Cluster software requirements:
>
> (1) The cluster must have a means of deploying Linux to the nodes
> and keeping their configurations (including updates to the operating
> system and applications, lists of users, printers, etc.) in
> synchronization.
> (2) The cluster must have a means of transparently distributing
> jobs to idle CPUs. It’s not necessary to actively rebalance this
> once a job has started; it’s okay if, once tied to a node, it stays
> there.
>
> Potential solutions:
>
> We like the look of NPACI Rocks but its non-Debian-ness makes it a
> last resort only. What we would really like to try is a Debian
> version of NPACI Rocks; in its absence we will probably have to use
> two separate packages to fulfil requirements #1 and #2 above.
>
> Sensible options for #1 seem to be:
>
> (1) SystemImager (www.systemimager.org)
> (2) FAI (http://www.informatik.uni-koeln.de/fai/), maybe also
> involving the use of cfengine2 (http://www.iu.hio.no/cfengine/)
>
> SystemImager is the better-established product and looks to be simpler
> to set up than FAI and/or cfengine2, in both of which the learning
> curve looks steep. However, FAI seems more elegant and more like the
> idea of “NPACI Rocks Debian” that we’re looking for, implying that
> once set up FAI/cfengine2 will require less ongoing maintenance.
>
> Sensible options for #2 seem to be:
>
> (1) OpenMosix
> (2) OpenPBS
> (3) Sun GridEngine N1
>
> Note: all of the above have commercial versions; we’d be reluctant to
> consider them unless it means big savings in administration time and
> effort. We get the impression OpenMosix (and, to a lesser extent,
> OpenPBS) have question marks over how much time and resources the
> people maintaining these products have, suggesting bugs, instability
> and not keeping up with kernel/library updates, etc. Sun GridEngine
> seems more robust but does not seem to have a big Debian user base.
>
> What do you all think we should try first?
>
> Thanks!
> John
>
> John Speakman
> Manager, Clinical Research Systems
> Memorial Sloan-Kettering Cancer Center
> 307 East 63rd Street, New York NY 10021 USA
> +1 646 735 8187 - SpeakmaJ at mskcc.org

Glen Otero Ph.D.
Linux Prophet
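
For illustration, the placement behavior asked for in requirement (2) of the quoted message (each job goes to an idle CPU and stays there until it finishes) can be sketched as a small simulation. This is only a sketch under stated assumptions: the slot names, the function names `build_slots` and `schedule`, and the job durations are all hypothetical, and it does not use or model the actual API of SGE, OpenPBS, or OpenMosix. It just shows why ten dual-CPU nodes need twenty concurrent single-threaded jobs to be fully busy.

```python
import heapq


def build_slots(nodes=10, cpus_per_node=2):
    """Return one slot per CPU, e.g. 'node03/cpu1' (names are hypothetical)."""
    return [
        f"node{n:02d}/cpu{c}"
        for n in range(1, nodes + 1)
        for c in range(cpus_per_node)
    ]


def schedule(job_durations, slots):
    """Place each job on the first slot to become idle; never migrate it.

    Returns (placements, makespan), where placements is a list of
    (job_index, slot, start_time, end_time) tuples.
    """
    # Min-heap of (time the slot becomes free, slot order, slot name).
    # Every slot is idle at time 0.
    free_at = [(0.0, i, s) for i, s in enumerate(slots)]
    heapq.heapify(free_at)

    placements = []
    for job, duration in enumerate(job_durations):
        start, order, slot = heapq.heappop(free_at)   # earliest idle slot
        end = start + duration
        placements.append((job, slot, start, end))
        heapq.heappush(free_at, (end, order, slot))   # busy until 'end'

    makespan = max((p[3] for p in placements), default=0.0)
    return placements, makespan


if __name__ == "__main__":
    slots = build_slots()  # 10 nodes x 2 CPUs, as in the quoted hardware list
    for n_jobs in (10, 20, 21):
        placements, makespan = schedule([1.0] * n_jobs, slots)
        used = len({slot for _, slot, _, _ in placements})
        print(f"{n_jobs} jobs: {used} of {len(slots)} slots used, makespan {makespan}")
```

With equal-length jobs, fewer than twenty jobs leave slots idle, exactly twenty fill the cluster, and a twenty-first has to wait for a slot to free up. A real batch scheduler adds queues, priorities, and resource limits on top of this basic idea.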