Hi Rick,

Without knowing more details I can't say for sure whether you would be well served by spending $40K on a biocluster. Where you spend your money depends largely on your priorities and what you want to do. For instance, you may get good results by taking some of that budget and piling lots more RAM into your existing Sun boxes.

So take this reply with the grain of salt it deserves...

$40K will get you a nice, flexible Linux-on-commodity-hardware 'biocluster'. The actual CPU count will largely depend on your packaging: whitebox mini-tower cases give you great price/performance, but they take up tons of space and can be a hassle to wire and maintain. You will pay more for bladed and rackmount systems but will gain floorspace and manageability.

When I do this stuff professionally I tell people that a good budgetary guideline for cluster building blocks is roughly $1000 per CPU without a high-speed interconnect subsystem (i.e. a dual-CPU 1U rackmount server will generally cost about $2000 - $2800 depending on how it is kitted out). Nodes will cost more if they come from 'name' vendors like IBM or HP. You will need to pad on extra money for 'head nodes' and switches/fileserver/cables/disks etc. if necessary.

If you are willing to take on some software work within your group, you would get the most flexibility by purchasing just a bare-bones cluster or compute farm configuration from one of the many companies that specialize in integrated cluster systems. Few of them really specialize in the life sciences (or they will charge you lots of $$ to ship a ready-to-rock biocluster), so you will likely be better off getting just the system plus a hardware support contract from the vendor and then installing the load management layer and Blast/HMMER/etc. on your own. That way you can spend your budget on getting the most hardware you can afford.
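To make the per-CPU guideline a bit more concrete, here is a minimal Python sketch of the arithmetic. The function name `estimate_nodes` and its parameters are hypothetical, made up for illustration. The $10,000 overhead figure is an assumption that simply adds up the head node, switch, rack and misc items from the rough budget later in this message, and real quotes will differ.

```python
def estimate_nodes(budget, node_cost, overhead, cpus_per_node=2):
    """Return (nodes, cpus) affordable once fixed overhead is set aside.

    All inputs are in dollars except cpus_per_node. This is only a
    back-of-the-envelope estimate, not a vendor quote.
    """
    if node_cost <= 0:
        raise ValueError("node_cost must be positive")
    available = budget - overhead
    if available < 0:
        return 0, 0
    nodes = available // node_cost  # whole nodes only
    return nodes, nodes * cpus_per_node


# Assumed overhead: head node + switch + rack + misc cabling/KVM.
OVERHEAD = 6000 + 1000 + 1000 + 2000

low_end = estimate_nodes(40000, 2000, OVERHEAD)   # cheapest dual-CPU 1U nodes
high_end = estimate_nodes(40000, 2800, OVERHEAD)  # better-equipped nodes
print(low_end, high_end)
```

Under these assumptions the same $40K buys 15 nodes (30 CPUs) at $2000 per node but only 10 nodes (20 CPUs) at $2800 per node, which is why how the nodes are kitted out matters so much on a fixed budget.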
Many of the existing cluster hardware companies have been selling into the life science market for a while, so you have a good chance of finding salesfolks who will actually understand what you mean when you say 'biocluster' or 'blast farm'.

Since you have a limited budget I'd recommend the freely available Sun GridEngine suite for handling load, batch scheduling and remote job execution. There are several hardcore SGE users on this list by now, and the SGE-users mailing list is active and a great place to get support.

There are two cluster vendors that I personally like and can recommend: Microway in Massachusetts (www.microway.com) and Rackable Systems in California (www.rackable.com). If you talk to either of them, tell them that Chris from bioteam.net says hello :)

I'm also using Dell hardware for a current project at a local university and have had good experiences with the PowerEdge 1550, 1650 and 6450 servers, as well as their PowerConnect line of switches, which are incredible (I can get a Dell switch for $900 that has more functionality than what I used to pay Cisco $3500 for...just amazing).

You can probably get a nice full Dell-branded setup (servers + switches + racks + service contract) from Dell at your current academic pricing as long as you are willing to do a bit of rack-and-stack work onsite. Some universities prefer to go that route for both pricing and IT support reasons. You can justify spending more money on your hardware if it means not upsetting your IT group and ensuring that they will take responsibility for the care and feeding of your systems.

RLX systems are nice, although my hands-on knowledge of them is almost a year and a half out of date by now. You need to understand that any bladed system that uses laptop drives mounted on the blade is going to give you slower IO performance, which will generally not be optimal for things like sequence similarity searching.
Off the top of my head this is one way to spend $40K to get a 30-CPU compute farm -- this is a rough budget guide only and may not be accurate since prices change daily for the commodity stuff:

 o 15x dual-CPU 1U Pentium III rackmounts @ $2000 each (Total: $30,000)
 o 1 beefy "head" or "portal" node to run the cluster and provide NFS services: $6000
 o 24 or 48 port network switch with at least 2 gigabit ports (portal will use 1): $1000
 o Cluster rack: ~$1000
 o Misc cables, GBIC modules, power distribution, KVM, etc. etc.: $2000
 =====
 Total: $40,000

Regards,
Chris

Rick Westerman wrote:
> I've been reading the biocluster list for some time but since we
> have been, mainly, satisfied with our setup I have not jumped in. Now
> I have a question.
>
> Background: We have $40K of "end of year" money needing to be spent
> soon; a single 3700 sequencer pumping out ~200 sequences a day; and a
> pair of Sun E-450 (4 GB memory, 4 processor) servers providing
> GCG/Emboss, database and local Blast support when needed. Many of our
> current Blast searches are batched to NCBI but occasionally we run
> searches against non-standard datasets. Such processing can take over
> the Sun computers for a couple of days.
>
> What off-the-shelf biocluster would you recommend? I would prefer
> not to build a system from scratch and would also prefer not to spend
> too much time installing and maintaining Linux and Blast itself
> although "rolling our own" on the software end is more feasible than
> the hardware end. We would also want to run HMMer and perhaps some
> other data-intensive software on the cluster.
>
> I have looked at RLX, RackSaver, and the Paracel offerings. Any
> other leads?
>
> Thank you for any advice,
>
> -- Rick
>

--
Chris Dagdigian, <dag@sonsorol.org>
Life Science IT & Research Computing
Office: 617-666-6454, Mobile: 617-877-5498, Fax: 425-699-0193
Web: http://bioteam.net
PGP KeyID: 83D4310E Yahoo IM: craffi