> xpscreens wrote:
>
> Hi, I just read about this in Wired and I am quite interested. Could
> you share any information to get a newbie started on setting up a
> cluster for playing around with this, or give me a link to something
> useful? Thanks

I have not yet seen the Wired article, but I think I was among those who plugged the bioclusters list. If this is the first of many general interest questions, then perhaps a short bioclusters FAQ and/or a "clustering BLAST at home mini-how-to" are in order.

It is, however, important to point out that most hobbyist needs can be met by either using publicly available services or running free software on a single workstation (see http://bioinformatics.org/software/index.php3 and http://www.cvbig.org/tools/). Depending on your Linux skills and familiarity with bioinformatics, you may want to start with the O'Reilly book "Developing Bioinformatics Computer Skills" by Gibas and Jambeck. Clustering itself requires a specialized set of skills: hardware, system administration, programming, and familiarity with the biological goals.

I am a biologist by training and have relatively little experience with high performance computing compared with others on this list. That said, I've built out one small cluster at work (currently 15 nodes) and three tiny clusters (4-8 nodes) at my home and other people's homes. These were all of the "embarrassingly parallel" variety for batching NCBI BLAST and/or InterPro. The home systems turned out to be decent hobbyist tools and are as simple as they come:

Network: private, 100 Mb/s
Master node: two NICs, NFS, DHCP, NCBI BLAST, Perl wrappers
Slaves: open to rexec, NCBI BLAST, Perl wrappers

The reference databases are divided equally among the nodes, but the queries and results are stored on the master node (either as flat files or in a database). Each node runs an independent instance of BLAST against each query and parses the output locally to keep network traffic down (your needs may vary).

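
To make the "divided equally among the nodes" step concrete, here is a minimal Python sketch of one way to split a FASTA reference database into per-node shards. It is only an illustration, not the Perl wrappers mentioned above: the names parse_fasta and partition_records are hypothetical, and balancing shards by total residue count (rather than by number of records) is my assumption about what "equally" should mean.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# A FASTA record as (header without the leading '>', sequence).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def partition_records(records: Iterable[Record], n_nodes: int) -> List[List[Record]]:
    """Split records into n_nodes shards of roughly equal residue count.

    Greedy heuristic (an assumption, not the original method): take the
    longest sequences first and always add to the currently smallest shard.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    shards: List[List[Record]] = [[] for _ in range(n_nodes)]
    # Min-heap of (residues assigned so far, shard index).
    heap = [(0, i) for i in range(n_nodes)]
    heapq.heapify(heap)
    for record in sorted(records, key=lambda r: len(r[1]), reverse=True):
        size, index = heapq.heappop(heap)
        shards[index].append(record)
        heapq.heappush(heap, (size + len(record[1]), index))
    return shards


def format_fasta(records: Iterable[Record]) -> str:
    """Render records back to FASTA text, one sequence line per record."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)
```

Each shard would then be written to its own file, formatted as a BLAST database, and copied to one node. One reason to balance by residues rather than by record count is that the E-values each node reports reflect the size of its own shard, not the whole database, so similar shard sizes keep results from different nodes roughly comparable.
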
I like using PostgreSQL as a relational database management system, but many hobbyists will be satisfied with flat files. I've also used Apache with PHP and Samba to allow for browser access and Windows file system access, respectively, but prefer custom Perl scripts and the command line for batching and parsing my own projects.

--
Eric Engelhard - www.cvbig.org - www.sagresdiscovery.com