I agree with jfreeman that this HOWTO is a good place to start, but you
may not want to bother with Red Hat 5.2. I set a personal speed record
building out a small (8 node) cluster last week using Red Hat 7.2. I used
the Kickstart GUI to configure boot disks for the slave nodes. This is an
embarrassingly parallel BLAST cluster (NFS, postgres, NCBI blastall,
rexec/rsh, and perl).

Performance hint: What you really want with this kind of cluster is
enough local RAM on each node, relative to the size of its share of the
reference database, to prevent disk I/O churning. If you can run a whole
batch with only an initial read, then the next bottleneck will be the
CPU/bus speed, which is a fairly high bar. I haven't challenged the
performance on this little cluster, but my work cluster (18 nodes, 2GB
RAM/node) cuts through >1500 queries/minute against nr. In addition to
BLAST, this type of system is also ideal for standalone InterPro.

I split the reference databases with this (baby-perl freebie :-) ):

#!/usr/bin/perl
#
# refdb_splitter.pl - Splits a ref fasta db into $N gzipped chunks for
#                     distribution to cluster
#
# Usage: zcat ref_fasta_db(.Z or .gz) | refdb_splitter.pl
#
use strict;
use warnings;

my $N = 8;      # your number of nodes here (or node number itself if you
                # want to run an iteration of this script on each node...
                # parallelize the splitter)
my $fasta = ""; # record currently being accumulated
my $split = 1;  # chunk that receives the next completed record

# Append the accumulated record to the current chunk, then move on to the
# next chunk, wrapping around after chunk $N.
sub flush_record {
    return if $fasta eq "";
    if ($split > $N) { $split = 1; }  # or test "$split % $N == 0" for running at each node
    open(my $pipe, "|-", "gzip >>ref_fasta_db_$split.gz")
        or die "Cannot start gzip: $!";
    print $pipe $fasta;
    close($pipe) or die "gzip failed for chunk $split";
    $fasta = "";
    $split++;
}

while (my $line = <STDIN>) {
    flush_record() if $line =~ /^>/;  # a new header closes the previous record
    $fasta .= $line;
}
flush_record();  # the last record has no following header, so write it here

-- 
Eric Engelhard - www.cvbig.org - www.sagresdiscovery.com

jfreeman wrote:
>
> Start Here...
> http://www.beowulf-underground.org/doc_project/BIAA-HOWTO/Beowulf-Installation-and-Administration-HOWTO-5.html
>
> Once you have a small 2 node master/slave cluster running with the slave
> node running starting through tftpboot you are ready for the next level
> of complexity...
>
> Danny Navarro wrote:
> >
> > Hi all,
> >
> > I would like to set up a linux cluster with some pcs to run blast
> > searches against EST human database. First I will try to blast locally
> > in the master node but I would like also to make a blast server
> > available to the intranet.
> >
> > I have to learn a lot about linux clusters but now I don't know exactly
> > how to start to do this, shall I use beowulf or mosix or there are other
> > better alternatives? What do you think is the best system for doing that
> > task?
> >
> > Thanks
> >
> > _______________________________________________
> > Bioclusters maillist - Bioclusters@bioinformatics.org
> > http://bioinformatics.org/mailman/listinfo/bioclusters