Hi,

This is a bit of an unrelated question. I am a computer science major who
developed a multi-threaded version of clustalw 1.82. It runs on any
multi-CPU machine running a POSIX-compliant Unix. The source code can be
downloaded from http://bioinfo.pbi.nrc.ca/clustalw-smp/

It would not be difficult at all to make it cluster-based. A colleague of
mine did it within a few days for play and testing (not in releasable
form, though). However, to this day I haven't got a clue as to how widely
clustalw is used in the bio/bioinformatics community. Would it be of _any_
interest to make it cluster-based? If so, I just might decide to spend
that extra time and go that extra mile... :)

Thanks,
Ognen

On Mon, 2002-05-13 at 08:12, Eric Engelhard wrote:
> I have not yet seen the Wired article, but I think I was among those who
> plugged the bioclusters list. If this is the first of many general
> interest questions, then perhaps a short bioclusters FAQ and/or a
> "clustering BLAST at home mini-how-to" are in order. It is, however,
> important to point out that most hobbyist needs can be met by either
> using publicly available services or running free software on a single
> workstation (see http://bioinformatics.org/software/index.php3 and
> http://www.cvbig.org/tools/). Depending on your Linux skills and
> familiarity with bioinformatics, you may want to start with the O'Reilly
> book "Developing Bioinformatics Computer Skills" by Gibas and Jambeck.
> Clustering itself is a specialized subset of skills including hardware,
> system administration, programming and familiarity with biological
> goals.
>
> I am a biologist by training and have relatively little experience with
> high performance computing as compared with others on this list. That
> said, I've built out one small cluster at work (currently 15 nodes) and
> three tiny clusters (4-8 nodes) at mine and other people's homes. These
> were all of the "embarrassingly parallel" variety for batching NCBI
> BLAST and/or InterPro. The home systems turned out to be decent hobbyist
> tools and are as simple as they come:
>
> private 100 Mb network
> Master node: two NICs, NFS, DHCP, NCBI BLAST, Perl wrappers
> Slaves: open to rexec, NCBI BLAST, Perl wrappers
>
> The reference databases are equally divided among the nodes, but the
> queries and results are stored on the master node (either flat or in a
> database). Each node runs an independent instance of BLAST against each
> query and parses locally to keep net traffic down (your needs may vary).
>
> I like using PostgreSQL as a relational database management system, but
> many hobbyists will be satisfied with flat files. I've also used Apache
> with PHP and Samba to allow for browser access and Windows file system
> access, respectively, but prefer custom Perl scripts and the command
> line for batching and parsing my own projects.
>
> --
> Eric Engelhard - www.cvbig.org - www.sagresdiscovery.com
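
For anyone wanting to try the "reference databases are equally divided
among the nodes" step described above, here is a minimal sketch in Python.
It is not Eric's Perl wrappers, just one possible way to do it. It assumes
the database is a FASTA file, and the function names (read_fasta,
split_database) are made up for illustration. It balances shards by total
residue count rather than by number of sequences, which is my own
assumption about what "equally" should mean.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


def split_database(records: Iterable[Record], n_nodes: int) -> List[List[Record]]:
    """Split records into n_nodes shards with roughly equal residue totals.

    Greedy heuristic (an assumption, not taken from the original post):
    take sequences longest first and give each one to the shard that is
    currently the lightest.
    """
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    shards: List[List[Record]] = [[] for _ in range(n_nodes)]
    loads = [0] * n_nodes
    for header, seq in sorted(records, key=lambda r: len(r[1]), reverse=True):
        target = loads.index(min(loads))
        shards[target].append((header, seq))
        loads[target] += len(seq)
    return shards


def write_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Format records back into FASTA text, wrapping sequence lines."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"
```

Each shard would then be written out with write_fasta, copied to its node,
and indexed there for BLAST (with formatdb, for the NCBI BLAST of this
era). The master only has to send queries out and collect the parsed hits.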