Sorry 'bout the late post, my messages were being returned...

There are several commercial parallel BLASTs out there:

1) Blackstone's PowerBLAST (part of PowerCloud)

(Sorry if this sounds too commercial, but I work for Blackstone.)

PowerBLAST uses data parallelization techniques that automate the splitting of the databases being queried into smaller chunks, which are then spread out over the cluster nodes' local disks for querying. Querying smaller datasets in this way speeds up the process a lot. PowerBLAST also automates the merging of BLAST results and uses disk caching and scheduling techniques to speed up future queries of the same datasets.

2) TurboGenomics' TurboBLAST (more of a grid-like BLAST than a cluster BLAST)

TurboBLAST is Java based and an extension of the Linda technology. It breaks up the database and query into slices, distributes them over the nodes in a cluster, and does the merge for you.

3) Paracel's BLAST Machine

Paracel actually got inside BLAST and parallelized the code. Other than SGI, they are the only folks I know of that have done this. They post impressive speedup numbers, and the statistics should be the same as for an unaltered BLAST query.

*******

In the words of Bill Pearson (author of FASTA), taken from a post to the beowulf list in response to why there are no MPI or PVM parallelized versions of BLAST:

I suspect that BLAST is not available for MPI/PVM because (1) it is too fast, and (2) there is not much demand for it. 95% of the time, BLAST is almost an in-memory grep (the other 5% of the time it is working on the things it is looking for). Sequence comparison is embarrassingly parallel, and very easily threaded. Distributing the sequence databases and collecting results has more overhead (there probably aren't many distributed grep programs either).

FASTA is 5-10X slower than BLAST, and Smith-Waterman is another 5-20X slower than FASTA.
Here, the communications overhead is low, and distributed systems work OK for FASTA, and great for Smith-Waterman (where the overhead fraction is very small).

Of course, it is a lot easier to compile a threaded program, and just run it, than it is to install and configure the MPI or PVM environment and the programs to run in it. Bioinformatics software is often run by computer-savvy biologists, not high-performance computing folks, and not having to install and configure PVM/MPI is a big advantage. The NCBI probably does not make a PVM/MPI parallel BLAST because there is very little demand for it, and it does not meet their computational needs.

*********

Hope that helps.

Glen

--
Glen Otero, Ph.D.
Senior Life Science Consultant
Blackstone Computing
Phone:619.917.1772
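
For anyone who wants to see the split/search/merge idea from 1) and 2) in miniature, here is a rough Python sketch. It is not how PowerBLAST or TurboBLAST is implemented. Every name in it (split_database, search_chunk, parallel_search) is made up for illustration, the k-mer count is only a toy stand-in for a real BLAST run, and the threads merely stand in for cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_database(records, n_chunks):
    """Deal (name, sequence) records round-robin into n_chunks lists."""
    chunks = [[] for _ in range(n_chunks)]
    for i, record in enumerate(records):
        chunks[i % n_chunks].append(record)
    return chunks


def kmers(seq, k=3):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def search_chunk(query, chunk, k=3):
    """Toy stand-in for one BLAST run: score = number of shared k-mers."""
    query_kmers = kmers(query, k)
    hits = [(len(query_kmers & kmers(seq, k)), name) for name, seq in chunk]
    return [hit for hit in hits if hit[0] > 0]


def parallel_search(query, records, n_chunks=2, top=3):
    """Split the database, search every chunk concurrently, merge the hits."""
    chunks = split_database(records, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        per_chunk = list(pool.map(lambda c: search_chunk(query, c), chunks))
    merged = [hit for hits in per_chunk for hit in hits]
    # Merge step: best score first, ties broken by sequence name.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1]))[:top]


if __name__ == "__main__":
    database = [
        ("a", "ACGTACGT"),
        ("b", "TTTTTTTT"),
        ("c", "ACGTTTTT"),
        ("d", "GGGGACGT"),
    ]
    print(parallel_search("ACGTAC", database, n_chunks=2))
```

Because the toy score depends only on the query and the individual database sequence, the merged ranking is the same no matter how many chunks you use. A real BLAST merge also has to account for the fact that E-values depend on the size of the database searched, which this sketch ignores.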