[Bioclusters] Parallel Sequence Alignment tool

Mon Aug 31 10:18:45 EDT 2009

Ognen Duzlevski wrote:
> It would have been nice to be aware of this paper when I parallelized 
> ClustalW back in 2001/2. The program is not that complicated to 
> parallelize - it is your basic search for what takes up most time and 

As I remember, Haruna and Dmitri were building something called 
HT-Clustal around that time, using shared memory rather than a 
cluster-ized approach.  We had previously done the cluster-ized BLAST 
(CT-BLAST, aka SGI GenomeCluster).

> how much that portion lends itself to being parallelized. I don't 
> remember that well - it has been a while - but if I remember correctly 
> ClustalW had three phases - 1st one was sequence-to-sequence alignment 
> which was very easy to parallelize, the second phase was irrelevant 
> time-wise to consider and the third one was where significant time was 
> spent but it was more difficult to parallelize...

The HT-Clustal paper and this paper detail the steps needed to 
parallelize it.  Shared memory makes it somewhat easier (much less 
development effort), but in those days, multiple gigabytes of shared 
memory were still pretty expensive.
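
For anyone following along, here is a rough sketch of why the first 
phase quoted above is the easy one: every sequence pair can be scored 
independently, so the work splits into tasks with no communication 
between them.  This is not ClustalW's actual code.  The names 
nw_score, score_pair and pairwise_scores, and the match/mismatch/gap 
values, are made up for illustration, and the scoring is a plain 
Needleman-Wunsch global score with a linear gap penalty.

```python
from itertools import combinations


def nw_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score (Needleman-Wunsch, linear gap penalty).

    Keeps only two rows of the DP table, since only the score is needed.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * gap]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def score_pair(task):
    """Score one (i, j, seq_i, seq_j) task; depends on no other task."""
    i, j, a, b = task
    return i, j, nw_score(a, b)


def pairwise_scores(seqs, mapper=map):
    """Score all N*(N-1)/2 pairs.

    `mapper` is any map-like callable: the builtin `map` runs serially,
    while an executor's `map` method farms the tasks out to workers.
    """
    tasks = [(i, j, seqs[i], seqs[j])
             for i, j in combinations(range(len(seqs)), 2)]
    return {(i, j): s for i, j, s in mapper(score_pair, tasks)}
```

Called as pairwise_scores(seqs) it runs serially.  Passing the map 
method of a concurrent.futures.ProcessPoolExecutor as mapper spreads 
the same tasks over local cores, and an MPI or cluster version would 
hand out the same task list across nodes.  The third phase has no such 
clean split, which fits the point above that it is the hard part.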

Someone did an MPI-Clustal as well in 2003.  Anyone using that?

-- 
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics Inc.
email: landman at scalableinformatics.com
web  : http://scalableinformatics.com
        http://scalableinformatics.com/jackrabbit
phone: +1 734 786 8423 x121
fax  : +1 866 888 3112
cell : +1 734 612 4615