[Bioclusters] Parallel Sequence Alignment tool
Antony P Joseph
antony at panathara.org
Sat Aug 1 13:29:33 EDT 2009
Did you try the -profile option in MUSCLE as a divide-and-conquer
strategy on the data, assuming that you are not able to find a
parallelized version of MUSCLE?
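Split the input so that:
number of files = number of CPUs
number of sequences in each file = 5000 / number of CPUs
Align each file separately, then merge the resulting alignments with -profile.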
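
The split described above could be sketched roughly as follows. This is only an illustration, not a tested pipeline: the file names (part0.fa, merged1.afa, ...) and the helper functions are made up for the example, and the muscle flags (-in, -out, -profile, -in1, -in2) are the ones documented for MUSCLE 3.x, so check them against your installed version. The script only builds the command lines; running them (one alignment job per CPU, then the merges in order) is left to your scheduler.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_records(records: List[Record], n_cpus: int) -> List[List[Record]]:
    """Deal records round-robin into at most n_cpus non-empty parts."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    parts = [records[i::n_cpus] for i in range(n_cpus)]
    return [part for part in parts if part]


def write_fasta(records: List[Record], path: str) -> None:
    """Write records to path, one sequence per line."""
    with open(path, "w") as handle:
        for header, seq in records:
            handle.write(f">{header}\n{seq}\n")


def muscle_commands(n_parts: int, prefix: str = "part"):
    """Build (alignment commands, merge commands) as argument lists.

    Assumption: MUSCLE 3.x flags. The alignment commands are independent
    and can run in parallel; the merge commands must run in order.
    """
    align = [
        ["muscle", "-in", f"{prefix}{i}.fa", "-out", f"{prefix}{i}.afa"]
        for i in range(n_parts)
    ]
    merge = []
    current = f"{prefix}0.afa"
    for i in range(1, n_parts):
        out = f"merged{i}.afa"  # hypothetical name for the running merge
        merge.append(["muscle", "-profile", "-in1", current,
                      "-in2", f"{prefix}{i}.afa", "-out", out])
        current = out
    return align, merge
```

With 5000 sequences and 8 CPUs this gives 8 files of 625 sequences each. Note that -profile keeps the columns of each input alignment fixed, so the merged result may differ from (and could be less accurate than) a single MUSCLE run over all sequences; it is worth comparing the two on a small subset first.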
Nick Holway wrote:
> Steve actually posted this on behalf of me, so to cut out the middle
> man I'll answer.
> I'm trying to assist a scientist with a bioinformatics project. He's
> trying to align 16S rDNA sequences to identify the bacterial species.
> I launched a Muscle job on his behalf which took ~5.5 days to run (on
> 3GHz "Harpertown" Xeons). The file the scientist gave me had ~5000
> sequences, most of which were 1000-1500 bases long.
> I'm trying to persuade the scientist to see if he can reduce the
> number of sequences that he needs to align and also to see if his data
> needs to let Muscle run to completion rather than just the first two
> iterations.
> My reason for wanting to know if there are any good parallel sequence
> alignment tools is that we've seen some excellent speed increases with
> our MD code. Knowing this scientist I imagine he'll need the entire
> data set to be aligned :)
> If you need me to find out any more information from the scientist
> please let me know.
> 2009/7/22 Juan Carlos Perin <bic at genome.chop.edu>:
>> Are you looking to align short reads from ngs, or other data?
>> ~ juan
>> On Jul 17, 2009, at 10:41, <slitster at rcn.com> wrote:
>>> Does anyone have recommendations for a parallel sequence alignment tool?
>>> User investigation so far has turned up clustalW-MPI, but it seems to be
>>> using an older version of clustalW.
>>> Any input much appreciated.
>>> Bioclusters maillist - Bioclusters at bioinformatics.org