[Bioclusters] Parallel Sequence Alignment tool
lli at ufl.edu
Thu Aug 13 21:12:59 EDT 2009
You may want to try this: http://www.biotech.ufl.edu/people/sun/esprit.html
Sent from Li's iPhone
On Aug 13, 2009, at 8:30 PM, "Paulo Nuin" <nuin at genedrift.org> wrote:
> Just my two cents. Aligning rRNA is not a straightforward process and
> it shouldn't be attempted fully automatically. Muscle, MAFFT and other
> fast algorithms will generate very low-quality alignments if run
> blindly. Based on the number of sequences you have, and their nature,
> you would be OK wrapping a script around ClustalW or ClustalW-MPI.
> A good protocol to align rRNA is as follows:
> - align two sequences
> - add a third sequence to it by using the first two as a profile
> - add a fourth sequence using the first three as a profile
> - add a fifth sequence ...
> - at some point you will have a good enough profile that would allow
> you to use the aligned sequences as a model to the ones added to the
> The reason is that rRNA has a secondary (and tertiary) structure that
> contains stems and loops. Stems are short segments that are somewhat
> "duplicated" along the flat sequence and attach to each other when
> forming the secondary structure. These connections sometimes don't
> follow the usual A-T(U), C-G pairing. Because of the stems, there is a
> pattern in the primary structure that has to be followed to generate a
> good (but not excellent) alignment.
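To make the stem idea concrete, here is a minimal Python sketch that checks whether two segments could pair up as a stem. The function name and the set of allowed pairs are assumptions made for illustration: it accepts the usual A-U and C-G pairs plus the G-U wobble pair as one example of a non-standard connection, and real rRNA stems can contain others.

```python
# Pairs treated as stem-compatible in this sketch: Watson-Crick pairs plus
# the G-U wobble pair. This set is an illustrative assumption, not a
# complete model of rRNA base pairing.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def can_form_stem(left: str, right: str) -> bool:
    """Return True if `left` can pair with `right` read backwards.

    The two strands of a stem run antiparallel, so the first base of
    `left` is compared with the last base of `right`, and so on.
    DNA input (T) is converted to RNA (U) first.
    """
    left = left.upper().replace("T", "U")
    right = right.upper().replace("T", "U")
    if len(left) != len(right):
        return False
    return all((a, b) in PAIRS for a, b in zip(left, reversed(right)))


if __name__ == "__main__":
    print(can_form_stem("GGCA", "UGCC"))  # only standard pairs
    print(can_form_stem("GGCA", "UGUU"))  # relies on G-U wobble pairs
    print(can_form_stem("GGCA", "AAAA"))  # no valid pairing
```

A sequence-only aligner never sees this constraint, which is one way to read the point above about blind alignments being poor.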
> I guess an rRNA alignment program would be too slow for your
> requirements, but I guess by using ClustalW-MPI and some sequences as a
> profile you would get a fairly good alignment in maybe a couple of
> Hope that helps
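The incremental protocol quoted above (align two sequences, then add one sequence at a time against the growing profile) could be scripted roughly as follows. This is only a sketch under stated assumptions: the executable name `clustalw2`, the file names, and the helper `build_commands` are hypothetical, and the options used (`-INFILE`, `-ALIGN`, `-PROFILE1`, `-PROFILE2`, `-SEQUENCES`, `-OUTFILE`) should be checked against the help output of your ClustalW build. The script only builds the command lines; it does not run anything.

```python
from pathlib import Path
from typing import List


def build_commands(seed_fasta: str, extra_fastas: List[str],
                   workdir: str = "work",
                   clustalw: str = "clustalw2") -> List[List[str]]:
    """Build ClustalW command lines for a step-by-step profile alignment.

    seed_fasta   -- FASTA file holding the first two sequences (assumed name)
    extra_fastas -- one FASTA file per sequence to add, in the order to add
    workdir      -- directory for the intermediate profiles (assumed name)
    """
    work = Path(workdir)
    profile = work / "profile_0.aln"

    # Step 1: ordinary alignment of the two seed sequences.
    commands = [[clustalw, f"-INFILE={seed_fasta}", "-ALIGN",
                 f"-OUTFILE={profile}"]]

    # Steps 2..n: add one sequence to the current profile each time.
    for i, fasta in enumerate(extra_fastas, start=1):
        new_profile = work / f"profile_{i}.aln"
        commands.append([clustalw, f"-PROFILE1={profile}",
                         f"-PROFILE2={fasta}", "-SEQUENCES",
                         f"-OUTFILE={new_profile}"])
        profile = new_profile
    return commands


if __name__ == "__main__":
    for cmd in build_commands("seed.fasta", ["seq3.fasta", "seq4.fasta"]):
        print(" ".join(cmd))
```

To actually run the steps, each command could be passed in order to `subprocess.run(cmd, check=True)`, ideally with a look at the intermediate alignment before adding the next sequence.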
> On 30-Jul-09, at 12:19 PM, Nick Holway wrote:
>> Steve actually posted this on behalf of me, so to cut out the middle
>> man I'll answer.
>> I'm trying to assist a scientist with a bioinformatics project. He's
>> trying to align 16S rDNA sequences to identify the bacterial species.
>> I launched a Muscle job on his behalf which took ~5.5 days to run (on
>> 3GHz "Harpertown" Xeons). The file the scientist gave me had ~5000
>> sequences in it, mostly 1000-1500 bases long.
>> I'm trying to persuade the scientist to see if he can reduce the
>> number of sequences that he needs to align and also to see if he
>> needs to let Muscle run to completion rather than just the first two
>> My reason for wanting to know if there are any good parallel sequence
>> alignment tools is that we've seen some excellent speed increases
>> our MD code. Knowing this scientist I imagine he'll need the entire
>> data set to be aligned :)
>> If you need me to find out any more information from the scientist
>> please let me know.
>> 2009/7/22 Juan Carlos Perin <bic at genome.chop.edu>:
>>> Are you looking to align short reads from ngs, or other data?
>>> ~ juan
>>> On Jul 17, 2009, at 10:41, <slitster at rcn.com> wrote:
>>>> Does anyone have recommendations for a parallel sequence alignment
>>>> User investigation so far has turned up ClustalW-MPI, but it seems
>>>> to be using an older version of ClustalW.
>>>> Any input much appreciated.
>>>> Bioclusters maillist - Bioclusters at bioinformatics.org