[Bioclusters] Need some help with MrBayes and mpi

Fri Apr 8 15:05:10 EDT 2005

With respect to parallel or serial Mr. Bayes, I'd like to add that the 
parallelization may be somewhat different than what your users are 
accustomed to, unless they are acquainted with other stochastic methods 
in computational statistics.  The MPI version of the program runs 
multiple explorations of the problem's solution space on multiple 
nodes in your cluster, each with a slightly different tolerance for 
downhill moves, and each exchanging solutions with neighbors 
periodically (Metropolis-coupled MCMC, which is similar to temperature 
parallel simulated annealing).  Practically speaking, users will only 
sample output from the first chain (running on the first compute node), 
which gives a representation of the target posterior distribution if 
run long enough.  The parallel chains act as 'helpers' to accelerate 
convergence to the target stationary distribution(s), and do not 
necessarily lead to any kind of linear decrease in the overall runtime.  
Particularly if the target distribution is simple, adding unnecessary 
chains (more compute nodes) can end up having no effect on the time to 
an adequate solution, beyond the duration of the initial burn-in.
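
For anyone who would rather see the idea as code, here is a minimal 
sketch of Metropolis-coupled MCMC on a toy one-dimensional target with 
two modes.  It is not taken from the Mr. Bayes source: the target 
density, the heating formula 1/(1 + heat_step * i), the swap interval, 
and every name in it (log_target, chain_heats, mc3) are assumptions 
made up for illustration.  Everything runs serially in one process, 
whereas the MPI build would place the chains on separate nodes.

```python
import math
import random


def log_target(x):
    """Unnormalized log-density of a toy target with modes near -3 and +3."""
    a = -0.5 * (x + 3.0) ** 2
    b = -0.5 * (x - 3.0) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def chain_heats(n_chains, heat_step):
    """Illustrative heating scheme: chain 0 is 'cold' (heat 1.0)."""
    return [1.0 / (1.0 + heat_step * i) for i in range(n_chains)]


def accept(log_ratio, rng):
    """Metropolis rule: accept with probability min(1, exp(log_ratio))."""
    return log_ratio >= 0.0 or rng.random() < math.exp(log_ratio)


def mc3(n_steps, n_chains=4, heat_step=0.5, swap_every=10,
        step_size=1.0, seed=1):
    rng = random.Random(seed)
    heats = chain_heats(n_chains, heat_step)
    states = [-3.0] * n_chains      # every chain starts in the left mode
    samples = []                    # only the cold chain is recorded
    for step in range(1, n_steps + 1):
        # Ordinary Metropolis update within each chain.  A heated chain
        # flattens the target, so it tolerates downhill moves more readily.
        for i in range(n_chains):
            proposal = states[i] + rng.gauss(0.0, step_size)
            delta = log_target(proposal) - log_target(states[i])
            if accept(heats[i] * delta, rng):
                states[i] = proposal
        # Periodically propose exchanging states between two neighbors.
        if n_chains > 1 and step % swap_every == 0:
            j = rng.randrange(n_chains - 1)
            delta = log_target(states[j + 1]) - log_target(states[j])
            if accept((heats[j] - heats[j + 1]) * delta, rng):
                states[j], states[j + 1] = states[j + 1], states[j]
        samples.append(states[0])
    return samples


if __name__ == "__main__":
    draws = mc3(20000)
    right = sum(1 for x in draws if x > 0.0) / len(draws)
    print("fraction of cold-chain samples in the right-hand mode:", right)
```

Only states[0] is ever recorded; the heated chains exist solely to feed 
the cold chain states from regions it would be slow to reach on its 
own.  Each chain still has to take every step, which is why adding 
chains does not shorten the run in proportion to their number.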

Also to clarify, Mr. Bayes only does MCMC, so the parallelization would 
indeed affect all possible uses of the program.

Have you been able to run other parallel grid engine jobs without 
trouble?  Our only problem with getting MPI MrB running was with 
working out the kinks with parallel job queues on SGE.  After that, Mr. 
Bayes worked like a charm for us.

Cheers,
- Jason de Koning,
   University at Albany

On Apr 7, 2005, at 6:31 PM, Shaila Parashar wrote:

> Hi
>
> I have forwarded your email to our users to find out if they really 
> need mpi.
> Regarding the installation of mrbayes and mpich - I did both of them 
> from source.
> The reason we considered mpi was that we have a server with 24 
> processors and 96 GB of RAM.  The processors themselves have a speed 
> of 750 MHz. So, when the user ran a job, it was running on one 
> processor and was not as fast as we had expected. So it was suggested 
> to use mpi so that a number of processors could be used 
> simultaneously. Also we are using the Grid engine. In fact we will be 
> submitting the job to grid engine which will in turn use the mpich 
> parallel environment and run it on mpi.
>
> Any other ideas/suggestions would be greatly appreciated
>
> Shaila
>
>
> Chris Dagdigian wrote:
>
>>
>> First off find out if your users really need the MPI version of 
>> MrBayes -- some people ask for it just because it sounds cool yet 
>> they have no idea why or how they are going to benefit.
>>
>> From reading the documentation (I'm not a phylogeny expert) it 
>> appears that only the MCMC analysis stage can take advantage of MPI. 
>> If your users are not invoking MCMC in their MrBayes scripts then 
>> using MPI is pointless.
>>
>> Among the people I know of using MrBayes on clusters probably half 
>> are using cluster scheduling software like Grid Engine to simply run 
>> multiple instances of standalone non-MPI MrBayes across cluster 
>> nodes. The other 50% have found the MPI enabled version more useful 
>> for their particular workloads.
>>
>> It is unclear from your email if you compiled MrBayes yourself using 
>> the mpicc tools that are customized to your specific mpich-1.2.6 
>> installation. If you just dropped someone else's prebuilt MrBayes-MPI 
>> binary on one of your systems it is highly unlikely that it will work 
>> at all.
>>
>> -Chris
>>
>>
>>
>>
>>
>> Shaila Parashar wrote:
>>
>>> Hi
>>>
>>> I am not sure whether this is the right mailing list to address this 
>>> issue. Actually, I am a unix system admin, who has been requested to 
>>> install MrBayes on our cluster.
>>> We have 4 SUN machines running Solaris 9.  We have mpich-1.2.6 
>>> running, and it is used extensively. We are not using passwordless 
>>> ssh, but are instead using rsh. Students have not faced any 
>>> problems in using mpich.
>>> Now I have installed the mpi version of MrBayes on one of the 
>>> machines and then ran a program called sample.nex.
>>> It gave me the following error:
>>>
>>> p1_9962:  p4_error: interrupt SIGSEGV: 11
>>>
>>> I have not been able to find the reason and solution for that error.
>>> Any suggestions/ideas would be greatly appreciated. A lot of our 
>>> research work depends on getting this to work correctly.
>>>
>>> If this is not the right mailing list, can any of you please direct 
>>> me to the right mailing list to address this issue.
>>>
>>> Thanks in advance
>>>
>>> Shaila
>>>
>>>
>>
>
> -- 
> *****************************************************************
> Shaila Parashar			e-mail:shaila at engr.colostate.edu
> UNIX System Administrator	tel:- (970)-491-6555
> Engineering Network Services
> Colorado State University
> Fort Collins, CO 80523-1301
> ******************************************************************
> " Smile is a curve that sets things straight. "
>