[Bioclusters] mpiBLAST configuration issues

Mon, 29 Mar 2004 14:39:09 +0100

-----Original Message-----
From: Micha Bayer [mailto:michab@dcs.gla.ac.uk]
Sent: 29 March 2004 13:54
To: bioclusters@bioinformatics.org
Subject: Re: [Bioclusters] mpiBLAST configuration issues

Hi Lucas,

do you mean I should treat the cluster as though it only consists of 6
nodes (because we have the three dual processor nodes reserved for short
jobs)? But that would not make use of the other nodes when they are
free?

I think -np 6 means that mpirun creates 6 processes for your
application.

If PBS treats my multi-sequence query file as a single job, this means
that on our cluster the job will go to a single node as soon as there is
one available and then the job will run there. What happens if this
really is the only node available? Will mpiBLAST then run everything on
this single node sequentially?

I do not think so. Your PBS server will wait until enough nodes are free
to run your application. Try requesting the resources you need with a
'#PBS -l' directive in your PBS job script.
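
As a rough illustration of the '-l' resource request, here is a small Python sketch that writes out the text of a job script. Everything specific in it is an assumption for the example, not something taken from this cluster: the helper name pbs_job_script, the 'nodes=3:ppn=2' layout (three dual-processor nodes), the walltime value, the MPICH-style '-machinefile $PBS_NODEFILE' option, and the database and file names passed to mpiblast.

```python
def pbs_job_script(nodes, ppn, walltime, command):
    """Return the text of a PBS job script (illustrative sketch only)."""
    # One MPI process per requested CPU.
    n_procs = nodes * ppn
    lines = [
        "#!/bin/sh",
        # Resource requests: PBS holds the job until these are available.
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        f"#PBS -l walltime={walltime}",
        # Run from the directory the job was submitted from.
        "cd $PBS_O_WORKDIR",
        # Assumption: an MPICH-style mpirun that accepts -machinefile.
        f"mpirun -np {n_procs} -machinefile $PBS_NODEFILE {command}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical database and file names, for illustration only.
script = pbs_job_script(
    nodes=3,
    ppn=2,
    walltime="00:59:00",
    command="mpiblast -p blastn -d mydb -i queries.fasta -o results.txt",
)
print(script)
```

The resulting text would be saved to a file and submitted with qsub. Because the request asks for all 6 CPUs at once, the job should stay queued until they are free rather than start on a single node.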

Richard 

cheers

Micha

On Mon, 2004-03-29 at 13:37, Lucas Carey wrote:
> On Monday, March 29, 2004 at 12:58 +0100, Micha Bayer wrote:
> > Hi,
> > We have three nodes reserved for jobs of less than one hour's wall time.
> > I am part of the bio group and we have a share of 20% of the total
> > compute time on this cluster. Jobs get submitted and queued via the
> > OpenPBS batch system. The queue priority is worked out by a formula
> > which among other things takes into account recent usage (if you had
> > lots of jobs recently you get penalised) and job size (if your job is
> > small it gets a higher priority).
> > 
> > Questions:
> > 
> > 1. How many database fragments should I generate?
> You should generate 5 fragments, and always run with '-np 6'. If you want
> instead to run with a variable number of CPUs (<= 6) creating 15 fragments
> should give you the ability to do so with good load-balancing. There is a
> small performance hit moving from 5->15 fragments, but 15 could be faster
> depending on both the database and queries.
> > 
> > 2. How will the spasmodic traffic on the cluster affect the performance
> > of mpiBLAST? 
> Once the fragments are distributed to the nodes it shouldn't matter at
> all. If you keep running queries against the same database(s) and the
> fragments remain on local storage on those 3 nodes, mpiBLAST does very
> little communication.
> > 
> > 3. How are jobs partitioned for queuing with PBS (given an input file
> > with one sequence and a different scenario where the input file contains
> > multiple query sequences)?
> One 'run' of mpiBLAST will process an entire query file with multiple
> individual queries. PBS views this as a single job, no matter how many
> individual queries the file contains.
> > 
> > 4. When I issue the mpirun command and I specify the number of nodes to
> > be used, what does that do? Will this actually work on a cluster like
> > this where I don't have any control over the scheduling process?
> In the documentation a node refers to a CPU. As far as both mpiBLAST and
> PBS are concerned, your cluster has 6 nodes reserved for short jobs.
> 
> -Lucas
> _______________________________________________
> Bioclusters maillist  -  Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
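
On the fragment counts in Lucas's answer to question 1, a small Python sketch may help show why 15 fragments balance well across a variable number of CPUs. It assumes that one of the '-np' processes only coordinates and does not search (which would be consistent with the "5 fragments, -np 6" advice), and it uses an idealised even split rather than mpiBLAST's actual scheduling. The function name fragments_per_worker is made up for this example.

```python
def fragments_per_worker(n_fragments, n_procs):
    """Idealised count of database fragments searched by each worker.

    Assumption: one of the n_procs MPI processes only coordinates,
    leaving n_procs - 1 workers that actually search fragments.
    """
    n_workers = n_procs - 1
    if n_workers < 1:
        raise ValueError("need at least 2 processes (1 coordinator + 1 worker)")
    base, extra = divmod(n_fragments, n_workers)
    # The first `extra` workers take one fragment more than the rest.
    return [base + 1 if w < extra else base for w in range(n_workers)]


# Compare 5 and 15 fragments for every process count from 2 to 6.
for n_frags in (5, 15):
    for n_procs in range(2, 7):
        counts = fragments_per_worker(n_frags, n_procs)
        print(f"{n_frags:2d} fragments, -np {n_procs}: {counts}")
```

Under these assumptions, 5 fragments only split evenly with 1 or 5 workers, whereas 15 fragments split evenly with 1, 3 or 5 workers and differ by at most one fragment with 2 or 4.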
