[Bioclusters] questions with mpiBLAST

Jason D. Gans bioclusters@bioinformatics.org
Wed, 07 May 2003 17:27:11 -0600


Jeremy Mann wrote:
> 
> I have several questions about mpiBLAST.
> 
> One, since mpiformatdb splits up each database according to the -N value,
> why does each node have several of these in its local storage?

How many nodes are you running mpiblast on? Note that splitting up a
database via mpiformatdb -N X ... produces X + 1 database fragments. If
you are running on fewer than X + 2 nodes (X + 1 workers and 1 master),
then at least one node will have to process more than one database
fragment.
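
As a rough illustration of that arithmetic, here is a small Python sketch.
It is not part of mpiBLAST; the function name is made up, and it assumes
the fragments are spread as evenly as possible over the workers, which
mpiblast may or may not do in practice.

```python
import math

def max_fragments_per_worker(n_value, total_nodes):
    """Largest number of fragments any single worker has to process.

    Hypothetical helper, assuming an even spread of fragments.
    """
    fragments = n_value + 1    # mpiformatdb -N X produces X + 1 fragments
    workers = total_nodes - 1  # one node is the master
    if workers < 1:
        raise ValueError("need at least one worker besides the master")
    return math.ceil(fragments / workers)
```

For example, with -N 7 there are 8 fragments: on 9 nodes each worker gets
one fragment, while on 5 nodes each of the 4 workers gets two.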
 
> Two, why the constant transferring of the database segments to individual
> nodes? In my tests, even if I copy every segment to each node, mpiBLAST
> still copies database segments.

I was under the impression that mpiblast first checks whether the
correct database fragment is already present in the local storage
directory. If you run mpiblast with the --debug flag, it should tell you
which fragments are being staged out to the worker nodes.
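
To make the idea concrete, a check of that kind might look like the
Python sketch below. This is only my guess at the logic, not mpiblast's
actual code; the function name and the file names passed to it are
hypothetical.

```python
import os

def fragments_to_stage(fragment_files, local_dir):
    """Return the fragment files that are not yet in local_dir.

    Hypothetical helper: only these files would need to be copied.
    """
    return [name for name in fragment_files
            if not os.path.exists(os.path.join(local_dir, name))]
```

If every fragment file is already in the local storage directory, the
returned list is empty and nothing should need to be copied.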

> Three, one and two mess things up when you have multiple users trying to
> use the system. Obviously the segments get copied to local storage owned
> by THAT user. If another user runs mpiblast, I get permission denied
> errors because it tries to copy over the existing user's database segments.

Restrict each node to run only one blast job at a time (via PBS, Condor,
etc.). Multiple blast jobs running on the same node will compete for
memory and CPU resources and remove the superlinear scaling of mpiblast.

Regards,

Jason Gans