[Bioclusters] mpiblast on the smallest possible cluster

Fernan Aguero bioclusters@bioinformatics.org
Wed, 28 Jan 2004 11:05:07 -0300


+----[ Joe Landman <landman@scalableinformatics.com> (28.Jan.2004 03:36):
|
| Hi Fernan

Hi Joe, and thanks for your reply.
 
| Fernan Aguero wrote:
| 
| >All the tests I've performed (from reading the docs) pass.
| >However, all the jobs run on the slave. The master just sits
| >down quietly waiting for the slave to finish with one
| >segment, before feeding the next. I suppose that this
| > 
| >
| 
| Are you using mpirun with the -nolocal option?  The -nolocal option 
| prevents running threads on the starting node.  Also, what does your 
| machines file look like?  Are you using a machines file or just "-np 2" ?
|
+----]

No, I'm not using the -nolocal option.

I am using a machines file. The machines file is in
/usr/local/mpich/share/machines.freebsd and has just two
entries, using FQDNs. Both machines are on the same network,
of course. Should I list the master as 'localhost' instead?
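
For reference, a two-entry machines file of this kind can be
sketched as follows. The hostnames are the ones that appear in
the cpi output further down; the order of the entries and the
local file name machines.sketch are assumptions, and the real
file's contents are not reproduced in this message.

```shell
# Sketch only: write a two-host machines file into the current
# directory (machines.sketch is a made-up name, not the real file).
cat > machines.sketch <<'EOF'
# one host per line, fully qualified (order is an assumption)
pi.iib.unsam.edu.ar
rho.iib.unsam.edu.ar
EOF
```

Such a file can also be passed explicitly to mpich's mpirun,
e.g. mpirun -np 2 -machinefile machines.sketch cpi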

When I tested my mpich installation using the examples
provided, I verified that using -np 2 runs one process on
each of the two hosts:
[fernan@pi] mpirun -np 2 cpi
Process 1 on rho.iib.unsam.edu.ar
Process 0 on pi.iib.unsam.edu.ar
pi is approximately 3.1416009869231241, Error is 0.0000083333333309
wall clock time = 0.004861

I then formatted a database using mpiformatdb. Since the db
is small enough, I just split it into two segments.
[fernan@pi] mpiformatdb -N 2 -i sp_tr_nrdb/sprot.fasta -p T -o T
Reading input file |
Done, read 1060238 lines
Trying to break sprot.fasta (52 MB) into 2 fragments of 26 MB
Executing: formatdb -i sp_tr_nrdb/sprot.fasta -p T -o T -v 26 -n /databases/sprot.fasta 
Created 2 fragments.

My .mpiblastrc is the same on both hosts, and contains the following:
/databases
/usr/local/databases

On the master host all databases are in
/usr/local/databases, which is exported to the other host
(rho). 
On the master host /databases is just a symbolic link to
/usr/local/databases. 
On rho, /databases is the mount point of the shared data,
and /usr/local/databases is the local storage for mpiblast
to use.

All these directories are readable and writable by me.
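
For anyone wanting to repeat this check, a minimal sketch
follows. The helper name check_db_dir is hypothetical (it is
not part of mpiblast or mpich); it only reports whether a path
is a directory, whether it is a symbolic link, and whether the
current user can read and write it.

```shell
# check_db_dir DIR
# Sketch: report whether DIR is a directory (following symlinks),
# whether DIR itself is a symbolic link, and whether the current
# user can read and write it. Returns non-zero on any failure.
check_db_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "$dir: not a directory"
        return 1
    fi
    if [ -L "$dir" ]; then
        echo "$dir: symbolic link"
    fi
    if [ -r "$dir" ] && [ -w "$dir" ]; then
        echo "$dir: readable and writable"
    else
        echo "$dir: NOT readable and writable"
        return 1
    fi
}
```

It would be run on each host in turn, for example
check_db_dir /databases and check_db_dir /usr/local/databases.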
The NCBI data directory is also in the shared area, under a
'blastmat' directory, and the .ncbirc is the same on both
hosts and contains the following:
[NCBI]
Data=/databases/blastmat

Now I run a query using mpiblast: 
[fernan@pi] mpirun -np 2 /usr/local/bin/mpiblast --debug -i query_1.fa -d sprot.fasta -p blastp -o query.mpiblast 

I follow the run on two separate xterms, running 'top' on
each host. On the master host I don't notice any increase
in CPU usage. In fact I don't even see mpirun
listed while running  :|
On the 'slave' host I see mpiblast running, with a CPU
usage of 96-97% during the whole run.
This is also reflected in the mpiblast output that I get
using --debug (attached to this message).

Also if I use mpirun -v I see:
running /usr/local/bin/mpiblast on 2 386BSD ch_p4 processors
Created /usr/local/databases/PI63127

Any ideas?

Thanks again,

Fernan

-- 
F e r n a n   A g u e r o
http://genoma.unsam.edu.ar/~fernan