[Bioclusters] mpiblast w/ SGE problem?

Iddo Friedberg idoerg at gmail.com
Thu Oct 18 18:13:53 EDT 2007


Hi Chris,

Thanks. My replies are inlined:

On 10/18/07, Chris Dagdigian <dag at sonsorol.org> wrote:
>
>
> The biggest problem I see in that job script is that you are pointing
> mpiblast at a manually generated MPI machines file rather than the
> custom one that would get created by the pe_starter method in your
> parallel environment.


I actually tried both a manually supplied machines file and the custom one.
The same thing happens with the following script:
--------------------------------


#!/bin/bash

#$ -cwd
#$ -j y
#$ -S /bin/bash

export MPI_DIR=/opt/mpich/gnu/
# export BLASTDB=/share/bio/ncbi/db/
#export BLASTDB=/home/thumper1/users/idoerg/databases/STRING
#export BLASTMAT=/opt/Bio/ncbi/data
export MPIBLAST_CONFIG=/home/thumper1/users/idoerg/mpiblast.conf
export THOME=/home/thumper1/users/idoerg
export P4_GLOBMEMSIZE=256000000
export TMP=/tmp

$MPI_DIR/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines \
        /opt/Bio/mpiblast/bin/mpiblast -d protein.sequences.v7.0.fa \
        -i $THOME/databases/STRING/top_c60_test_100.tfa -m 7 -p blastp \
        -o $THOME/databases/STRING/top_c60_test_100.blast.xml

-------------------------------------------------------------------

> When the SGE scheduler creates your custom machines file (telling
> mpiblast where to place the tasks) the location is here:
>
> $TMPDIR/machines
>
> So your mpirun command would have to contain:
>
> mpirun -np $NSLOTS -machinefile $TMPDIR/machines <rest of command ..>
>
> There is no point integrating a MPI app into Grid Engine unless the
> Grid Engine scheduler gets to pick which machines get the parallel
> task(s) :)
>
> I also don't see a "#$ -pe <parallel environment>" request within the
> SGE job script; were you requesting one via the qsub command?




Yes.

qsub -pe mpich 12 mpiblast_sge.sh

qstat actually shows that 12 slots are allocated, but the job only runs on
one node!
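
One quick way to see what SGE actually handed to the job is to summarize the
machines file from inside the job script, just before the mpirun line. This is
a minimal sketch and not part of the original script: the function name
summarize_machines is made up for illustration, and it assumes the file lists
one host per line, repeated once per slot. If every slot shows up under a
single host name, the allocation itself is suspect; if several hosts are
listed, the problem is more likely on the mpirun or mpiblast side.

```shell
# Summarize a machines file: print each distinct host with its slot count.
# Assumption: one host name per line, repeated once per granted slot.
summarize_machines() {
    # Default to the file SGE is expected to create for the job.
    local file="${1:-$TMPDIR/machines}"
    if [ ! -r "$file" ]; then
        echo "cannot read machines file: $file" >&2
        return 1
    fi
    # Drop blank lines, then count how often each host appears.
    grep -v '^[[:space:]]*$' "$file" | sort | uniq -c | awk '{ print $2, $1 }'
}
```

Called with no argument inside the job, it reads $TMPDIR/machines and the
result ends up in the job's output file.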


> The other thing to look out for is the format of the machines file
> that SGE creates -- it may or may not include the fully qualified
> domain name and it may or may not be in the exact format that your
> particular MPI installation expects. You can control the format of
> the machines file by just looking at the source code for the script
> that is being run as the pe_starter method within the configuration
> of your parallel environment.



I checked that, and tried ssh-ing to those nodes as they appeared in the
machines file, and the passwordless ssh worked.

??
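
Since the format of the machines file came up, here is a small sketch for
listing the host names in it in a normalized form, so they can be compared
with what the nodes call themselves. It is an illustration only: short_hosts
is a hypothetical helper, and the entry formats it handles ("host",
"host.domain", "host:n", "host n") are assumptions about what the file may
contain, not something confirmed for this cluster. It strips any slot count
and domain suffix and prints each short host name once.

```shell
# Print the distinct short host names found in a machines file.
# Assumed entry formats: "host", "host.domain", "host:n", or "host n".
short_hosts() {
    # Default to the file SGE is expected to create for the job.
    local file="${1:-$TMPDIR/machines}"
    if [ ! -r "$file" ]; then
        echo "cannot read machines file: $file" >&2
        return 1
    fi
    # Skip blank lines, keep the first field, then cut off ":n" and ".domain".
    awk 'NF { print $1 }' "$file" | sed -e 's/:.*$//' -e 's/\..*$//' | sort -u
}
```

Comparing that list with the output of hostname on each node (for example
over the passwordless ssh that already works) shows whether the names agree.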

Thanks,

Iddo







> -Chris
>
>
>
>
> On Oct 18, 2007, at 4:08 PM, Iddo Friedberg wrote:
>
> > hi,
> >
> > I am trying to run mpiblast on a ROCKS cluster using SGE. mpiblast seems
> > to be running well, but all slots are being run on a single node for some
> > reason! Can anyone help? Full disclosure: newbie to mpi, mpiblast, and SGE.
> >
> > Here is the command line I use:
> >
> > % qsub -pe mpich 10 mpiblast_sge.sh
> >
> > And here is the shell script mpiblast_sge.sh:
> > ----------------------------------------------------------------------
> > #!/bin/bash
> >
> > #$ -cwd
> > #$ -j y
> > #$ -S /bin/bash
> >
> > export MPI_DIR=/opt/mpich/gnu/
> > # export BLASTDB=/share/bio/ncbi/db/
> > export BLASTDB=/home/thumper1/users/idoerg/databases/STRING
> > export BLASTMAT=/opt/Bio/ncbi/data
> > export THOME=/home/thumper1/users/idoerg
> > export P4_GLOBMEMSIZE=256000000
> >
> > $MPI_DIR/bin/mpirun -np $NSLOTS \
> >         -machinefile /home/thumper1/users/idoerg/tmp/machines \
> >         /opt/Bio/mpiblast/bin/mpiblast -d protein.sequences.v7.0.fa \
> >         -i $THOME/databases/STRING/top_c60_test_1000.tfa -m 7 -p blastp \
> >         -o $THOME/databases/STRING/top_c60_test_1000.blast.xml
> >
> >
> > ----------------------------------------------------------------------
> >
> >
> >
> > Thanks,
> >
> > Iddo
> >
> >
> >
> >
> >
> > --
> >
> > I. Friedberg
> >
> > "The only problem with troubleshooting is that
> > sometimes trouble shoots back."
> > _______________________________________________
> > Bioclusters maillist  -  Bioclusters at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bioclusters



-- 

I. Friedberg

"The only problem with troubleshooting is that
sometimes trouble shoots back."

