Hi Chris,

Thanks. My replies are inlined:

On 10/18/07, Chris Dagdigian <dag at sonsorol.org> wrote:
>
> The biggest problem I see in that job script is that you are pointing
> mpiblast at a manually generated MPI machines file rather than the
> custom one that would get created by the pe_starter method in your
> parallel environment.

I actually tried both a manually supplied machines file and the custom one. The same thing happens with the following script:

--------------------------------
#!/bin/bash
#$ -cwd
#$ -j y
#$ -S /bin/bash

export MPI_DIR=/opt/mpich/gnu/
# export BLASTDB=/share/bio/ncbi/db/
#export BLASTDB=/home/thumper1/users/idoerg/databases/STRING
#export BLASTMAT=/opt/Bio/ncbi/data
export MPIBLAST_CONFIG=/home/thumper1/users/idoerg/mpiblast.conf
export THOME=/home/thumper1/users/idoerg
export P4_GLOBMEMSIZE=256000000
export TMP=/tmp

$MPI_DIR/bin/mpirun -np $NSLOTS -machinefile $TMPDIR/machines /opt/Bio/mpiblast/bin/mpiblast -d protein.sequences.v7.0.fa \
    -i $THOME/databases/STRING/top_c60_test_100.tfa -m 7 -p blastp -o $THOME/databases/STRING/top_c60_test_100.blast.xml
-------------------------------------------------------------------

> When the SGE scheduler creates your custom machines file (telling
> mpiblast where to place the tasks) the location is here:
>
> $TMPDIR/machines
>
> So your mpirun command would have to contain:
>
> mpirun -np $NSLOTS -machinefile $TMPDIR/machines <rest of command ..>
>
> There is no point integrating a MPI app into Grid Engine unless the
> Grid Engine scheduler gets to pick which machines get the parallel
> task(s) :)
>
> I also don't see a "#$ -pe <parallel environment>" request within the
> SGE job script; were you requesting one via the qsub command?

Yes:

qsub -pe mpich 12 mpiblast_sge.sh

qstat actually shows that 12 slots are allocated, but the job only runs on one node!

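One way to narrow this down is to have the job itself report what ended up in the machines file. The following is only a minimal sketch: summarize_machines is a hypothetical helper (not part of SGE, MPICH, or mpiblast), and it assumes the file holds one hostname per line, repeated once per slot. If the PE start script on a given cluster writes a different format, the awk line would need adjusting.

```shell
#!/bin/bash
# Hypothetical helper (not part of SGE or MPICH): print how many slots
# each host gets in a machines file, one "host count" pair per line.
# Assumes one hostname per line, repeated once per allocated slot.
summarize_machines() {
    local machines_file="$1"
    if [ ! -r "$machines_file" ]; then
        echo "cannot read machines file: $machines_file" >&2
        return 1
    fi
    # Skip blank lines and count repeats of the first field (the hostname).
    awk 'NF { slots[$1]++ } END { for (h in slots) print h, slots[h] }' \
        "$machines_file" | sort
}

# Inside a job, report what the scheduler handed over before mpirun starts.
if [ -n "${TMPDIR:-}" ] && [ -r "$TMPDIR/machines" ]; then
    echo "Job ${JOB_ID:-unknown}: NSLOTS=${NSLOTS:-unset}, hosts in \$TMPDIR/machines:"
    summarize_machines "$TMPDIR/machines"
fi
```

Placed before the mpirun line, this writes the host list into the job's output file. If only one host shows up there, the allocation itself is confined to one node; if several hosts show up, the problem is more likely in how mpirun consumes the file.
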
> The other thing to look out for is the format of the machines file
> that SGE creates -- it may or may not include the fully qualified
> domain name and it may or may not be in the exact format that your
> particular MPI installation expects. You can control the format of
> the machines file by just looking at the source code for the script
> that is being run as the pe_starter method within the configuration
> of your parallel environment.

I checked that, and tried ssh-ing to those nodes using the names exactly as they appear in the machines file. The passwordless ssh worked.

??

Thanks,

Iddo

> -Chris
>
>
> On Oct 18, 2007, at 4:08 PM, Iddo Friedberg wrote:
>
> > hi,
> >
> > I am trying to run mpiblast on a ROCKS cluster using SGE. mpiblast
> > seems to be running well, but all slots are being run on a single
> > node for some reason! Can anyone help? Full disclosure: newbie to
> > mpi, mpiblast, and SGE.
> >
> > Here is the command line I use:
> >
> > % qsub -pe mpich 10 mpiblast_sge.sh
> >
> > And here is the shell script mpiblast_sge.sh
> > ----------------------------------------------------------------------
> > #!/bin/bash
> >
> > #$ -cwd
> > #$ -j y
> > #$ -S /bin/bash
> >
> > export MPI_DIR=/opt/mpich/gnu/
> > # export BLASTDB=/share/bio/ncbi/db/
> > export BLASTDB=/home/thumper1/users/idoerg/databases/STRING
> > export BLASTMAT=/opt/Bio/ncbi/data
> > export THOME=/home/thumper1/users/idoerg
> > export P4_GLOBMEMSIZE=256000000
> >
> > $MPI_DIR/bin/mpirun -np $NSLOTS -machinefile /home/thumper1/users/idoerg/tmp/machines /opt/Bio/mpiblast/bin/mpiblast -d protein.sequences.v7.0.fa \
> >     -i $THOME/databases/STRING/top_c60_test_1000.tfa -m 7 -p blastp -o $THOME/databases/STRING/top_c60_test_1000.blast.xml
> >
> > ----------------------------------------------------------------------
> >
> > Thanks,
> >
> > Iddo
> >
> > --
> > I. Friedberg
> >
> > "The only problem with troubleshooting is that
> > sometimes trouble shoots back."
> > _______________________________________________
> > Bioclusters maillist - Bioclusters at bioinformatics.org
> > https://bioinformatics.org/mailman/listinfo/bioclusters

--
I. Friedberg
"The only problem with troubleshooting is that sometimes trouble shoots back."