[Bioclusters] blat on SGE

Chris Dagdigian dag at sonsorol.org
Mon Apr 4 11:43:07 EDT 2005


There are three main ways to run a non-parallel binary under the control 
of Grid Engine.

First off, you need to have SGE up and running.

Then you install your application binary (blat) on every node in the 
cluster in some sort of local bin directory *or* you can place it into a 
shared NFS directory.

A good test to see if SGE is working at a basic level is just to 
repeatedly run the "qrsh hostname" command to verify that you can run 
"/bin/hostname" on remote cluster nodes.

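A rough sketch of that test as a small shell function. It assumes "qrsh"
is on your PATH; the function name "check_sge" and the default of 10
tries are made up for illustration. It runs "qrsh hostname" several times
and counts how often each node answered, so you can see whether jobs are
actually being spread around:

```shell
#!/bin/sh
# check_sge [TRIES]: run "qrsh hostname" TRIES times (default 10) and
# print a count of how many times each node answered.
# (check_sge is a hypothetical helper name, not part of SGE.)
check_sge() {
    tries="${1:-10}"
    i=1
    while [ "$i" -le "$tries" ]; do
        # Print the node name, or a marker line if qrsh returns an error.
        qrsh hostname 2>/dev/null || echo "FAILED"
        i=$((i + 1))
    done | sort | uniq -c
}

# Usage: check_sge 20
```

If every line of the tally says FAILED, fix your SGE install before
worrying about blat.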

The three how-do-I-run-a-job methods are:
============================================

1. Straight qrsh

The SGE "qrsh" command does not wait in the pending job queue: it tries 
to run your command immediately on the least loaded node in the cluster. 
It will also fail silently (or almost silently) if there are no free job 
slots, so you can't use this in an automated pipeline without careful 
error checking.

The syntax is:

  $ qrsh /path/to/blat <blat commandline args> <blat files>


Generally speaking, if you are sharing a cluster with others or have many 
jobs to run, you do not want to use this method. You should use "qsub" 
and let Grid Engine handle the resource allocation and remote job 
execution.
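
If you do end up scripting around qrsh anyway, here is a minimal sketch
of the kind of error checking I mean. It assumes qrsh returns a non-zero
exit status when the command could not be run; "run_with_retry" is a
hypothetical helper name, and the retry count and delay are arbitrary.
Note that it cannot tell "no free slots" apart from "blat itself
failed", so check your output files as well:

```shell
#!/bin/sh
# run_with_retry MAX DELAY CMD [ARGS...]
# Run CMD up to MAX times, sleeping DELAY seconds between attempts.
# Returns 0 as soon as CMD succeeds, 1 if every attempt fails.
# (run_with_retry is a hypothetical helper, not part of SGE.)
run_with_retry() {
    max="$1"
    delay="$2"
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed: $*" >&2
        attempt=$((attempt + 1))
        # Only pause if another attempt is coming.
        [ "$attempt" -le "$max" ] && sleep "$delay"
    done
    return 1
}

# Usage (same placeholder path as above):
#   run_with_retry 5 30 qrsh /path/to/blat <blat commandline args> <blat files>
```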


2. Straight qsub

SGE "qsub" is the main command that submits your job to the scheduler. 
In previous versions of SGE you could only submit shell scripts and not 
standalone binaries. This limitation went away as of SGE 6.0: the "-b y" 
flag tells qsub to treat the command as a binary rather than a script. 
The basic bare-bones syntax is:

  $ qsub -b y /path/to/blat <blat commandline args> <blat files>

** Read the manpage for qsub so you know how to redirect your output and 
error files to places where you can read them. Because 'qsub' is the 
main job submission tool, it takes an insane number of very powerful 
commandline arguments.
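
As a starting point, here is a sketch that wraps the qsub line above with
the redirection flags. The flags "-cwd", "-N", "-o" and "-e" are standard
qsub options; the function name "submit_blat", the job name "blat_job"
and the log file names are invented for the example, and I am assuming
the log directory lives somewhere the compute nodes can write to (such as
a shared NFS directory):

```shell
#!/bin/sh
# submit_blat LOGDIR [BLAT ARGS...]
# Submit blat as a binary job, sending stdout and stderr to LOGDIR.
# (submit_blat, blat_job and the log file names are hypothetical.)
submit_blat() {
    logdir="$1"
    shift
    mkdir -p "$logdir"
    # -b y : the command is a binary, not a script
    # -cwd : run the job from the current working directory
    # -N   : job name shown in qstat
    # -o/-e: where to write the job's stdout and stderr
    qsub -b y -cwd -N blat_job \
         -o "$logdir/blat.out" -e "$logdir/blat.err" \
         /path/to/blat "$@"
}

# Usage: submit_blat "$HOME/blat_logs" <blat commandline args> <blat files>
```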


3. Fancy stuff

Not worth detailing until you are comfortable with #1 and #2, but you 
may want to read up on "array jobs" within Grid Engine, as they are very 
useful at solving the "how do I launch 100,000 blat jobs efficiently" 
use-case.
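
To give a flavor of what an array job looks like, here is a minimal
sketch of a job script. The "#$" directive lines, the "-t" task range
and the SGE_TASK_ID variable are standard Grid Engine features;
everything else (the file names "queries.txt" and "database.2bit", the
helper "nth_line", the range 1-1000) is made up for illustration. Each
task picks one line from a list of query files and runs blat on it:

```shell
#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -t 1-1000

# nth_line FILE N: print line number N of FILE.
# (hypothetical helper used to map a task number to a query file)
nth_line() {
    sed -n "${2}p" "$1"
}

# Grid Engine sets SGE_TASK_ID to the task number for array jobs.
# Do nothing if the script is run outside of an array job.
if [ -n "${SGE_TASK_ID:-}" ] && [ "$SGE_TASK_ID" != "undefined" ]; then
    # queries.txt (assumed) lists one query file per line.
    query=$(nth_line queries.txt "$SGE_TASK_ID")
    /path/to/blat database.2bit "$query" "out.$SGE_TASK_ID.psl"
fi
```

You would save this as a script and submit it once with plain "qsub";
Grid Engine then schedules the individual tasks for you instead of you
calling qsub a thousand times.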


The grid engine documentation is your friend:
v6.0x docs: http://docs.sun.com/app/docs/coll/1017.3?q=n1ge


-Chris



N.hebraeus N.hebraeus wrote:

> Hi,
>  
> I am interested in setting blat on Sun Grid Engine (on Linux cluster). I would appreciate any comments on how to do that.
>  
> thank you.
> 


