[Bioclusters] SGE and checkpointing

Ivo Grosse bioclusters@bioinformatics.org
Wed, 15 May 2002 13:49:03 -0400


Hi,

- Does SGE support checkpointing?  How?

- If so, can SGE temporarily suspend low-priority jobs when 
high-priority jobs are waiting in the queue?

Ivo


P.S.

I mean: in the standard implementation of SGE that we currently use, 
SGE is able to rearrange the jobs in the queue by priority.  That is, 
if you submit a low-priority job first, and I submit a high-priority 
job second, and your job has not yet started, then my job will be 
executed first.  However, if your job has already started, then (in the 
implementation of SGE that we currently use) my job will have to wait 
until your job is done.  Can this be changed?

We would like to have the following solution:

User X submits 80 low-priority jobs, and each of them will run for 2 
weeks.  Since the queue is currently empty, all of the jobs get 
started.  Now user Y wants to submit just one high-priority job, which 
will run for only 1 day.  Unfortunately, in the current implementation 
of SGE, the job of user Y would have to wait for 2 weeks.  What we 
would like is for one of the low-priority jobs of X to be temporarily 
suspended, the high-priority job of user Y to be started, and, once 
the job of Y has finished, the suspended job of user X to be resumed.  
Is that possible?  How?
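
To make the request concrete, here is a toy Python sketch of the policy 
we have in mind.  It is not SGE code and does not use any SGE interface: 
the Job class, the step and demo functions, the numeric priorities 
(larger means more important), and the scaled-down cluster of 3 slots 
instead of 80 are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int             # assumption: larger number = more important
    remaining: int            # days of work left
    state: str = "pending"    # pending | running | suspended | done


def step(jobs, slots):
    """Advance the toy cluster by one day under the desired policy."""
    running = [j for j in jobs if j.state == "running"]
    # Jobs that want a slot (new or previously suspended), most important first.
    waiting = sorted(
        (j for j in jobs if j.state in ("pending", "suspended")),
        key=lambda j: -j.priority,
    )
    for job in waiting:
        if len(running) < slots:
            # A slot is free: start (or resume) the job.
            job.state = "running"
            running.append(job)
            continue
        # No free slot: suspend the least important running job,
        # but only if the waiting job is strictly more important.
        victim = min(running, key=lambda j: j.priority)
        if victim.priority < job.priority:
            victim.state = "suspended"
            running.remove(victim)
            job.state = "running"
            running.append(job)
    # Every running job does one day of work.
    for job in running:
        job.remaining -= 1
        if job.remaining == 0:
            job.state = "done"


def demo():
    """User X fills the cluster, then user Y submits one short urgent job."""
    slots = 3
    jobs = [Job(f"X{i}", priority=0, remaining=14) for i in range(1, 4)]
    history = []
    step(jobs, slots)                                # day 1: only X's jobs
    history.append({j.name: j.state for j in jobs})
    jobs.append(Job("Y", priority=10, remaining=1))  # Y arrives
    for _ in range(2):                               # days 2 and 3
        step(jobs, slots)
        history.append({j.name: j.state for j in jobs})
    return history


for day, states in enumerate(demo(), start=1):
    print(day, states)
```

In this sketch one of X's jobs is suspended on the day Y's job arrives, 
Y's job runs to completion, and the suspended job picks up again the 
next day with no work lost.  That is the behaviour we are looking for.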