[Bioclusters] SGE and preemption

Ivo Grosse bioclusters@bioinformatics.org
Wed, 15 May 2002 15:16:40 -0400


Hi again,

This is a follow-up to my last SGE question (Subject: SGE and 
checkpointing).  I did a quick web search, and I think the feature 
we want is called "preemption."  Is that correct?

http://www.supercluster.org/documentation/maui/8.4preemption.html

Maui is a scheduler that supports preemption, and I read that 
SGE can be combined with Maui.  Is that correct?

http://www.supercluster.org/documentation/maui/sgeintegration.html

Of course, if SGE (without Maui) could do preemption, that would be 
great.  Can it do that?  How?

Best regards, Ivo


+++


From: Ivo Grosse <grosse@cshl.org>
Organization: Cold Spring Harbor Laboratory
To: bioclusters@bioinformatics.org
Subject: [Bioclusters] SGE and checkpointing

Hi,

- Does SGE support checkpointing?  How?

- If yes, is SGE capable of temporarily suspending low-priority jobs 
when there are high-priority jobs waiting in the queue?

Ivo


P.S.

I mean: in the standard implementation of SGE that we currently use, 
SGE is able to reorder pending jobs in the queue by priority.  That is, 
if you submit a low-priority job first, and I submit a high-priority 
job second, and your job has not yet started, then my job will be 
executed first.  However, if your job has already started, then (in the 
implementation of SGE that we currently use) my job will have to wait 
until your job is done.  Can this be changed?
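
To make the ordering described above concrete, here is a toy Python 
sketch.  It is not SGE code and does not use any SGE interface; the 
class PendingQueue, the job names, and the numeric priorities are all 
made up for illustration.  It only models the pending list: once a job 
has been handed out by next_job(), nothing here can interrupt it.

```python
import heapq
from itertools import count


class PendingQueue:
    """Toy model of a priority-ordered pending list (hypothetical, not SGE)."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker: earlier submissions go first

    def submit(self, name, priority):
        # heapq is a min-heap, so the priority is negated to pop the
        # highest-priority job first.
        heapq.heappush(self._heap, (-priority, next(self._seq), name))

    def next_job(self):
        # Returns the name of the job that would be started next.
        return heapq.heappop(self._heap)[2]


q = PendingQueue()
q.submit("your_low_priority_job", priority=1)   # submitted first
q.submit("my_high_priority_job", priority=10)   # submitted second
first = q.next_job()
second = q.next_job()
print(first, second)
```

Because both jobs are still pending when next_job() is called, the 
high-priority job comes out first even though it was submitted second.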

We would like to have the following behavior:

User X submits 80 low-priority jobs, and each of them will run for 2 
weeks.  Since the queue is currently empty, all of the jobs get 
started.  Now user Y wants to submit just one high-priority job, which 
will run for only 1 day.  Unfortunately, in the current implementation 
of SGE, the job of user Y would have to wait for 2 weeks.  What we 
would like is that one of the low-priority jobs of X gets temporarily 
suspended, the high-priority job of user Y gets started, and after the 
job of Y is finished, the suspended job of user X is continued.  Is 
that possible?  How?
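
The behavior we are asking for can be sketched as a small simulation.  
This is only a toy model in Python, not SGE or Maui code: the Job class, 
the step_day function, the one-day time step, the 80 slots, and the rule 
that the most recently submitted low-priority job is the one suspended 
are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int
    remaining_days: int
    state: str = "pending"  # pending, running, suspended, or done


def step_day(jobs, slots):
    """Advance the toy cluster by one day, preempting by priority."""
    active = [j for j in jobs if j.state != "done"]
    # Highest priority first; among equal priorities, jobs that are
    # already running keep their slot.  The sort is stable, so
    # submission order breaks any remaining ties.
    active.sort(key=lambda j: (-j.priority, j.state != "running"))
    for i, job in enumerate(active):
        if i < slots:
            job.state = "running"      # start or resume
        elif job.state == "running":
            job.state = "suspended"    # preempted, progress is kept
    for job in active[:slots]:
        job.remaining_days -= 1
        if job.remaining_days == 0:
            job.state = "done"


SLOTS = 80
# User X: 80 low-priority jobs of 14 days each, started on an empty queue.
jobs = [Job(f"X{i}", priority=1, remaining_days=14) for i in range(SLOTS)]
for _ in range(3):
    step_day(jobs, SLOTS)

# User Y: one high-priority job of 1 day, submitted after 3 days.
y = Job("Y", priority=10, remaining_days=1)
jobs.append(y)
step_day(jobs, SLOTS)
suspended_while_y_ran = [j.name for j in jobs if j.state == "suspended"]

# The next day Y is finished, so the suspended job gets its slot back.
step_day(jobs, SLOTS)
print(suspended_while_y_ran, y.state, jobs[79].state)
```

In this model exactly one of X's jobs is suspended for the single day 
that Y's job runs, and it then continues from where it stopped rather 
than starting over.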