[Bioclusters] SGE jobs staying in dr state

Shane Brubaker brubaker2 at llnl.gov
Wed Mar 30 13:20:58 EST 2005


Hi, I have some SGE jobs that are stuck in the "dr" state and will not go
away.  I issued a qdel command on these jobs, so they are now in the
"deleted, running" state.  Usually such jobs disappear after a few minutes,
but these won't.  I also can't delete the queue they belong to, because it
still has these jobs in it.

These happened to be fairly long jobs that ran for a day or two.  They no
longer show up on the actual nodes, so they aren't really running anymore;
they only appear in qstat.
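
To make the symptom concrete, here is a minimal sketch that picks the job
IDs in the "dr" state out of qstat output.  It assumes the default plain
qstat listing, where the job ID is the first column and the state is the
fifth, and that job names contain no spaces.  The sample text, job names and
user name are made up for illustration; they are not output from the
cluster described above.

```python
def stuck_job_ids(qstat_text, state="dr"):
    """Return the job IDs (as strings) whose state column equals `state`.

    Assumes the default qstat column order:
    job-ID, prior, name, user, state, ...
    """
    ids = []
    for line in qstat_text.splitlines():
        fields = line.split()
        # Data rows start with a numeric job ID; the header and the dashed
        # separator line do not, so they are skipped here.
        if len(fields) >= 5 and fields[0].isdigit() and fields[4] == state:
            ids.append(fields[0])
    return ids


# Hypothetical sample in the assumed layout.
SAMPLE = """\
job-ID  prior   name       user         state submit/start at     queue          slots ja-task-ID
-----------------------------------------------------------------------------------------------
   1001 0.55500 blast_a    someuser     dr    03/28/2005 09:12:44 all.q@node01       1
   1002 0.55500 blast_b    someuser     r     03/30/2005 10:01:02 all.q@node02       1
   1003 0.55500 blast_c    someuser     dr    03/28/2005 09:13:10 all.q@node03       1
"""

if __name__ == "__main__":
    print(stuck_job_ids(SAMPLE))
```

In practice the text would come from running qstat rather than from a
string.  The qdel man page also documents a -f option that removes a job
from the qmaster's job list even when the execution daemon does not answer
the delete request; it normally needs manager or operator privileges, and
whether it is the right fix here is an open question.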

Any help would be much appreciated.


Thank You,
Shane Brubaker
JGI

At 09:09 AM 3/30/2005, you wrote:
>Today's Topics:
>
>    1. Re: alternative DHCP implementations? (jason.calvert at novartis.com)
>    2. Re: alternative DHCP implementations? (Lars G. T. Jorgensen)
>
>
>----------------------------------------------------------------------
>
>Message: 1
>Date: Wed, 30 Mar 2005 11:43:17 -0400
>From: jason.calvert at novartis.com
>Subject: Re: [Bioclusters] alternative DHCP implementations?
>To: "Clustering,        compute farming & distributed computing in life
>         science informatics"    <bioclusters at bioinformatics.org>
>Message-ID: <OFE1C386E6.4CD361C1-ON85256FD4.00560E7A-85256FD4.00565C5B at EU.novartis.net>
>
>Content-Type: text/plain; charset="us-ascii"
>
>There are scripts within the OSCAR release to do this for you.  You can
>start the scripts, power on the nodes in the order you wish, and then
>assign them to auto generated hostnames.  The scripts output the dhcp.conf
>file.
>
>I would think you could pull them out of oscar pretty easily.
>
>Jason
>
>
>
>
>Chris Dagdigian <dag at sonsorol.org>
>Sent by:
>bioclusters-bounces+jason.calvert=pharma.novartis.com at bioinformatics.org
>03/29/2005 02:27 PM
>Please respond to "Clustering,  compute farming & distributed computing in
>life science informatics"
>
>
>         To:     adamm at menlo.com, "Clustering,  compute farming &
>                 distributed computing in life science informatics"
>                 <bioclusters at bioinformatics.org>
>         cc:     (bcc: Jason Calvert/PH/Novartis)
>         Subject:        Re: [Bioclusters] alternative DHCP implementations?
>
>
>
>
>Agreed. It was just a shortcut. We already do allocation of IP based on
>MAC address but that only works when you know the MAC address
>information ahead of time.  This is rare especially on whitebox cluster
>projects where people don't put the MAC on the product packaging or on
>the chassis itself. Some vendors do a good job of making the data easy
>to find and others simply don't bother.
>
>A dhcp server handing out dynamic-range leases in a predictable manner
>is what allowed us to easily map MAC address to node position and
>nodename simply by powering on the nodes for PXE boot in the order in
>which they are racked and stacked. Once this was done we had the
>MAC->Node mapping data we needed to generate the static allocation
>entries.
>
>A workaround for non-predictable allocation is to simply power on the
>cluster in the order in which you want things named, then parse the
>dhcpd leases file for both the MAC address *and* the timestamp
>representing the lease handout. That would allow you to map MAC -> Node
>without having to care about hostnames for the first pass MAC collection
>phase.  Then you build the static-by-mac entries into the conf file and
>problem solved. If we stick with ISC DHCP this is a possibility...
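
The lease-file workaround described above can be sketched roughly as
follows.  This assumes the usual ISC dhcpd.leases layout ("lease <ip> {" ...
"}" blocks containing "starts" and "hardware ethernet" statements) and that
no lease body contains a literal "}".  The node name prefix, the starting
address and the sample leases are hypothetical.

```python
import ipaddress
import re
from datetime import datetime

LEASE_RE = re.compile(r"lease\s+(\S+)\s*\{(.*?)\}", re.S)
STARTS_RE = re.compile(r"starts\s+\d+\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});")
MAC_RE = re.compile(r"hardware ethernet\s+([0-9a-fA-F:]{17});")


def macs_in_boot_order(leases_text):
    """Return MAC addresses sorted by the earliest lease start time seen."""
    first_seen = {}
    for _ip, body in LEASE_RE.findall(leases_text):
        starts = STARTS_RE.search(body)
        mac = MAC_RE.search(body)
        if not (starts and mac):
            continue  # skip leases without a start time or a MAC
        when = datetime.strptime(starts.group(1), "%Y/%m/%d %H:%M:%S")
        key = mac.group(1).lower()
        # Renewals append new blocks; keep only the earliest handout.
        if key not in first_seen or when < first_seen[key]:
            first_seen[key] = when
    return sorted(first_seen, key=first_seen.get)


def static_host_entries(macs, prefix="node", first_ip="10.0.0.1"):
    """Build one dhcpd.conf host declaration per MAC, in the given order."""
    base = ipaddress.ip_address(first_ip)
    return [
        "host %s%02d { hardware ethernet %s; fixed-address %s; }"
        % (prefix, i + 1, mac, base + i)
        for i, mac in enumerate(macs)
    ]


# Hypothetical leases, deliberately out of order and with one renewal.
SAMPLE_LEASES = """\
lease 10.0.0.202 {
  starts 3 2005/03/30 14:02:10;
  ends 3 2005/03/30 16:02:10;
  hardware ethernet 00:11:22:33:44:0B;
}
lease 10.0.0.201 {
  starts 3 2005/03/30 14:01:05;
  ends 3 2005/03/30 16:01:05;
  hardware ethernet 00:11:22:33:44:0a;
}
lease 10.0.0.201 {
  starts 3 2005/03/30 15:01:05;
  ends 3 2005/03/30 17:01:05;
  hardware ethernet 00:11:22:33:44:0a;
}
lease 10.0.0.203 {
  starts 3 2005/03/30 14:03:30;
  ends 3 2005/03/30 16:03:30;
  hardware ethernet 00:11:22:33:44:0c;
}
"""

if __name__ == "__main__":
    for entry in static_host_entries(macs_in_boot_order(SAMPLE_LEASES)):
        print(entry)
```

Sorting on the lease start time rather than on the address handed out is
what removes the dependence on how the server allocates its dynamic range.
It only works if the nodes are powered on far enough apart that their first
leases get distinct timestamps.
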
>
>c
>
>Adam S. Moskowitz wrote:
> > Chris,
> >
> >
> >>We are thinking about trying to find a replacement DHCP server that has
> >>a predictable method of allocating dynamic IP addresses (even if only
> >>for the initial cluster deployment)
> >
> >
> > I think it's a bad idea to rely on such behavior. I don't remember what
> > the RFC says, but in general, unless the RFC guarantees an
> > implementation should behave a particular way, you are asking for
> > trouble to rely on specific behavior.
> >
> > A great example of this is how round-robin DNS used to work and then how
> > it changed and lots of things broke.
> >
> > DHCP isn't meant to do what you're asking it to do, so I strongly
> > suggest you not use it to solve that particular problem.
> >
> > That said, DHCP supports a mechanism for binding specific IP addresses
> > to specific MAC addresses, even though the assignment is still done
> > dynamically. Yes, this is a bit more work, but at least it's guaranteed
> > behavior.
>_______________________________________________
>Bioclusters maillist  -  Bioclusters at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters
>
>
>------------------------------
>
>Message: 2
>Date: Wed, 30 Mar 2005 14:55:55 +0200
>From: "Lars G. T. Jorgensen" <lars at binf.ku.dk>
>Subject: Re: [Bioclusters] alternative DHCP implementations?
>To: "Clustering,        compute farming & distributed computing in life
>         science informatics"    <bioclusters at bioinformatics.org>
>Message-ID: <424AA1DB.8000702 at binf.ku.dk>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>Chris Dagdigian wrote:
>
> >
> >
> > Agreed. It was just a shortcut. We already do allocation of IP based
> > on MAC address but that only works when you know the MAC address
> > information ahead of time.  This is rare especially on whitebox
> > cluster projects where people don't put the MAC on the product
> > packaging or on the chassis itself. Some vendors do a good job of
> > making the data easy to find and others simply don't bother.
> >
> > A dhcp server handing out dynamic-range leases in a predictable manner
> > is what allowed us to easily map MAC address to node position and
> > nodename simply by powering on the nodes for PXE boot in the order in
> > which they are racked and stacked. Once this was done we had the
> > MAC->Node mapping data we needed to generate the static allocation
> > entries.
> >
> > A workaround for non-predictable allocation is to simply power on the
> > cluster in the order in which you want things named, then parse the
> > dhcpd leases file for both the MAC address *and* the timestamp
> > representing the lease handout. That would allow you to map MAC ->
> > Node without having to care about hostnames for the first pass MAC
> > collection phase.  Then you build the static-by-mac entries into the
> > conf file and problem solved. If we stick with ISC DHCP this is a
> > possibility...
>
>Or buy a switch with a bit of intelligence that can dump the machines'
>MAC addresses by port.
>
> >
> > c
> >
> > Adam S. Moskowitz wrote:
> >
> >> Chris,
> >>
> >>
> >>> We are thinking about trying to find a replacement DHCP server that
> >>> has a predictable method of allocating dynamic IP addresses (even if
> >>> only for the initial cluster deployment)
> >>
> >>
> >>
> >> I think it's a bad idea to rely on such behavior. I don't remember what
> >> the RFC says, but in general, unless the RFC guarantees an
> >> implementation should behave a particular way, you are asking for
> >> trouble to rely on specific behavior.
> >>
> >> A great example of this is how round-robin DNS used to work and then how
> >> it changed and lots of things broke.
> >>
> >> DHCP isn't meant to do what you're asking it to do, so I strongly
> >> suggest you not use it to solve that particular problem.
> >>
> >> That said, DHCP supports a mechanism for binding specific IP addresses
> >> to specific MAC addresses, even though the assignment is still done
> >> dynamically. Yes, this is a bit more work, but at least it's guaranteed
> >> behavior.
> >
>
>
>--
>Mvh|Regards, Lars
>System Administrator, Phone: 3532 1349, Room: 318
>
>Bioinformatics Centre, University of Copenhagen,
>Universitetsparken 15
>DK-2100 Copenhagen
>DENMARK
>
>
>
>
>
>------------------------------
>
>
>End of Bioclusters Digest, Vol 5, Issue 28
>******************************************
