[torquedev] clearing exec_host on job requeue
Garrick Staples
garrick at clusterresources.com
Wed Feb 14 20:25:18 MST 2007
On Wed, Feb 14, 2007 at 09:37:29PM -0500, Glen Beane alleged:
> On 2/14/07, Garrick Staples <garrick at clusterresources.com> wrote:
> >CRI has a trouble ticket open about a job's exec_host not being cleared
> >when it is requeued. Apparently this annoys some sysadmins and breaks
> >some 3rd party things like clumon.
> >
> >I think I just found a bug that pre-dates TORQUE and is fixed with a
> >single character patch! I need others to look at this and tell me I'm
> >not crazy.
> >
> >I've already committed it to trunk, but this is trivial for 2.1 as well.
> >
> >$ svn diff -r1242:1243 src/server/req_jobobit.c
> >Index: src/server/req_jobobit.c
> >===================================================================
> >--- src/server/req_jobobit.c (revision 1242)
> >+++ src/server/req_jobobit.c (revision 1243)
> >@@ -1419,7 +1419,7 @@
> >
> > /* Now re-queue the job */
> >
> >- if ((pjob->ji_qs.ji_svrflags | JOB_SVFLG_HOTSTART) == 0)
>
> That's definitely a bug. Whatever is inside that if block is never
> going to be executed, since (pjob->ji_qs.ji_svrflags |
> JOB_SVFLG_HOTSTART) can never be zero. Good eye :) Hard to believe
> how long that has been in there!
Assuming everyone is happy with the results, does this go into
2.1-fixes? I have a feeling this is going to have some wider-reaching
effects and don't want to break expected behaviour in 2.1.