[Moabusers] Invalid job running in SR

Matthew Britt msbritt at umich.edu
Fri Aug 25 08:09:02 MDT 2006


We're trying to figure out how certain jobs are running in standing
reservations (SRs) when they aren't supposed to be.  We've defined
queue-based ACLs in TORQUE to limit which users can submit jobs, then
defined SRs which require that class.  In this case, the queue/class
is called violi.

[root@cac-admin02 log]# mdiag -r violi.2254
Diagnosing Reservations
RsvID                      Type Par   StartTime     EndTime     Duration Node Task Proc
-----                      ---- ---   ---------     -------     -------- ---- ---- ----
violi.2254                 User nyx -1:17:47:48  1:14:18:55   3:08:06:43   16   16   32
     Flags: STANDINGRSV,SPACEFLEX,DEDICATEDRESOURCE,ISACTIVE
     ACL:   RSV==violi.2254= CLASS==violi+
     CL:    RSV==violi.2254
     FLIST=cac
     Task Resources: PROCS: 2
     Active PH: 1996.66/4126.19 (48.39%)
     SRAttributes (TaskCount: 16  StartTime: 00:00:00  EndTime: 1:00:00:00  Days: ALL)
     Rsv-Group: violi
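
For anyone wanting to compare that ACL line against a job's credentials
by hand, here is a minimal Python sketch. It only splits the entries
into type, value and trailing modifier; the modifier is treated as
opaque, and the simple equality match is an assumption for illustration,
not a description of how Moab actually evaluates ACLs. The "batch" class
in the usage lines is hypothetical.

```python
import re

# One ACL entry as printed by mdiag, e.g. "CLASS==violi+".
# The trailing "+", "=" or "-" is kept as an uninterpreted modifier.
ACL_ENTRY = re.compile(r"(?P<type>[A-Z]+)==(?P<value>.+?)(?P<modifier>[+=\-]?)$")


def parse_acl(line):
    """Split an mdiag 'ACL:' line into (type, value, modifier) tuples."""
    _, _, body = line.partition("ACL:")
    entries = []
    for token in body.split():
        match = ACL_ENTRY.match(token)
        if match:
            entries.append((match["type"], match["value"], match["modifier"]))
    return entries


def job_matches_acl(job_creds, acl_entries):
    """Return True if any ACL entry equals one of the job's credentials.

    Simplified assumption: one matching entry is enough to grant access.
    """
    return any(job_creds.get(kind) == value for kind, value, _ in acl_entries)


acl = parse_acl("     ACL:   RSV==violi.2254= CLASS==violi+")
print(acl)
print(job_matches_acl({"CLASS": "violi"}, acl))
print(job_matches_acl({"CLASS": "batch"}, acl))  # hypothetical other class
```

Under this simplified reading, only a job in class violi (or one bound
to the reservation itself) should match.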

SR definition in moab:
SRCFG[violi]    CLASSLIST=violi
SRCFG[violi]    NODEFEATURES=cac
SRCFG[violi]    RESOURCES=PROCS:2
SRCFG[violi]    TASKCOUNT=16
SRCFG[violi]    PERIOD=WEEK
SRCFG[violi]    DEPTH=2
SRCFG[violi]    ACCESS=DEDICATED
SRCFG[violi]    FLAGS=SPACEFLEX,DEDICATEDRESOURCE


The job in question was submitted after the SR started, so it
couldn't have had a reservation before the SR was created.

Job Id: 14978.nyx.engin.umich.edu
    Job_Name = wh8x90y30
    [snip]
    qtime = Thu Aug 24 18:37:29 2006
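
The timing can be checked with a little arithmetic. The sketch below
assumes the mdiag output was captured at roughly the time this message
was sent, and it ignores any timezone difference between the mail
header and the cluster clock; both are assumptions, so the computed
start time is approximate.

```python
from datetime import datetime, timedelta


def parse_offset(text):
    """Convert a relative time like '-1:17:47:48' ([D:]HH:MM:SS) to a timedelta."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    while len(parts) < 4:          # pad missing leading fields (e.g. days)
        parts.insert(0, 0)
    days, hours, minutes, seconds = parts
    return sign * timedelta(days=days, hours=hours,
                            minutes=minutes, seconds=seconds)


# Assumption: mdiag was run at about the time of the mail timestamp.
observed = datetime(2006, 8, 25, 8, 9, 2)
rsv_start = observed + parse_offset("-1:17:47:48")   # StartTime column
qtime = datetime(2006, 8, 24, 18, 37, 29)            # job qtime

print("reservation started about:", rsv_start)
print("job queued after reservation start:", qtime > rsv_start)
print("gap:", qtime - rsv_start)
```

The gap comes out at a little over a day, so a timezone offset of a few
hours would not change the conclusion.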

This problem only seems to happen on SRs that are free-floating.  We
use the cac feature to mark nodes we own (as opposed to privately
owned ones), so reservations such as these can slide around and
maintain their task counts if a node goes down.  Our node-locked
reservations (using either a HOSTLIST or a feature limited to a set
of privately held nodes) are not violated.

Any ideas on how to debug why certain jobs are allowed to use these  
reserved nodes or why this would be happening?

Thanks,
  - matt




