[Moabusers] Major problem with SR

Brock Palen brockp at umich.edu
Fri Jul 21 12:10:42 MDT 2006


We have a standing reservation (SR) defined like so:

SRCFG[csem]     NODEFEATURES=csem
SRCFG[csem]     CLASSLIST=csem
SRCFG[csem]     TASKCOUNT=28
SRCFG[csem]     PERIOD=DAY
SRCFG[csem]     DEPTH=10
SRCFG[csem]     ACCESS=DEDICATED
SRCFG[csem]     FLAGS=IGNSTATE
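
To make the intent of this configuration explicit, here is a minimal Python sketch that parses the SRCFG lines above and checks whether a given job class appears in CLASSLIST. It only illustrates the expectation described in this post; it is not how Moab evaluates reservation access. The helper names (parse_srcfg, class_expected_on_reservation) are hypothetical, and the comma-separated CLASSLIST format is an assumption.

```python
CONFIG = """\
SRCFG[csem]     NODEFEATURES=csem
SRCFG[csem]     CLASSLIST=csem
SRCFG[csem]     TASKCOUNT=28
SRCFG[csem]     PERIOD=DAY
SRCFG[csem]     DEPTH=10
SRCFG[csem]     ACCESS=DEDICATED
SRCFG[csem]     FLAGS=IGNSTATE
"""


def parse_srcfg(text):
    """Collect 'SRCFG[name] KEY=VALUE' lines into {name: {KEY: VALUE}}."""
    reservations = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("SRCFG["):
            continue
        head, _, rest = line.partition("]")
        name = head[len("SRCFG["):]
        for token in rest.split():
            key, _, value = token.partition("=")
            reservations.setdefault(name, {})[key] = value
    return reservations


def class_expected_on_reservation(sr, job_class):
    """Return True if job_class is listed in the reservation's CLASSLIST.

    Assumption: CLASSLIST is a comma-separated list of class names.
    """
    return job_class in sr.get("CLASSLIST", "").split(",")


sr = parse_srcfg(CONFIG)["csem"]
print(class_expected_on_reservation(sr, "csem"))   # True
print(class_expected_on_reservation(sr, "short"))  # False
```

A job in class short, like the one shown further down, is therefore not one we expect to see on these nodes.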

The problem is that jobs in a class other than csem are being
placed on the reserved nodes, making those nodes unavailable. We are
paired with torque-2.1.1. The checkjob output for one of the jobs that
should not be on the nodes is below:

job 1617

AName: R2_12x36
State: Running
Creds:  user:hcarlo  group:ioe  class:short
WallTime:   2:53:45 of 10:55:00
SubmitTime: Fri Jul 21 11:14:32
   (Time Queued  Total: 00:01:30  Eligible: -00:00:01)

StartTime: Fri Jul 21 11:16:02
Total Requested Tasks: 1

Req[0]  TaskCount: 1  Partition: nyx
Memory >= 0  Disk >= 0  Swap >= 0
Opsys:   ---  Arch: ---  Features: ---

Allocated Nodes:
[nyx337:1]


StartCount:     1
Flags:          BACKFILL,RESTARTABLE
Attr:           BACKFILL,checkpoint
StartPriority:  10451
Reservation '1617' (-2:56:00 -> 7:59:00  Duration: 10:55:00)


checknode also shows that the current csem reservation was already
active on the node before the job started on it!
Output:

checknode nyx337

snip....
csem.1029x1  User  -14:10:55 -> 9:49:05 (1:00:00:00)
     Blocked Resources at -00:00:44   Procs: 4/4 (100.00%)  Mem: 0/3901 (0.00%)  Swap: 0/7710 (0.00%)  Disk: 0
snip...
1617x1  Job:Running  -2:56:30 -> 7:58:30 (10:55:00)
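
The two time ranges in the checknode output can be compared directly. The following is a small Python sketch, assuming the offsets are relative to the time the command was run and use a [D:]HH:MM:SS format (inferred from the 1:00:00:00 duration shown for the reservation). The helper name to_seconds is hypothetical.

```python
def to_seconds(stamp):
    """Convert a relative offset like '-14:10:55' or '1:00:00:00' to seconds.

    Assumption: the format is [D:]HH:MM:SS with an optional leading '-'.
    """
    sign = -1 if stamp.startswith("-") else 1
    parts = [int(p) for p in stamp.lstrip("-").split(":")]
    # Left-pad with zeros so we always have days, hours, minutes, seconds.
    parts = [0] * (4 - len(parts)) + parts
    days, hours, minutes, seconds = parts
    return sign * (((days * 24 + hours) * 60 + minutes) * 60 + seconds)


# Values copied from the checknode output for nyx337.
csem_start, csem_end = to_seconds("-14:10:55"), to_seconds("9:49:05")
job_start, job_end = to_seconds("-2:56:30"), to_seconds("7:58:30")

# The reservation was already active when the job started ...
print(csem_start < job_start)  # True
# ... and the job's whole window falls inside the reservation's window.
print(csem_start <= job_start and job_end <= csem_end)  # True
```

Under those assumptions, the csem reservation had been active for a little over eleven hours when job 1617 started.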

This job should not be on this node. What could be causing this problem?

Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



