[Moabusers] Major problem with SR
Brock Palen
brockp at umich.edu
Fri Jul 21 12:10:42 MDT 2006
We have a standing reservation (SR) defined like so:
SRCFG[csem] NODEFEATURES=csem
SRCFG[csem] CLASSLIST=csem
SRCFG[csem] TASKCOUNT=28
SRCFG[csem] PERIOD=DAY
SRCFG[csem] DEPTH=10
SRCFG[csem] ACCESS=DEDICATED
SRCFG[csem] FLAGS=IGNSTATE
The problem is that jobs belonging to a class other than csem are being
placed on the reserved nodes, making those nodes unavailable to csem jobs.
We are paired with torque-2.1.1. The checkjob output for one of the jobs
that should not be on the nodes is below:
job 1617
AName: R2_12x36
State: Running
Creds: user:hcarlo group:ioe class:short
WallTime: 2:53:45 of 10:55:00
SubmitTime: Fri Jul 21 11:14:32
(Time Queued Total: 00:01:30 Eligible: -00:00:01)
StartTime: Fri Jul 21 11:16:02
Total Requested Tasks: 1
Req[0] TaskCount: 1 Partition: nyx
Memory >= 0 Disk >= 0 Swap >= 0
Opsys: --- Arch: --- Features: ---
Allocated Nodes:
[nyx337:1]
StartCount: 1
Flags: BACKFILL,RESTARTABLE
Attr: BACKFILL,checkpoint
StartPriority: 10451
Reservation '1617' (-2:56:00 -> 7:59:00 Duration: 10:55:00)
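
To make the mismatch explicit, here is a minimal Python sketch that pulls the class out of the "Creds:" line above and compares it with the CLASSLIST of the reservation. The function names parse_creds and job_allowed are hypothetical helpers written for this illustration; they are not part of Moab or TORQUE, and the check only mirrors the intent of CLASSLIST=csem, not how Moab evaluates reservation access internally.

```python
def parse_creds(line):
    """Turn a checkjob 'Creds:' line into a dict, e.g. {'class': 'short'}."""
    # Keep only the text after the 'Creds:' label.
    _, _, rest = line.partition("Creds:")
    # Each remaining token looks like 'key:value'.
    return dict(pair.split(":", 1) for pair in rest.split())


def job_allowed(creds, classlist):
    """Return True if the job's class appears in the reservation's class list."""
    return creds.get("class") in classlist


# The Creds line is copied from the checkjob output for job 1617.
creds = parse_creds("Creds: user:hcarlo group:ioe class:short")

# CLASSLIST=csem from the SRCFG[csem] definition.
print(job_allowed(creds, {"csem"}))  # False
```

Under that reading of the configuration, job 1617 (class short) does not match the reservation's class list.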
checknode also shows that the current csem reservation was already
active on the node before the job started on it:
Output
checknode nyx337
snip....
csem.1029x1 User -14:10:55 -> 9:49:05 (1:00:00:00)
Blocked Resources at -00:00:44 Procs: 4/4 (100.00%) Mem: 0/3901
(0.00%) Swap: 0/7710 (0.00%) Disk: 0
snip...
1617x1 Job:Running -2:56:30 -> 7:58:30 (10:55:00)
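
The relative times in the two reservation lines above can be checked with a short Python sketch. It assumes the offsets use a [days:]hours:minutes:seconds layout relative to "now", which matches the 1:00:00:00 duration shown for the PERIOD=DAY reservation; to_seconds is a hypothetical helper, not a Moab tool.

```python
def to_seconds(text):
    """Convert a '[-][DD:]HH:MM:SS' offset into signed seconds."""
    sign = -1 if text.startswith("-") else 1
    parts = [int(p) for p in text.lstrip("+-").split(":")]
    # Left-pad so the list is always [days, hours, minutes, seconds].
    days, hours, minutes, seconds = [0] * (4 - len(parts)) + parts
    return sign * (((days * 24 + hours) * 60 + minutes) * 60 + seconds)


# Start and end offsets copied from the checknode output for nyx337.
sr_start, sr_end = to_seconds("-14:10:55"), to_seconds("9:49:05")
job_start, job_end = to_seconds("-2:56:30"), to_seconds("7:58:30")

# Two intervals overlap when the later start precedes the earlier end.
overlaps = max(sr_start, job_start) < min(sr_end, job_end)

# How long the csem reservation had been active when job 1617 started.
lead = job_start - sr_start

print(overlaps)  # True
print(lead)      # 40465 seconds, i.e. 11:14:25
```

So by these numbers the csem reservation had been active on nyx337 for roughly 11 hours and 14 minutes when job 1617 started.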
This job should not be on this node. What could be causing this problem?
Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985