[Mauiusers] insufficient idle procs available ?
Itay M
itaym.tau at gmail.com
Mon Jan 21 14:19:12 MST 2008
Hi,
A user with a high QOS submits a job, but the job then sits in the Idle
state. There are 11 procs available and some 20 other jobs in the Q
state with lower priority, yet the job (id 191803) does not start.
It can take a very long time until the job starts, sometimes more than
an hour. I think it only starts when a running job has ended, and only
then does the high-QOS job finally get into R status. But I'm having
some trouble confirming this theory.
The question is: there are 11 procs available, so why doesn't the job
start immediately? It only needs one proc, and there are 11 free procs,
but checkjob says 'insufficient idle procs available: 0 < 1'.
Some details:
Running checkjob on the job gives:
//================//
checking job 191803
State: Idle
...
...
Flags: RESTARTABLE
PE: 1.00 StartPriority: 1004
job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 1)
//================//
showq shows that there are plenty of procs available (88 - 77 = 11 free):
//================//
showq
59 Active Jobs 77 of 88 Processors Active (87.50%)
//================//
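To make the mismatch explicit, here is a minimal Python sketch that pulls the numbers out of the two lines quoted above. It is not part of Maui; the function names are made up for illustration, and the regular expressions assume the output looks exactly like the lines in this post.

```python
import re
from typing import Tuple


def free_procs_from_showq(summary: str) -> int:
    """Return (total - active) processors from a showq summary line."""
    m = re.search(r"(\d+)\s+of\s+(\d+)\s+Processors Active", summary)
    if m is None:
        raise ValueError("no processor summary found in showq line")
    active, total = int(m.group(1)), int(m.group(2))
    return total - active


def idle_procs_from_checkjob(reason: str) -> Tuple[int, int]:
    """Return (available, required) from a checkjob rejection line."""
    m = re.search(
        r"insufficient idle procs available:\s*(\d+)\s*<\s*(\d+)", reason
    )
    if m is None:
        raise ValueError("no 'insufficient idle procs' reason found")
    return int(m.group(1)), int(m.group(2))


# Lines copied from the showq and checkjob output in this post.
showq_line = "59 Active Jobs 77 of 88 Processors Active (87.50%)"
checkjob_line = (
    "job cannot run in partition DEFAULT "
    "(insufficient idle procs available: 0 < 1)"
)

free = free_procs_from_showq(showq_line)
available, required = idle_procs_from_checkjob(checkjob_line)
print(free, available, required)  # prints: 11 0 1
```

So showq counts 11 processors that are not active, while checkjob sees 0 idle processors available to this job that needs 1.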
showres makes it look as if every running job holds a reservation:
//================//
showres
...
...
59 reservations located
//================//
showbf -A shows that there is no backfill window:
//================//
showbf -A
backfill window (user: '[ALL]' group: '[ALL]' partition: ALL) Mon Jan 21 23:04:45
no procs available
//================//
And finally, diagnose -p shows that this job has a higher priority
than the other jobs in the Idle state:
//================//
[root at biocluster ~]$ diagnose -p
diagnosing job priority information (partition: ALL)
Job                    PRIORITY*    Cred(Group:  QOS)    Serv(QTime)
             Weights   --------       1(    1:    2)       1(    1)
191802                     1008    99.3(1000.:  0.0)     0.7(  7.5)
191772                      431     0.0(  0.0:  0.0)   100.0(430.6)
...
...
...
191799                      289     0.0(  0.0:  0.0)   100.0(289.4)
191800                      289     0.0(  0.0:  0.0)   100.0(289.4)
Percent Contribution   --------    21.5( 21.5:  0.0)    78.5( 78.5)
//================//
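As a check on how the top entry in that listing adds up, here is a short Python sketch. It assumes that priority is a plain weighted sum, component weight times the sum of (subcomponent weight times value), and that the figures in parentheses are the subcomponent values; both are assumptions on my part, and the function name is hypothetical.

```python
def weighted_priority(cred_w, group_w, qos_w, serv_w, qtime_w,
                      group, qos, qtime):
    """Return (cred, serv, total) under the assumed weighted-sum model."""
    cred = cred_w * (group_w * group + qos_w * qos)
    serv = serv_w * (qtime_w * qtime)
    return cred, serv, cred + serv


# Weights row from the listing: Cred 1 (Group 1, QOS 2), Serv 1 (QTime 1).
# Values for job 191802: Group 1000, QOS 0.0, QTime 7.5.
cred, serv, total = weighted_priority(
    cred_w=1, group_w=1, qos_w=2, serv_w=1, qtime_w=1,
    group=1000.0, qos=0.0, qtime=7.5,
)

print(total)                          # prints: 1007.5
print(round(100 * cred / total, 1))   # prints: 99.3
print(round(100 * serv / total, 1))   # prints: 0.7
```

Under these assumptions the total of 1007.5 agrees with the displayed priority of 1008 and with the 99.3 / 0.7 split. Note that in this listing the whole credential contribution comes from the Group column; the QOS column reads 0.0.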
So again, the question is: there are 11 procs available, so why doesn't
the job start immediately? It only needs one proc, and there are 11 free
procs, but checkjob says 'insufficient idle procs available: 0 < 1'.
Thanks,
Itay.