[Mauiusers] insufficient idle procs available ?
Jan Ploski
Jan.Ploski at offis.de
Tue Jan 29 15:26:05 MST 2008
Itay M wrote:
> Here is the diagnose -j on these two jobs that are running on node28:
(looks as expected: 1 Proc and 1 class initializer from 'heavy' consumed
for each job)
> And here is the checkjob -v on these two jobs:
(also nothing strange in there)
> what does the 0:4 means?
It means that, according to Maui, 0 out of 4 processors are available...
> Could this be related to the way in which the user is running the job
> itself (the one that qsub runs) ?
> Or should I check something in the nodes? something related to load
> average? else?
...It may well be related to the load average. Are these
2 jobs multithreaded? Is the load ~4 when it should be ~2? This earlier
message may explain what is happening in your cluster:
http://www.clusterresources.com/pipermail/mauiusers/2002-February/000074.html
See also the description of NODEAVAILABILITYPOLICY in the Maui manual,
especially the default setting which says:
"The node is considered busy if either dedicated or utilized resources
equal or exceed configured resources"
So maybe Maui is simply not starting jobs because it correctly detects
that the processors are overcommitted. Perhaps your 'heavy' job users
should require nodes=1:ppn=2 if they are indeed causing load+=2 each.
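
To make the quoted rule concrete, here is a rough Python sketch of how a
"dedicated or utilized" check could arrive at 0:4. This is not Maui's
actual code: the Node class, the available_procs function, and the use of
the rounded-up load average as a stand-in for utilized processors are all
assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of a node; field names are illustrative only."""
    configured_procs: int   # processors configured on the node
    dedicated_procs: int    # processors reserved by running jobs
    load_average: float     # observed load, assumed to approximate utilized procs


def available_procs(node: Node) -> int:
    """Processors still free if the node counts as busy when either
    dedicated or utilized resources reach the configured amount."""
    utilized = math.ceil(node.load_average)  # assumption: load ~ busy procs
    busy = max(node.dedicated_procs, utilized)
    return max(node.configured_procs - busy, 0)


# Two single-proc jobs that each generate a load of about 2.
node28 = Node(configured_procs=4, dedicated_procs=2, load_average=4.0)
print(f"{available_procs(node28)}:{node28.configured_procs}")
```

With the same two jobs each requesting 2 processors, dedicated would be 4
and the scheduler's accounting would agree with what the load shows.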
> BTW, almost all of our jobs have the 'WARNING: job '{job_id}' utilizes
> more memory than dedicated (xxxx > 512) . Should I change the default
> memory assigned for the jobs? Currently the default is 512MB.
Well, apparently your 'heavy' jobs are consuming much more memory than
that. I think this may also be a reason why the new jobs are not getting
started - if they are requesting 512 MB, but there is not enough free
memory left. If you increase this requirement, Maui is likely to become
even more conservative about starting jobs. But this may be a sensible
thing to do - you don't want your jobs to overcommit memory and the node
to start swapping (check vmstat output on the node to see whether it is
already swapping/paging).
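
If you want to automate the vmstat check, something like the following
Python sketch might do. It assumes the usual Linux vmstat layout with "si"
and "so" (swap-in/swap-out) columns; the function names and the SAMPLE text
are made up for illustration and are not output from your node.

```python
from typing import List, Tuple

# Illustrative text only, shaped like the output of "vmstat 1 2" on Linux.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 4  0 204800  10240   2048  51200    0    0    10    12  100  200 50  5 45  0
 5  1 307200   8192   2048  40960  120  340   900  1500  300  400 60 10 10 20
"""


def swap_activity(vmstat_text: str) -> List[Tuple[int, int]]:
    """Return (si, so) pairs for every data row in vmstat output."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # Locate the column header row by looking for the si/so column names.
    header_idx = next(i for i, cols in enumerate(rows)
                      if "si" in cols and "so" in cols)
    header = rows[header_idx]
    si, so = header.index("si"), header.index("so")
    return [(int(r[si]), int(r[so]))
            for r in rows[header_idx + 1:] if len(r) == len(header)]


def is_swapping(vmstat_text: str) -> bool:
    """True if any sample after the first shows swap-in or swap-out."""
    samples = swap_activity(vmstat_text)
    if len(samples) > 1:
        samples = samples[1:]  # first row reports averages since boot
    return any(si > 0 or so > 0 for si, so in samples)


print(is_swapping(SAMPLE))
```

Non-zero si/so values that persist over several samples are the thing to
look for; an occasional blip is usually harmless.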
Regards,
Jan Ploski