[torqueusers] limiting resource usage with torque

Arnau Bria listsarnau at gmail.com
Thu Dec 15 04:28:48 MST 2011


Hi all,

We have been testing how to limit and request resource usage with torque.
The official doc, and some other docs I found on the net, say that defining
resources_max at queue level is enough for limiting resource usage:

* page 62 of torque doc v 3.0.0
resources_max
Specifies the maximum resource limits for jobs submitted to the queue

So, we did something like:

resources_max.vmem=6gb

Also, after configuring 'size [fs=/home]' on all nodes, we added a
default resource request (free disk space) at submitfilter level:

 line="#PBS -l file=30gb -c n"
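
To make the submit filter step concrete, here is a minimal sketch in Python. It assumes the filter receives the job script on stdin and writes the (possibly modified) script to stdout; the function names are hypothetical, and the placement of the directive right after the shebang is my own choice, not something taken from our actual filter.

```python
import io

# Directive quoted in this post; everything else here is illustrative.
DEFAULT_DIRECTIVE = "#PBS -l file=30gb -c n"

def add_default_directive(script, directive=DEFAULT_DIRECTIVE):
    """Insert the directive after the shebang (if any), before any command."""
    lines = script.splitlines(keepends=True)
    position = 1 if lines and lines[0].startswith("#!") else 0
    if position == 1 and not lines[0].endswith("\n"):
        lines[0] += "\n"  # keep the shebang on its own line
    lines.insert(position, directive + "\n")
    return "".join(lines)

def filter_stream(infile, outfile):
    """Read a job script from infile and write the filtered script to outfile."""
    outfile.write(add_default_directive(infile.read()))

# Demo with in-memory streams; a real filter would pass sys.stdin and sys.stdout.
source = io.StringIO("#!/bin/sh\n#PBS -N demo\necho hello\n")
target = io.StringIO()
filter_stream(source, target)
print(target.getvalue().splitlines()[1])  # #PBS -l file=30gb -c n
```

The sketch adds the directive unconditionally; it does not check whether the user already requested 'file' themselves.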

from the qsub man page:
 -l resource_list
               Defines  the  resources  that are required by the job
and establishes a limit to the amount of resource that can be consumed.

Jobs were submitted with:

    Resource_List.file = 30gb
    Resource_List.neednodes = 1
    Resource_List.nodect = 1
    Resource_List.nodes = 1
    Resource_List.pvmem = 6000mb


This seemed to work fine, but after some jobs started running, we
noticed that nodes were not running all the jobs they were supposed to,
despite being in free state.
E.g., a node with 24gb of mem (PHYS+SWAP) using only 12gb of mem did not
run more than 4 jobs when 8 was its limit. So, if it had free resources,
why was it not running more jobs?

After some debugging we found the cause. MAUI was reserving 6gb of mem
for each job, so 4 jobs * 6gb of mem = 24gb. All the mem was reserved
for those 4 jobs and the node was not selected for running more.
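
The arithmetic above can be written as a small Python sketch. It only models the numbers from this post (8 slots, 24gb of PHYS+SWAP, 6gb dedicated per job); the function name is hypothetical, and this is not a description of how maui computes things internally.

```python
def max_jobs_per_node(slots, node_mem_mb, per_job_mem_mb):
    """Jobs a node can host when per_job_mem_mb is dedicated to each job."""
    by_memory = node_mem_mb // per_job_mem_mb
    return min(slots, by_memory)

# Figures from this post: 8 slots, 24gb (PHYS+SWAP), 6gb reserved per job.
node_mem_mb = 24 * 1024
per_job_mem_mb = 6 * 1024
print(max_jobs_per_node(8, node_mem_mb, per_job_mem_mb))  # 4
```

With a 3gb dedication per job the same node would reach its 8 slots, which is why the per-job limit doubles as a packing constraint.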

from checknode:
[...]
Configured Resources: PROCS: 8  MEM: 15G  SWAP: 23G  DISK: 122G
Utilized   Resources: SWAP: 5048M  DISK: 35G
Dedicated  Resources: PROCS: 4  SWAP: 23G  DISK: 30G
[...]
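
For anyone who wants to check this on their own nodes, here is a rough Python sketch that subtracts the Dedicated line from the Configured line. The parsing is based only on the three lines quoted above, so the format handling (M and G suffixes, two-space separators) is an assumption, and the helper names are hypothetical.

```python
import re

# Scale factors to MB; only the suffixes seen in the excerpt are handled.
UNITS = {"M": 1, "G": 1024}

def parse_resources(line):
    """Turn '... Resources: KEY: VALUE  KEY: VALUE' into a dict of numbers."""
    _, _, rest = line.partition("Resources:")
    result = {}
    for key, number, unit in re.findall(r"(\w+):\s*(\d+)([MG]?)", rest):
        result[key] = int(number) * UNITS.get(unit, 1)
    return result

def remaining(configured, dedicated):
    """Configured minus dedicated, per resource (missing keys count as 0)."""
    return {key: configured[key] - dedicated.get(key, 0) for key in configured}

configured = parse_resources(
    "Configured Resources: PROCS: 8  MEM: 15G  SWAP: 23G  DISK: 122G")
dedicated = parse_resources(
    "Dedicated  Resources: PROCS: 4  SWAP: 23G  DISK: 30G")
free = remaining(configured, dedicated)
print(free["PROCS"], free["SWAP"])  # 4 0
```

Four processors are still free, but no SWAP is left to dedicate, which matches the behaviour we saw.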

And we suppose that something similar is going to happen with the DISK
resource if more jobs start (yep, we have some nodes with low disk
space).


So, did we understand the resources_max parameter and the -l qsub
option correctly? Why does maui make that resource reservation?
Maybe this question should go to the maui list, but to avoid
double-posting (for now): can we avoid maui's reservation of resources?
How are other admins limiting VMEM usage per job?
How can we request that some disk space be available?


Many thanks in advance, and especially to those who read this far ;-)
Cheers,
Arnau

