[torquedev] Patch for Torque MOM to set the locked memory limit
Eygene Ryabinkin
rea+maui at grid.kiae.ru
Sat Aug 18 02:57:23 MDT 2007
Brock, good day.
Fri, Aug 17, 2007 at 12:29:53PM -0400, Brock Palen wrote:
> You don't need this. (the following is credit to garrick)
> All child processes of pbs_mom inherit its limits that pbs_mom was started
> under.
> We use openmpi+ofed+torque+tm all the time, all we do is in
>
> /etc/init.d/pbs_mom
> we added:
>
> ulimit -l 1048576
That is the option I described in my message too, and I am
aware of this hack:
> >Sure, there is the option to set the value via the init.d script
> >and jobs spawned by MOM will inherit the values. But we can do it
> >later on the per-queue basis, so the programmatic way is a bit
> >better.
However, I prefer to keep all pbs_mom-related configuration in its
configuration file. Moreover, this patch will enable the limit
only for the jobs that are started under pbs_mom, but not for the MOM
itself. This can be valuable if one is using an SELinux-like security
enforcement mechanism to prevent MOM from accidentally locking up all
memory (for example, as the result of some vulnerability).
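To illustrate the difference, here is a minimal shell sketch (not the
patch itself, which lives in the MOM's C code). A subshell stands in for
the forked job: the limit is changed only there, just before the exec, so
the launcher keeps its own value. The variable name JOB_MEMLOCK_KB and the
value 32 are my assumptions for the demonstration; the sketch lowers the
soft limit because an unprivileged shell cannot raise it past the hard one.

```shell
#!/bin/sh
# Hypothetical per-job locked-memory limit in KB (illustrative value only).
JOB_MEMLOCK_KB=32

# Limit of the launcher (stands in for pbs_mom) before the job starts.
parent_before=$(ulimit -S -l)

# Command substitution forks a subshell (stands in for the job's child
# process). The limit is changed there, then the "job" is exec'ed and
# inherits it.
job_limit=$(
    ulimit -S -l "$JOB_MEMLOCK_KB"
    exec sh -c 'ulimit -S -l'
)

# The launcher's own limit is untouched by what happened in the child.
parent_after=$(ulimit -S -l)

echo "launcher before: $parent_before"
echo "job:             $job_limit"
echo "launcher after:  $parent_after"
```

With the init.d approach the first and the third lines would also show
the raised value, since the daemon itself runs under it.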
> Before we start the pbs_mom, pbs_mom will start with these limits, and thus
> any process created by the mom (a openmpi job) will also have a limit of 1GB.
>
> Its a pain, but its simple.
To ease the pain we can avoid hardcoding the limits in
/etc/init.d/pbs_mom and place the relevant commands in
/etc/sysconfig/pbs_mom instead. This lets our customizations survive
upgrades. Nevertheless, the patch enables more fine-grained
enforcement: setting the limit only for the batch process itself.
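A rough sketch of the sysconfig variant follows. It assumes the init
script sources /etc/sysconfig/pbs_mom before starting the daemon, which is
the usual Red Hat convention but should be checked against the script
actually installed; the value is the one from Brock's example.

```shell
# --- /etc/sysconfig/pbs_mom (site-local, not overwritten on upgrade) ---
#   ulimit -l 1048576
#
# --- fragment of /etc/init.d/pbs_mom, placed before the daemon start ---
# Assumption: the stock script may already contain an equivalent stanza.
if [ -r /etc/sysconfig/pbs_mom ]; then
    . /etc/sysconfig/pbs_mom
fi
```

Note that this still raises the limit for pbs_mom itself, so it only
addresses the upgrade problem, not the per-job enforcement.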
--
Eygene Ryabinkin, RRC KI