[torqueusers] Handling of double-fork-and-kill detached processes
Ian Stokes-Rees
i.stokes-rees1 at physics.ox.ac.uk
Wed Mar 2 08:57:49 MST 2005
[I just sent this to the GridEngine mailing list, but am also interested
in how Torque (and PBS, if anyone knows) handles this situation]
Hi,
How does Torque deal with processes which detach from their parent
process via the common "double fork and kill" technique? I'm wondering
whether users can start a process which then sticks around even after
the original process group has been killed. We seem to be having this
problem on a current cluster and would like to know whether Torque does
anything "auto-magically" to catch these processes and kill them.
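For anyone unfamiliar with the technique, here is a minimal sketch of what I mean, assuming a POSIX system. The function names and the pipe used to report the grandchild's IDs are my own illustration, not anything Torque provides. The first child starts a new session, forks again and exits, leaving a grandchild that is outside the original process group.

```python
import os
import signal
import time


def spawn_detached(lifetime=30):
    """Double-fork a grandchild that leaves the caller's process group.

    Returns (pid, pgid, sid) of the detached grandchild.
    """
    read_fd, write_fd = os.pipe()
    child = os.fork()
    if child == 0:
        try:
            os.close(read_fd)
            # First child: start a new session and process group.
            os.setsid()
            if os.fork() == 0:
                # Grandchild: drop the inherited terminal and pipes.
                devnull = os.open(os.devnull, os.O_RDWR)
                for fd in (0, 1, 2):
                    os.dup2(devnull, fd)
                # Report our identity to the original process, then linger.
                ids = "%d %d %d" % (os.getpid(), os.getpgrp(), os.getsid(0))
                os.write(write_fd, ids.encode())
                os.close(write_fd)
                time.sleep(lifetime)
        finally:
            # The first child exits straight away (the "kill" step);
            # the grandchild exits once its sleep is over.
            os._exit(0)
    os.close(write_fd)
    os.waitpid(child, 0)  # reap the short-lived first child
    data = os.read(read_fd, 64)
    os.close(read_fd)
    pid, pgid, sid = (int(field) for field in data.decode().split())
    return pid, pgid, sid


def kill_detached(pid):
    """Clean up a grandchild started by spawn_detached()."""
    os.kill(pid, signal.SIGKILL)
```

Signalling the original process group (for example with `os.killpg`) would not reach the grandchild, because its process group and session IDs differ from those of the process that started it.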
Our first idea was to kill all of a user's processes on a node once
their job finished. We then realised that on a dual (or quad) CPU node,
or on a node that is overloaded with processes, the same user may have
more than one legitimate job running on that node at the same time.
Killing everything they own would therefore be a no-no: the other job
would be killed too.
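To make the distinction concrete, here is a rough sketch of selecting processes per job rather than per user. It assumes each process can be tagged with the job that started it, for example via a job ID variable inherited in its environment (a variable that a detached process keeps unless it deliberately clears it). The `Proc` record and `pids_for_job` helper are hypothetical names for illustration only.

```python
from typing import Iterable, List, NamedTuple, Optional


class Proc(NamedTuple):
    pid: int
    user: str
    job_id: Optional[str]  # hypothetical per-job tag; None if untagged


def pids_for_job(procs: Iterable[Proc], user: str, finished_job: str) -> List[int]:
    """Select only the processes belonging to one finished job.

    Processes owned by the same user but tagged with another job,
    or not tagged at all, are left alone.
    """
    return [
        p.pid
        for p in procs
        if p.user == user and p.job_id == finished_job
    ]


# Example: one user with two jobs on the same node.
procs = [
    Proc(101, "alice", "job-1"),
    Proc(102, "alice", "job-2"),
    Proc(103, "alice", "job-1"),  # e.g. a detached leftover from job-1
    Proc(104, "bob", "job-3"),
]
to_kill = pids_for_job(procs, "alice", "job-1")
```

Killing by user alone would also take out process 102, which belongs to the user's second, still-running job.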
Thanks for any suggestions on how this is handled in Torque.
Cheers,
Ian.
--
Ian Stokes-Rees i.stokes-rees at physics.ox.ac.uk
Particle Physics, Oxford http://www-pnp.physics.ox.ac.uk/~stokes