[torquedev] memory leaks in torque-server 2.5.11: question
l.flis at cyf-kr.edu.pl
Tue Jul 3 10:21:32 MDT 2012
We run a medium-sized computing site in Poland.
We process around 25k jobs per day, a mix of grid workloads and multi-node jobs.
We are facing a problem with the long-running pbs_server process, which
after a week or two consumes all the memory available on the machine.
As a result, pbs_server is unable to spawn a subprocess to unmunge credentials:
06/26/2012 15:58:20;0080;PBS_Server;Req;req_reject;Reject reply
code=15012(PBS_Server System error: Inappropriate ioctl for device
MSG=couldn't create pipe to unmunge), aux=0,
type=AlternateUserAuthentication, from qcg-comp at someserver
allocate memory (12) in pipe_and_read_unmunge, Unable to popen command
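
For reference, the "(12)" in the last log line looks like an errno value. A minimal sketch, assuming a Linux host, that decodes it with the Python standard library only (the helper name describe_errno is hypothetical and not part of TORQUE):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, OS message) for a numeric errno value."""
    # errno.errorcode maps numeric codes to names such as "ENOMEM".
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror returns the platform's message text for the code.
    return name, os.strerror(code)


if __name__ == "__main__":
    # 12 is the value shown in the pbs_server log excerpt.
    print(describe_errno(12))
```

On Linux, 12 maps to ENOMEM, which is consistent with the server being unable to start the unmunge subprocess because the machine had run out of memory.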
I took a core dump of the process when it was approaching 4 GB of both RSS and VIRT memory.
My question is: how can I determine from the core file which part of the server is
leaking memory?
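
In case it helps to quantify the growth between restarts, here is a minimal sketch, assuming the Linux /proc/<pid>/status format, of how the virtual size and RSS of pbs_server could be sampled. The function names parse_vm_kb and read_vm_kb are hypothetical; this only measures the growth and does not locate the leak in the core file.

```python
import re


def parse_vm_kb(status_text):
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        # Lines of interest look like "VmRSS:     4100000 kB".
        match = re.match(r"^(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_vm_kb(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_kb(handle.read())


if __name__ == "__main__":
    # Illustrative text only; these numbers are not measurements from our server.
    sample = "Name:\tpbs_server\nVmSize:\t 4194304 kB\nVmRSS:\t 4100000 kB\n"
    print(parse_vm_kb(sample))
```

Calling read_vm_kb with the pbs_server PID at regular intervals (for example from cron) would give a growth curve that can be compared against the job rate.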