Hi,<br>I cleared the job using momctl -c 12361, which reported:<br>>>job clear request successful on localhost<br>But the job was not deleted.<br><div>I then got the following output when I ran momctl -h:<br></div><br><div>root@galaxy
:~# momctl -h node07 -d 1</div><br>Host: node07.cluster2.iitb.ac.in/node07.cluster2.iitb.ac.in Version: 2.1.0p0<br>Server[0]: 192.168.1.1 (connection is active)<br> Init Msgs Received: 0 hellos/1 cluster-addrs<br> Init Msgs Sent: 1 hellos<br> Last Msg From Server: 1 seconds (StatusJob)<br> Last Msg To Server: 30 seconds
<br>HomeDirectory: /usr/spool/PBS/mom_priv<br>MOM active: 360269 seconds<br>Server Update Interval: 45 seconds<br>LOGLEVEL: 0 (use SIGUSR1/SIGUSR2 to adjust)<br>Communication Model: RPP
<br>TCP Timeout: 20 seconds<br>NOTE: no prolog configured<br>Trusted Client List: 192.168.1.106,192.168.1.105,192.168.1.104,192.168.1.103,192.168.1.102,192.168.1.101,192.168.1.1,192.168.1.116,192.168.1.115,192.168.1.114,192.168.1.113,192.168.1.112,192.168.1.111,192.168.1.110,192.168.1.109,192.168.1.108,192.168.1.107,127.0.0.1<br>Configured to use /usr/bin/scp<br>job[12361.galaxy.aero.iitb.ac.in] state=EXITING sidlist=2820<br>job[12851.galaxy.aero.iitb.ac.in] state=RUNNING sidlist=5667<br><div>Assigned CPU Count: 2</div><br><div>diagnostics complete<br><br>Velan<br></div><br><br><div><span class="gmail_quote">On 4/21/07, <b class="gmail_sendername">Chris Samuel</b>
<<a href="mailto:csamuel@vpac.org">csamuel@vpac.org</a>> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div>On Sat, 21 Apr 2007, Vadivelan Ranjith wrote:
</div><br>> Hi<br><div>> Still problem is not solved. Jobs are not deleting.</div><br><div>Did the momctl command to clear that job say anything when you ran it ?</div><br>With both yourself and Adam having the same problem with jobs not getting
<br>deleted after a node reboot it's looking like it could possibly be a Torque<br><div>bug. :-(</div><br><div>Out of interest, on your compute nodes is SE Linux turned on ?</div><br>> But really job 12361 is not running. Our compute nodes are dual
<br>> processors, but currently only one processor is running because of this<br><div>> problem.</div><br><div>What does "momctl -h node07 -d 1" say ?</div><br>cheers,<br>Chris<br>--<br> Christopher Samuel - (03)9925 4751 - VPAC Deputy Systems Manager
<br> Victorian Partnership for Advanced Computing <a href="http://www.vpac.org/">http://www.vpac.org/</a><br><div> Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia</div><br>
</blockquote></div><br>