<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<TITLE>pbs_mom and remote logging</TITLE>
</HEAD>
<BODY>
<P><FONT SIZE=2 FACE="Arial">OK, I cannot get pbs_mom to log everything to the remote syslog server.</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">The ONLY things I see on the remote loghost are startups and shutdowns:</FONT>
<BR><FONT SIZE=2 FACE="Arial">-------------------snip-----------------</FONT>
<BR><FONT SIZE=2 FACE="Arial">Sep 26 11:02:17 n1 pbs_mom: shutdown succeeded</FONT>
<BR><FONT SIZE=2 FACE="Arial">Sep 26 11:02:18 n1 pbs_mom: pbs_mom startup succeeded</FONT>
<BR><FONT SIZE=2 FACE="Arial">-------------------snip-----------------</FONT>
</P>
<P><FONT SIZE=2 FACE="Arial">Whereas the log on the node has plenty of info:</FONT>
<BR><FONT SIZE=2 FACE="Arial">-------------------snip-----------------</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0001; pbs_mom;Job;job_nodes;job: 2180.cluster0.default.domain numnodes=8 numvnod=8</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0002; pbs_mom;n/a;run_pelog;prolog script '/var/torque/mom_priv/prologue.parallel' does not exist (cwd: /var/torque/mom_priv)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0002; pbs_mom;n/a;run_pelog;userprolog script '/var/torque/mom_priv/prologue.user.parallel' does not exist (cwd: /var/torque/mom_priv)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;JOIN JOB as node 7</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;evaluating limits for job</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;do_rpp;got an internal task manager request in do_rpp</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0002; pbs_mom;Svr;im_request;connect from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;received request 'SPAWN_TASK' from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;INFO: received request 'SPAWN_TASK' from 192.168.0.8:1023 for job '2180.cluster0.default.domain' (spawning task on node '0' with taskid=9, globid='none'</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;saving task (IM_SPAWN_TASK)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Svr;task_save;saving task in /var/torque/mom_priv/jobs/2180.cluste.TK/0000000009</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0002; pbs_mom;n/a;mom_close_poll;entered</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0001; pbs_mom;Job;2180.cluster0.default.domain;task set to running/saving task (start_process)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Svr;task_save;saving task in /var/torque/mom_priv/jobs/2180.cluste.TK/0000000009</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:17;0008; pbs_mom;Job;2180.cluster0.default.domain;start_process: task started, tid 9, sid 3001, cmd orted</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0002; pbs_mom;n/a;cput_sum;cput_sum: session=3001 pid=3001 cputime=0 (cputfactor=1.000000)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0008; pbs_mom;Job;scan_for_terminated;for job 2180.cluster0.default.domain, task 9, pid=3001, exitcode=0</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0008; pbs_mom;Job;2180.cluster0.default.domain;sending signal 9 to task</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0008; pbs_mom;Svr;task_save;saving task in /var/torque/mom_priv/jobs/2180.cluste.TK/0000000009</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0080; pbs_mom;Job;2180.cluster0.default.domain;scan_for_terminated: job 2180.cluster0.default.domain task 9 terminated, sid 3001</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:18;0008; pbs_mom;Svr;task_save;saving task in /var/torque/mom_priv/jobs/2180.cluste.TK/0000000009</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:20;0008; pbs_mom;Job;do_rpp;got an internal task manager request in do_rpp</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:20;0002; pbs_mom;Svr;im_request;connect from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:20;0008; pbs_mom;Job;2180.cluster0.default.domain;received request 'POLL_JOB' from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;composing status update for server</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "opsys=linux"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "uname=Linux n1 2.6.9-55.0.6.ELsmp #1 SMP Thu Aug 23 11:13:21 EDT 2007 x86_64"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "sessions=? 15201"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "nsessions=? 15201"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "nusers=0"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "idletime=801"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;totmem;totmem: total mem=8548777984</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "totmem=8348416kb"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;availmem;availmem: free mem=7080996864</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "availmem=6915036kb"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "physmem=2050944kb"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "ncpus=2"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "loadave=1.00"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "netload=1642141202"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "state=free"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;setting alarm in is_update_stat</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;is_update_stat: sending to server "jobs=2180.cluster0.default.domain"</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:21;0002; pbs_mom;n/a;is_update_stat;status update successfully sent to cluster0</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;do_rpp;got an internal task manager request in do_rpp</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0002; pbs_mom;Svr;im_request;connect from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;2180.cluster0.default.domain;received request 'KILL_JOB' from 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0100; pbs_mom;Job;2180.cluster0.default.domain;kill_job received</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;2180.cluster0.default.domain;im_request: KILL_JOB 2180.cluster0.default.domain node 192.168.0.8:1023</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;2180.cluster0.default.domain;kill_job</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0002; pbs_mom;n/a;run_pelog;userepilog script '/var/torque/mom_priv/epilogue.precancel' does not exist (cwd: /var/torque/mom_priv)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;2180.cluster0.default.domain;kill_job done</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0002; pbs_mom;n/a;run_pelog;epilog script '/var/torque/mom_priv/epilogue.parallel' does not exist (cwd: /var/torque/mom_priv)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0002; pbs_mom;n/a;run_pelog;userepilog script '/var/torque/mom_priv/epilogue.user.parallel' does not exist (cwd: /var/torque/mom_priv)</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0008; pbs_mom;Job;2180.cluster0.default.domain;all tasks complete - purging job as sister</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0080; pbs_mom;Job;2180.cluster0.default.domain;removing job</FONT>
<BR><FONT SIZE=2 FACE="Arial">09/26/2007 11:05:23;0080; pbs_mom;Job;2180.cluster0.default.domain;removed job file</FONT>
<BR><FONT SIZE=2 FACE="Arial">-------------------snip-----------------</FONT>
</P>
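<P><FONT SIZE=2 FACE="Arial">For reference, each line in the node-local log above appears to be a semicolon-delimited record: timestamp, a four-digit code, daemon name, object class, object name, and message. Below is a minimal Python sketch of splitting one such line, in case it helps anyone post-process these logs. Treating the code as hexadecimal is my assumption, and the names MomLogRecord and parse_mom_log_line are made up for illustration; they are not part of TORQUE.</FONT>
</P>
<PRE>
```python
from typing import NamedTuple


class MomLogRecord(NamedTuple):
    """One node-local pbs_mom log line (hypothetical helper type)."""
    timestamp: str
    event_code: int
    daemon: str
    obj_class: str
    obj_name: str
    message: str


def parse_mom_log_line(line):
    """Split a log line into its six semicolon-separated fields.

    Only the first five semicolons are treated as separators, so any
    semicolons inside the message text are kept intact.
    """
    parts = line.rstrip("\n").split(";", 5)
    if len(parts) != 6:
        raise ValueError("not a six-field log line: %r" % line)
    timestamp, code, daemon, obj_class, obj_name, message = parts
    return MomLogRecord(
        timestamp.strip(),
        int(code, 16),  # assumption: the four-digit code is hexadecimal
        daemon.strip(),
        obj_class.strip(),
        obj_name.strip(),
        message,
    )


# Sample line copied from the log excerpt above.
sample = ("09/26/2007 11:05:17;0008; pbs_mom;Job;"
          "2180.cluster0.default.domain;JOIN JOB as node 7")
record = parse_mom_log_line(sample)
print(record.obj_class, record.obj_name, record.message)
```
</PRE>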
<P><FONT SIZE=2 FACE="Arial">Is there something special to set in the mom_priv/config to ensure all messages are sent to syslog?</FONT>
</P>
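<P><FONT SIZE=2 FACE="Arial">If there is no such setting, one possible stopgap (not a pbs_mom or mom_priv/config option) would be to forward the node-local log lines to the loghost from a small script. Here is a rough Python sketch using the standard logging.handlers.SysLogHandler. It assumes Python is available on the node and that the loghost accepts syslog over UDP on port 514; the function names and the logger name are placeholders I made up.</FONT>
</P>
<PRE>
```python
import logging
import logging.handlers


def to_syslog_message(line):
    """Drop the local timestamp (syslog adds its own) and keep the rest."""
    text = line.rstrip("\n")
    timestamp, sep, rest = text.partition(";")
    return rest.strip() if sep else text.strip()


def build_remote_logger(host, port=514):
    """Create a logger that sends records to a remote syslog daemon.

    Assumption: the remote daemon listens on UDP, which is the
    SysLogHandler default when given a (host, port) address.
    """
    logger = logging.getLogger("pbs_mom_forward")  # placeholder name
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        facility=logging.handlers.SysLogHandler.LOG_DAEMON,
    )
    handler.setFormatter(logging.Formatter("pbs_mom: %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


def forward(lines, logger):
    """Send every non-empty line to the logger; return how many were sent."""
    sent = 0
    for line in lines:
        if line.strip():
            logger.info(to_syslog_message(line))
            sent += 1
    return sent


# Sample line copied from the log excerpt above; nothing is sent here.
sample = ("09/26/2007 11:05:23;0100; pbs_mom;Job;"
          "2180.cluster0.default.domain;kill_job received")
print(to_syslog_message(sample))
```
</PRE>
<P><FONT SIZE=2 FACE="Arial">In use, something like forward(sys.stdin, build_remote_logger("loghost")) could be fed by tail -F on the current file under mom_logs, where "loghost" stands in for the real syslog server name.</FONT>
</P>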
<P><B><FONT COLOR="#000080" SIZE=2 FACE="Arial">Brian Andrus</FONT> <FONT COLOR="#FF0000" SIZE=2 FACE="Arial">perot</FONT><FONT SIZE=2 FACE="Arial">systems<BR>
</FONT><FONT COLOR="#000080" SIZE=2 FACE="Arial">Site Manager | Sr. Computer Scientist<BR>
Naval Research Lab<BR>
</FONT></B><FONT SIZE=2 FACE="Arial">7 Grace Hopper Ave, Monterey, CA 93943<BR>
Phone (831) 656-4839 | Fax (831) 656-4866<BR>
</FONT>
</P>
</BODY>
</HTML>