<div>The output of qstat:</div><div><b><br></b></div><div><b>[mpiX@master mpi_fitting]$ qstat </b></div><div>Job id                    Name             User            Time Use S Queue</div><div>------------------------- ---------------- --------------- -------- - -----</div>
<div>46.master                 mpi_fitting      mpiX            00:00:00 R batch   </div><div><br></div><div><br></div><div>I will ask the administrator for permission to view the syslog (/var/log/messages).</div>
<div><br></div><div><br><div class="gmail_quote">On Tue, Sep 28, 2010 at 10:04 AM, Ken Nielson <span dir="ltr">&lt;<a href="mailto:knielson@adaptivecomputing.com">knielson@adaptivecomputing.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">


  

<div bgcolor="#ffffff" text="#000000"><div><div></div><div class="h5">
On 09/28/2010 08:57 AM, Abraham Zamudio wrote:
</div></div><blockquote type="cite"><div><div></div><div class="h5">
  
  <font color="#eeffe2" face="arial, sans-serif"><span style="border-collapse:collapse"><font color="#000000"><span style="font-size:large">Hi everybody,</span></font></span></font>
  <div><font color="#eeffe2" face="arial, sans-serif"><span style="border-collapse:collapse"><font color="#000000"><span style="font-size:large"><br>
  </span></font></span></font></div>
  <div><font color="#eeffe2" face="arial, sans-serif" size="4"><span style="border-collapse:collapse;font-size:15px"><font color="#000000"><span style="border-collapse:separate;font-size:13.3333px;color:rgb(136, 136, 136)">
  <div style="padding-top:9px;padding-right:16px">
  <div dir="ltr"><span style="display:block"><span title=""><span style="background-color:rgb(255, 255, 255)"><font color="#000000"><span style="font-size:large">I have a problem with one of my nodes:</span></font></span></span></span><span style="display:block"><span title=""><span style="background-color:rgb(255, 255, 255)"><font color="#000000"><span style="font-size:large"><br>

  </span></font></span></span></span><span style="display:block"><span title=""><span style="background-color:rgb(255, 255, 255)"><font color="#000000"><span style="font-size:large"><br>
  </span></font></span></span></span><span style="display:block"><span title=""><span style="background-color:rgb(255, 255, 255)"><font color="#000000"><span style="font-size:large"><span style="display:block"><span style="font-size:x-small"><b>[mpiX@quad2 ~]$ cat
/var/spool/torque/mom_logs/20100928 | grep 46.master</b></span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0008;   pbs_mom;Job;46.master;JOIN JOB as node 1</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0001;   pbs_mom;Job;46.master;task not started, &#39;/bin/sh&#39;,
stdio setup failed (see syslog)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0008;   pbs_mom;Job;46.master;ERROR:    received request
&#39;SPAWN_TASK&#39; from <a href="http://10.10.10.3:1023" target="_blank">10.10.10.3:1023</a> for job &#39;46.master&#39;
(cannot start task)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0001;   pbs_mom;Job;46.master;task not started, &#39;/bin/sh&#39;,
stdio setup failed (see syslog)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0008;   pbs_mom;Job;46.master;ERROR:    received request
&#39;SPAWN_TASK&#39; from <a href="http://10.10.10.3:1023" target="_blank">10.10.10.3:1023</a> for job &#39;46.master&#39;
(cannot start task)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0001;   pbs_mom;Job;46.master;task not started, &#39;/bin/sh&#39;,
stdio setup failed (see syslog)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0008;   pbs_mom;Job;46.master;ERROR:    received request
&#39;SPAWN_TASK&#39; from <a href="http://10.10.10.3:1023" target="_blank">10.10.10.3:1023</a> for job &#39;46.master&#39;
(cannot start task)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0001;   pbs_mom;Job;46.master;task not started, &#39;/bin/sh&#39;,
stdio setup failed (see syslog)</span></span><span style="display:block"><span style="font-size:x-small">09/28/2010
09:29:29;0008;   pbs_mom;Job;46.master;ERROR:    received request
&#39;SPAWN_TASK&#39; from <a href="http://10.10.10.3:1023" target="_blank">10.10.10.3:1023</a> for job &#39;46.master&#39;
(cannot start task)</span></span>
  <div><br>
  </div>
  <div>The job's status is active:</div>
  <div><br>
  </div>
  <div>
  <div><span style="font-size:x-small"><b>[mpiX@master
mpi_fitting]$ showq</b></span></div>
  <div><span style="font-size:x-small">ACTIVE
JOBS--------------------</span></div>
  <div><span style="font-size:x-small">JOBNAME
           USERNAME      STATE  PROC   REMAINING            STARTTIME</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">46  
                  mpiX    Running    12    00:35:52  Tue Sep 28 09:32:56</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">    
1 Active Job       12 of   12 Processors Active (100.00%)</span></div>
  <div><span style="font-size:x-small">    
                    2 of    2 Nodes Active      (100.00%)</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">IDLE
JOBS----------------------</span></div>
  <div><span style="font-size:x-small">JOBNAME
           USERNAME      STATE  PROC     WCLIMIT            QUEUETIME</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">0
Idle Jobs</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">BLOCKED
JOBS----------------</span></div>
  <div><span style="font-size:x-small">JOBNAME
           USERNAME      STATE  PROC     WCLIMIT            QUEUETIME</span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small"><br>
  </span></div>
  <div><span style="font-size:x-small">Total
Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0</span></div>
  <div><br>
  </div>
  <div>The same software (mpich2+gsl) runs on a single 8-core node.
The problem occurs when two nodes are used.</div>
  </div>
  <div><br>
  </div>
  </span></font></span></span></span></div>
  </div>
  </span></font></span></font></div>
  <div><font color="#eeffe2" face="arial, sans-serif" size="4"><span style="border-collapse:collapse;font-size:15px"><font color="#000000"><br>
  </font></span></font></div>
  <div><font color="#eeffe2" face="arial, sans-serif" size="4"><span style="border-collapse:collapse;font-size:15px"><font color="#000000"><br>
  </font></span></font>-- <br>
Abraham Zamudio Ch.<br>
  <br>
  </div>
  </div></div><pre><fieldset></fieldset>
_______________________________________________
torqueusers mailing list
<a href="mailto:torqueusers@supercluster.org" target="_blank">torqueusers@supercluster.org</a>
<a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a>
  </pre>
</blockquote>
What does qstat show? Did you look at syslog?<br>
<br>
Ken Nielson<br>
Adaptive Computing<br>
</div>

<br></blockquote></div><br><br clear="all"><br>-- <br>Abraham Zamudio Ch.<br><br>
</div>