<div dir="ltr"><div>Martin,<br><br></div>We&#39;re sorry this wasn&#39;t caught beforehand. If you apply this patch to 4.2.6, it will work with the older moms:<br><br>4f9245b05bb0a296bbfacfca68c6807c6ddb1c39<br></div><div class="gmail_extra">
<br><br><div class="gmail_quote">On Wed, Nov 20, 2013 at 3:53 PM, Martin Siegert <span dir="ltr">&lt;<a href="mailto:siegert@sfu.ca" target="_blank">siegert@sfu.ca</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
We ran into exactly the same problem: we needed to upgrade the server<br>
right away because of the security hole (CVE-2013-4495). We then<br>
planned rolling updates on all compute nodes. As soon as we had<br>
updated the server, the moms on the compute nodes died with segmentation<br>
faults :-(<br>
Because of the security hole, updating the server last is not really<br>
an option.<br>
<br>
Cheers,<br>
Martin<br>
<div class="im"><br>
On Wed, Nov 20, 2013 at 03:37:40PM -0700, Ken Nielson wrote:<br>
&gt;<br>
&gt;    If you are doing rolling upgrades, keep your pbs_server at the earlier<br>
&gt;    version and upgrade the pbs_moms to 4.2.6. After all of your MOMs have<br>
&gt;    been upgraded, move pbs_server to 4.2.6.<br>
&gt;    This has been fixed and will be available in all upcoming TORQUE<br>
&gt;    releases.<br>
&gt;    Regards<br>
&gt;<br>
&gt;    On Wed, Nov 20, 2013 at 3:01 PM, Rick McKay<br>
</div><div class="im">&gt;    &lt;[1]<a href="mailto:rmckay@adaptivecomputing.com">rmckay@adaptivecomputing.com</a>&gt; wrote:<br>
&gt;<br>
&gt;    Eva,<br>
&gt;    That&#39;s a defect. As soon as you upgrade your MOMs to 4.2.6, they&#39;ll<br>
&gt;    start jobs. It&#39;s marked for correction in 4.2.7. Here&#39;s the changeset<br>
&gt;    hash:<br>
&gt;    345daa2..e3fb235 HEAD -&gt; 4.2-dev<br>
&gt;    Rick McKay | Technical Support Engineer<br>
&gt;    Adaptive Computing<br>
&gt;<br>
</div><div><div class="h5">&gt;    On Wed, Nov 20, 2013 at 2:54 PM, Eva Hocks &lt;[2]<a href="mailto:hocks@sdsc.edu">hocks@sdsc.edu</a>&gt; wrote:<br>
&gt;<br>
&gt;      torque server 4.2.6 cannot start jobs on moms running 4.2.5 due to<br>
&gt;      Undefined attribute  (15002) in send_job_work? Is this an expected<br>
&gt;      behavior?<br>
&gt;      11/20/2013 13:42:07;0040;PBS_Server.17972;Req;set_nodes;allocating<br>
&gt;      nodes for job 206074.mskcc-fe1.local with node expression<br>
&gt;      &#39;gpu-1-4:ppn=10&#39;<br>
&gt;      11/20/2013 13:42:07;0040;PBS_Server.17972;Req;set_nodes;job<br>
&gt;      206074.mskcc-fe1.local allocated 1 nodes<br>
&gt;      (nodelist=gpu-1-4/0+gpu-1-4/1+gpu-1-4/2+gpu-1-4/3+gpu-1-4/4+gpu-1-4/<br>
&gt;      5+gpu-1-4/6+gpu-1-4/7+gpu-1-4/8+gpu-1-4/9)<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;Job Run at<br>
&gt;      request of root@mskcc-fe1.local<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;svr_setjobstate;svr_setjobstate:<br>
&gt;      setting job 206074.mskcc-fe1.local state from QUEUED-QUEUED to<br>
&gt;      RUNNING-PRERUN (4-40)<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;send of<br>
&gt;      job to gpu-1-4 failed error = 15002<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0001;PBS_Server.17972;Svr;PBS_Server;LOG_ERROR::Undefined<br>
&gt;      attribute  (15002) in send_job_work, child failed in previous commit<br>
&gt;      request for job 206074.mskcc-fe1.local<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;entering<br>
&gt;      finish_sendmom<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0002;PBS_Server.17972;Job;206074.mskcc-fe1.local;child<br>
&gt;      reported failure for job after 0 seconds (dest=???), rc=-1<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;unable to<br>
&gt;      run job, MOM rejected/rc=-1<br>
&gt;      11/20/2013 13:42:07;0040;PBS_Server.17972;Req;free_nodes;freeing<br>
&gt;      nodes for job 206074.mskcc-fe1.local<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;svr_setjobstate;svr_setjobstate:<br>
&gt;      setting job 206074.mskcc-fe1.local state from RUNNING-TRNOUT to<br>
&gt;      QUEUED-QUEUED (1-10)<br>
&gt;      11/20/2013 13:42:07;0040;PBS_Server.17972;Req;free_nodes;freeing<br>
&gt;      nodes for job 206074.mskcc-fe1.local<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;unable to<br>
&gt;      run job, send to MOM &#39;183508983&#39; failed<br>
&gt;      11/20/2013<br>
&gt;      13:42:07;0008;PBS_Server.17972;Job;svr_setjobstate;svr_setjobstate:<br>
&gt;      setting job 206074.mskcc-fe1.local state from QUEUED-QUEUED to<br>
&gt;      QUEUED-QUEUED (1-10)<br>
&gt;      Thanks<br>
&gt;      Eva<br>
&gt;      _______________________________________________<br>
&gt;      torqueusers mailing list<br>
</div></div>&gt;      [3]<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
&gt;      [4]<a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
&gt;<br>
&gt;      _______________________________________________<br>
&gt;      torqueusers mailing list<br>
&gt;      [5]<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
&gt;      [6]<a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
<div class="im">&gt;<br>
&gt;    --<br>
&gt;    Ken Nielson<br>
&gt;    <a href="tel:%2B1%20801.717.3700" value="+18017173700">+1 801.717.3700</a> office <a href="tel:%2B1%20801.717.3738" value="+18017173738">+1 801.717.3738</a> fax<br>
&gt;    1712 S. East Bay Blvd, Suite 300  Provo, UT  84606<br>
</div>&gt;    [7]<a href="http://www.adaptivecomputing.com" target="_blank">www.adaptivecomputing.com</a><br>
&gt;<br>
&gt; References<br>
&gt;<br>
&gt;    1. mailto:<a href="mailto:rmckay@adaptivecomputing.com">rmckay@adaptivecomputing.com</a><br>
&gt;    2. mailto:<a href="mailto:hocks@sdsc.edu">hocks@sdsc.edu</a><br>
&gt;    3. mailto:<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
&gt;    4. <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
&gt;    5. mailto:<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
&gt;    6. <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
&gt;    7. <a href="http://www.adaptivecomputing.com/" target="_blank">http://www.adaptivecomputing.com/</a><br>
<div class="HOEnZb"><div class="h5"><br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br><div>David Beer | Senior Software Engineer</div><div>Adaptive Computing</div>
</div>
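<div>For anyone triaging the same failure, the server log excerpt quoted above can be scanned mechanically. What follows is only a minimal sketch in Python. It assumes each record sits on one line with six semicolon-separated fields, a layout inferred from the quoted lines rather than taken from TORQUE documentation. The names <code>parse_line</code> and <code>failed_sends</code> are hypothetical helpers, not part of TORQUE.</div>
<pre>
```python
import re

# Hypothetical helpers; the six-field layout is an assumption inferred from
# the log lines quoted in this thread, not from TORQUE documentation.
FIELDS = ("timestamp", "event", "daemon", "kind", "name", "message")
SEND_FAILED = re.compile(r"send of job to (\S+) failed error = (\d+)")


def parse_line(line):
    """Return a dict of the six fields, or None if the line does not fit."""
    parts = line.strip().split(";", 5)
    if len(parts) != len(FIELDS):
        return None
    return dict(zip(FIELDS, parts))


def failed_sends(lines):
    """Return (job id, node, error code) for every failed send-to-MOM record."""
    hits = []
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        match = SEND_FAILED.search(record["message"])
        if match:
            hits.append((record["name"], match.group(1), int(match.group(2))))
    return hits


# Two records copied from the quoted log, with the mail line wrapping undone.
sample = [
    "11/20/2013 13:42:07;0008;PBS_Server.17972;Job;206074.mskcc-fe1.local;"
    "send of job to gpu-1-4 failed error = 15002",
    "11/20/2013 13:42:07;0040;PBS_Server.17972;Req;free_nodes;"
    "freeing nodes for job 206074.mskcc-fe1.local",
]

for job, node, code in failed_sends(sample):
    print(f"{job}: send to {node} failed with error {code}")
```
</pre>
<div>In the situation described in this thread, nodes reported with error 15002 would be candidates for MOMs still running 4.2.5 against a 4.2.6 server. Other causes of the same error code are not ruled out by this check.</div>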