<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <div class="moz-cite-prefix"><br>
      We are using libc 2.5.<br>
      <br>
      The stack size for our host is set to 1600000 KB.&nbsp; With 2
      threads, would pbs_mom then use ~3 GB?<br>
      <br>
      stack size&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; (kbytes, -s) 1600000<br>
      <br>
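      A rough check of the arithmetic behind this question, as a minimal
      Python sketch. It assumes each thread reserves the full ulimit -s
      value and that the limit is counted in 1024-byte units (the quoted
      messages use a factor of 1000); the function name is hypothetical.<br>
      <pre>
```python
KIB = 1024  # assumption: bash reports `ulimit -s` in units of 1024 bytes


def estimated_stack_bytes(ulimit_s_kb, thread_count):
    """Estimate total stack memory if every thread gets the full ulimit."""
    return ulimit_s_kb * KIB * thread_count


# Values taken from this message: ulimit -s of 1600000 and 2 threads.
total = estimated_stack_bytes(1600000, 2)
print(f"{total} bytes, about {total / 1024**3:.2f} GiB")
```
</pre>
      For 1600000 KB and 2 threads this comes to about 3.05 GiB, in line
      with the ~3G that top reports for pbs_mom.<br>
      <br>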
      <br>
      Is there a workaround for this problem?&nbsp; If not, is upgrading to 4.2.6.1
      or 4.2.7 recommended?<br>
      <br>
      Thanks.<br>
      <br>
      Steven.<br>
      <br>
      <br>
      On 12/06/2013 10:55 AM, Ken Nielson wrote:<br>
    </div>
    <blockquote
cite="mid:CADvLK3f-VX0wE1Nwd5CSNLcx_1gFrzqrSCEu19OWFEW5Ssp88w@mail.gmail.com"
      type="cite">
      <div dir="ltr">
        <div>My apologies.&nbsp; I have been doing more testing and it
          appears libc 2.18 still has the bug.&nbsp; I will be reporting this
          to GNU, and hopefully they will have a fix soon.<br>
          <br>
        </div>
        Again my apologies for jumping the gun.<br>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Fri, Dec 6, 2013 at 11:23 AM, Ken
          Nielson <span dir="ltr">&lt;<a moz-do-not-send="true"
              href="mailto:knielson@adaptivecomputing.com"
              target="_blank">knielson@adaptivecomputing.com</a>&gt;</span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div dir="ltr">
              <div>
                <div>
                  <div>It is nice to have confirmation of what we found
                    about pthread_attr_setstacksize. As you said,
                    the default stack size of a new thread is the value
                    of ulimit -s * 1000. Currently the TORQUE code tries
                    to set a minimum stack size but it does not check
                    for a maximum. So if you have your stack size set to
                    300,000 you will get 300,000,000 bytes allocated for
                    each thread. On the mom there are two threads so you
                    get 600 MB allocated. On the server it gets huge. <br>
                    <br>
                    The bug in libc is that the stack size set with
                    pthread_attr_setstacksize is ignored. My development
                    box is running libc version 2.15 and the bug is
                    still there. However, I installed libc 2.18 on a
                    CentOS 6 box and the bug appears to be fixed.&nbsp; I am
                    going to modify the MOM code to also set a maximum
                    stack size.<br>
                  </div>
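                  <div>The missing upper bound can be sketched as a
                    simple clamp. This is a minimal Python illustration,
                    not the TORQUE code: the function name and both
                    bounds are hypothetical, and the 1024 multiplier
                    assumes ulimit -s is counted in 1024-byte units.<br>
                    <pre>
```python
def clamp_stack_size(requested, minimum, maximum):
    """Raise requested to minimum, or lower it to maximum, as needed."""
    return max(minimum, min(requested, maximum))


# Hypothetical bounds, chosen only for illustration (not TORQUE's values).
MIN_STACK = 1 * 1024 * 1024
MAX_STACK = 8 * 1024 * 1024

# What a thread would inherit from ulimit -s 300000 without the clamp.
inherited = 300000 * 1024

print(clamp_stack_size(inherited, MIN_STACK, MAX_STACK))
```
</pre>
                    With only the lower bound in place, the inherited
                    value passes through unchanged; the upper bound is
                    what keeps a large ulimit -s from being applied to
                    every thread.<br>
                  </div>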
                  <br>
                </div>
                The TORQUE fix will show up in 4.2.7. <br>
                <br>
              </div>
              Regards<br>
            </div>
            <div class="gmail_extra">
              <div>
                <div class="h5"><br>
                  <br>
                  <div class="gmail_quote">On Fri, Dec 6, 2013 at 10:52
                    AM, Steven Lo <span dir="ltr">&lt;<a
                        moz-do-not-send="true"
                        href="mailto:slo@cacr.caltech.edu"
                        target="_blank">slo@cacr.caltech.edu</a>&gt;</span>
                    wrote:<br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      <div bgcolor="#FFFFFF" text="#000000">
                        <div><br>
                          Hi David,<br>
                          <br>
                          The nodes which we observed are running the
                          following version:<br>
                          <br>
                          -bash-3.2# ldd /opt/torque/sbin/pbs_mom | grep
                          libc.so<br>
                          &nbsp;&nbsp;&nbsp; libc.so.6 =&gt; /lib64/libc.so.6
                          (0x00002b18eae2a000)<br>
                          <br>
                          -bash-3.2# ldd --version<br>
                          ldd (GNU libc) 2.5<br>
                          <br>
                          <br>
                          -bash-3.2# qstat --version<br>
                          Version: 4.1.5.1<br>
                          Revision: <br>
                          <br>
                          -bash-3.2# uname -a<br>
                          Linux zwicky005 2.6.18-308.1.1.el5 #1 SMP Fri
                          Feb 17 16:51:01 EST 2012 x86_64 x86_64 x86_64
                          GNU/Linux<br>
                          <br>
                          <br>
                          <br>
                          We see that it's using ~3G of memory:<br>
                          <br>
                          -bash-3.2# top -p 16695<br>
                          <br>
                          top - 09:46:45 up 81 days,&nbsp; 1:01,&nbsp; 1 user,&nbsp;
                          load average: 9.19, 9.17, 9.11<br>
                          Tasks:&nbsp;&nbsp; 1 total,&nbsp;&nbsp; 0 running,&nbsp;&nbsp; 1 sleeping,&nbsp;&nbsp;
                          0 stopped,&nbsp;&nbsp; 0 zombie<br>
                          Cpu(s): 74.6%us,&nbsp; 0.7%sy,&nbsp; 0.0%ni, 24.6%id,&nbsp;
                          0.0%wa,&nbsp; 0.0%hi,&nbsp; 0.0%si,&nbsp; 0.0%st<br>
                          Mem:&nbsp; 24675856k total, 24286304k used,&nbsp;&nbsp;
                          389552k free,&nbsp;&nbsp; 497860k buffers<br>
                          Swap: 49150856k total,&nbsp; 4750564k used,
                          44400292k free, 10798448k cached<br>
                          <br>
                          &nbsp; PID USER&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; PR&nbsp; NI&nbsp; VIRT&nbsp; RES&nbsp; SHR S %CPU
                          %MEM&nbsp;&nbsp;&nbsp; TIME+&nbsp; COMMAND&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>
                          16695 root&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 15&nbsp;&nbsp; 0 3195m 3.1g 7052 S&nbsp; 0.3
                          13.1&nbsp; 77:50.71 pbs_mom&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <br>
                          <br>
                          <br>
                          We came across this posting and are not sure
                          whether it is relevant:<br>
                          <br>
                          <a moz-do-not-send="true"
                            href="http://comments.gmane.org/gmane.comp.clustering.torque.user/13557"
                            target="_blank">http://comments.gmane.org/gmane.comp.clustering.torque.user/13557</a><br>
                          <br>
                          <br>
                          Thanks for looking into this.<br>
                          <br>
                          Steven.<br>
                          <br>
                          <br>
                          On 12/06/2013 09:04 AM, David Beer wrote:<br>
                        </div>
                        <blockquote type="cite">
                          <div dir="ltr">The issue is that in some
                            versions of libc, the pthread stack size
                            will default to 1000 * &lt;the value set in
                            ulimit -s&gt;, even though TORQUE specifies
                            what stack size each thread should have. I
                            will work to get a list of the versions of
                            libc that have this bug. Ken is the one that
                            discovered this defect, so I'll ask him for
                            the info or ask him to post the info.</div>
                          <div class="gmail_extra"><br>
                            <br>
                            <div class="gmail_quote">On Fri, Dec 6, 2013
                              at 9:02 AM, Gus Correa <span dir="ltr">&lt;<a
                                  moz-do-not-send="true"
                                  href="mailto:gus@ldeo.columbia.edu"
                                  target="_blank">gus@ldeo.columbia.edu</a>&gt;</span>
                              wrote:<br>
                              <blockquote class="gmail_quote"
                                style="margin:0 0 0 .8ex;border-left:1px
                                #ccc solid;padding-left:1ex">David<br>
                                <br>
                                For the benefit of all Torque users,<br>
                                could you please disclose all
                                combinations of libc versions<br>
                                and Torque versions that have this
                                problem?<br>
                                <br>
                                Thank you,<br>
                                Gus Correa<br>
                                <div><br>
                                  On 12/05/2013 08:52 PM, David Beer
                                  wrote:<br>
                                  &gt; Steven,<br>
                                  &gt;<br>
                                  &gt; What OS and version of the
                                  pthread library (libc) do you have? We
                                  know<br>
                                  &gt; of a rather large memory leak
                                  related to different versions of these
                                  libraries.<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt; On Thu, Dec 5, 2013 at 12:01 PM,
                                  Steven Lo &lt;<a
                                    moz-do-not-send="true"
                                    href="mailto:slo@cacr.caltech.edu"
                                    target="_blank">slo@cacr.caltech.edu</a><br>
                                </div>
                                <div>&gt; &lt;mailto:<a
                                    moz-do-not-send="true"
                                    href="mailto:slo@cacr.caltech.edu"
                                    target="_blank">slo@cacr.caltech.edu</a>&gt;&gt;

                                  wrote:<br>
                                  &gt;<br>
                                  &gt;<br>
                                  &gt; &nbsp; &nbsp; Hi,<br>
                                  &gt;<br>
                                  &gt; &nbsp; &nbsp; We've discovered that pbs_mom
                                  on most nodes is using over 3 GB of<br>
                                  &gt; &nbsp; &nbsp; memory.<br>
                                  &gt; &nbsp; &nbsp; Is there a known memory leak
                                  issue for version 4.1.5.1? &nbsp;If so, is
                                  there<br>
                                  &gt; &nbsp; &nbsp; a patch for<br>
                                  &gt; &nbsp; &nbsp; it, or do we have to upgrade to
                                  another version such as 4.1.7 or 4.2.6.1?<br>
                                  &gt;<br>
                                  &gt; &nbsp; &nbsp; Thanks in advance for your
                                  suggestion.<br>
                                  &gt;<br>
                                  &gt; &nbsp; &nbsp; Steven.<br>
                                  &gt;<br>
                                  &gt; &nbsp; &nbsp;
                                  _______________________________________________<br>
                                  &gt; &nbsp; &nbsp; torqueusers mailing list<br>
                                </div>
                                &gt; &nbsp; &nbsp; <a moz-do-not-send="true"
                                  href="mailto:torqueusers@supercluster.org"
                                  target="_blank">torqueusers@supercluster.org</a>
                                &lt;mailto:<a moz-do-not-send="true"
                                  href="mailto:torqueusers@supercluster.org"
                                  target="_blank">torqueusers@supercluster.org</a>&gt;<br>
                                <div>
                                  <div>&gt; &nbsp; &nbsp; <a
                                      moz-do-not-send="true"
                                      href="http://www.supercluster.org/mailman/listinfo/torqueusers"
                                      target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
                                    &gt;<br>
                                    &gt;<br>
                                    &gt;<br>
                                    &gt;<br>
                                    &gt; --<br>
                                    &gt; David Beer | Senior Software
                                    Engineer<br>
                                    &gt; Adaptive Computing<br>
                                    &gt;<br>
                                    &gt;<br>
                                    <br>
                                  </div>
                                </div>
                              </blockquote>
                            </div>
                            <br>
                            <br clear="all">
                            <span><font color="#888888">
                                <div><br>
                                </div>
                                -- <br>
                                <div>David Beer | Senior Software
                                  Engineer</div>
                                <div>Adaptive Computing</div>
                              </font></span></div>
                          <span><font color="#888888"> <br>
                            </font></span></blockquote>
                        <br>
                      </div>
                      <br>
                      <br>
                    </blockquote>
                  </div>
                  <br>
                  <br clear="all">
                  <br>
                  -- <br>
                </div>
              </div>
              <span class="HOEnZb"><font color="#888888">Ken Nielson<br>
                  <a moz-do-not-send="true"
                    href="tel:%2B1%20801.717.3700" value="+18017173700"
                    target="_blank">+1 801.717.3700</a> office <a
                    moz-do-not-send="true"
                    href="tel:%2B1%20801.717.3738" value="+18017173738"
                    target="_blank">+1 801.717.3738</a> fax<br>
                  1712 S. East Bay Blvd, Suite 300&nbsp; Provo, UT&nbsp; 84606<br>
                  <a moz-do-not-send="true"
                    href="http://www.adaptivecomputing.com"
                    target="_blank">www.adaptivecomputing.com</a><br>
                  <br>
                </font></span></div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
        <br>
      </div>
    </blockquote>
    <br>
  </body>
</html>