<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 12 (filtered medium)">
<style>
<!--
/* Font Definitions */
@font-face
        {font-family:Wingdings;
        panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"\@MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
        {mso-style-priority:34;
        margin-top:0in;
        margin-right:0in;
        margin-bottom:0in;
        margin-left:.5in;
        margin-bottom:.0001pt;
        font-size:11.0pt;
        font-family:"Calibri","sans-serif";}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
/* List Definitions */
@list l0
        {mso-list-id:1310935395;
        mso-list-type:hybrid;
        mso-list-template-ids:-1658527160 1036547598 67698691 67698693 67698689 67698691 67698693 67698689 67698691 67698693;}
@list l0:level1
        {mso-level-start-at:900;
        mso-level-number-format:bullet;
        mso-level-text:-;
        mso-level-tab-stop:none;
        mso-level-number-position:left;
        text-indent:-.25in;
        font-family:"Calibri","sans-serif";
        mso-fareast-font-family:"MS Mincho";}
ol
        {margin-bottom:0in;}
ul
        {margin-bottom:0in;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=WordSection1>
<p class=MsoNormal>Using torque-2.6.0-snap.201008061539 and maui-3.3, I
encountered some strange behavior when scheduling jobs where the maui scheduler
would get “hung up” on communication with the server. I
finally tracked it down to this message in the maui log file:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>INFO: starting iteration 50<o:p></o:p></p>
<p class=MsoNormal>MRMGetInfo()<o:p></o:p></p>
<p class=MsoNormal>MClusterClearUsage()<o:p></o:p></p>
<p class=MsoNormal>MRMClusterQuery()<o:p></o:p></p>
<p class=MsoNormal>MPBSClusterQuery(abc.xyz.com,RCount,SC)<o:p></o:p></p>
<p class=MsoNormal>ERROR: cannot get node info: NULL<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>The observed behavior is that maui stalls for several
minutes, after which it is able to continue and begins scheduling jobs
again.<o:p></o:p></p>
<p class=MsoNormal>I should mention that nscd is running on both machines, which
had solved an earlier problem.  From previous Google searches I noticed that a
few folks had encountered this problem, but my guess is that it usually goes
unnoticed, since anyone with relatively long-running jobs would have no idea that the
scheduler had gotten hung up.  The only reason we noticed it was that we
were testing a fairly intensive set of short-running jobs that we expected to
finish soon.<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>I was able to reproduce this problem fairly regularly, so I
attached to maui with gdb and found some code that I believe is
responsible. It turns out this code is in torque’s src/lib/Libifl/pbsD_connect.c,
around line 900:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> if ((encode_DIS_ReqHdr(sock, PBS_BATCH_Disconnect,
pbs_current_user) == 0) &&<o:p></o:p></p>
<p class=MsoNormal> (DIS_tcp_wflush(sock) == 0))<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> int atime;<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> struct sigaction act;<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> struct sigaction oldact;<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> /* set alarm to break out of potentially
infinite read */<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> act.sa_handler = SIG_IGN;<o:p></o:p></p>
<p class=MsoNormal> sigemptyset(&act.sa_mask);<o:p></o:p></p>
<p class=MsoNormal> act.sa_flags = 0;<o:p></o:p></p>
<p class=MsoNormal> sigaction(SIGALRM, &act,
&oldact);<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> atime = alarm(pbs_tcp_timeout);<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> /* NOTE: alarm will break out of
blocking read even with sigaction ignored */<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> while (1)<o:p></o:p></p>
<p class=MsoNormal> {<o:p></o:p></p>
<p class=MsoNormal> /* wait for server to close
connection */<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> /* NOTE: if read of
'sock' is blocking, request below may hang forever */<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> if (read(sock, &x,
sizeof(x)) < 1)<o:p></o:p></p>
<p class=MsoNormal> break;<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> alarm(atime);<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal> sigaction(SIGALRM, &oldact, NULL);<o:p></o:p></p>
<p class=MsoNormal> }<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>close(sock);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>My understanding of this is that, for
some reason, the client is trying to disconnect from the server.  To do so,
it sends a PBS_BATCH_Disconnect request and then waits for the server to close
the connection from its end, i.e. it loops until a read from the (blocking)
socket returns less than 1 (0 on end-of-file, -1 on error).  It arms an alarm,
with SIGALRM set to SIG_IGN, intending to effect a timeout on the read. 
pbs_tcp_timeout was set to 9 (seconds) when I was attached.<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>However, the comment claiming that
the alarm will still break out of the blocking read when SIGALRM is set to
SIG_IGN is incorrect.  I believe this may be
implementation-specific, but it is definitely not the case on our version of
Linux (fc12): an ignored signal is simply discarded, so nothing interrupts the
read.  I also don’t see why it would ever be reasonable to
expect otherwise.  A simple test program proves the point:<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <stdlib.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <stdio.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <unistd.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <stdint.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <sys/types.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <sys/stat.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <fcntl.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>#include <signal.h><o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>void handler(int signo)<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>{<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> fprintf(stderr,
"Caught signal #%d\n", signo);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>}<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>int main(int argc_, char **argv_)<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>{<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> struct sigaction act;<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> struct sigaction oldact;<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>// act.sa_handler = SIG_IGN;<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> act.sa_handler = handler;<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>
sigemptyset(&act.sa_mask);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> act.sa_flags = 0;<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> sigaction(SIGALRM,
&act, &oldact);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> int atime = alarm(10);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> char buf[10];<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> ssize_t br = read(0, buf,
10);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'> <o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>  fprintf(stderr, "Broke
out of read with br = %ld\n", (long)br);<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>}<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>Run this program as-is and the
read from stdin will be interrupted after 10 seconds, returning
-1 with errno set to EINTR.  However, swap the commented-out line so that
SIG_IGN is used and the read will block indefinitely.<o:p></o:p></p>
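<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>For anyone who wants to check this
on their own platform without sitting at a terminal, here is a rough
self-checking variant of the same experiment.  It is only a sketch: the names
on_alarm and read_is_interrupted are made up for illustration, the read is done
on an empty pipe in a forked child rather than on stdin, and the 1-second alarm
and roughly 2-second give-up time are arbitrary choices.<o:p></o:p></p>

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical no-op handler: exists only so that SIGALRM is caught. */
static void on_alarm(int signo)
{
  (void)signo;
}

/*
 * Fork a child that blocks in read() on an empty pipe with a 1-second
 * alarm pending.  Returns 1 if the child's read() failed with EINTR,
 * 0 if the child was still blocked after about 2 seconds (it is then
 * killed), and -1 on any other outcome.
 */
static int read_is_interrupted(int use_sig_ign)
{
  int fds[2];
  int i;
  int result = 0;
  pid_t pid;

  if (pipe(fds) != 0)
    return -1;

  pid = fork();
  if (pid < 0)
    {
    close(fds[0]);
    close(fds[1]);
    return -1;
    }

  if (pid == 0)
    {
    struct sigaction act;
    char buf[10];
    ssize_t br;

    act.sa_handler = use_sig_ign ? SIG_IGN : on_alarm;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;            /* no SA_RESTART: read() is not resumed */
    sigaction(SIGALRM, &act, NULL);

    alarm(1);

    /* The write end is still open, so read() can never see end-of-file. */
    br = read(fds[0], buf, sizeof(buf));
    _exit((br == -1 && errno == EINTR) ? 0 : 1);
    }

  /* Parent: check on the child every 100 ms, for about 2 seconds. */
  for (i = 0; i < 20; i++)
    {
    int status;
    struct timespec ts = {0, 100 * 1000 * 1000};

    if (waitpid(pid, &status, WNOHANG) == pid)
      {
      result = (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : -1;
      close(fds[0]);
      close(fds[1]);
      return result;
      }
    nanosleep(&ts, NULL);
    }

  /* Child is still stuck in read(): the alarm did not interrupt it. */
  kill(pid, SIGKILL);
  waitpid(pid, NULL, 0);
  close(fds[0]);
  close(fds[1]);
  return 0;
}
```

<p class=MsoNormal style='text-indent:4.5pt'>Calling read_is_interrupted(0)
exercises the handler case and read_is_interrupted(1) the SIG_IGN case; on a
system that behaves like ours, the first should report an interrupted read and
the second a read that stays blocked.<o:p></o:p></p>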
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>I don’t understand
pbs_server well enough to know why it takes so long to disconnect a client, but it
is not unreasonable for there to be a very long delay there, as it is not a
high-priority action.  However, I believe the code as written is incorrect, and
causes schedulers like maui, which use torque’s client libraries, to get
hung up unreasonably.  Perhaps this is also the case for pbs_sched.<o:p></o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>I made a change to our local copy
of the source, installing an empty signal handler (i.e. “void
foo(int signo) {}”, with act.sa_handler = foo), along with some
debugging printouts.  After recompiling torque and maui, I was able to
verify from the maui logs that the timeout is now handled properly and that maui
continues gracefully.<o:p></o:p></p>
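<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
<p class=MsoNormal style='text-indent:4.5pt'>To make the idea concrete, the
change amounts to something like the stand-alone sketch below.  This is not the
actual patch: wait_for_close and alarm_noop are illustrative names, and the
return codes are my own convention rather than anything in torque.  It also
assumes that an EINTR from the read means the alarm fired, which may not hold
if the process receives other signals.<o:p></o:p></p>

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Empty handler: catching SIGALRM (rather than ignoring it) is what
 * allows the signal to interrupt a blocking read(). */
static void alarm_noop(int signo)
{
  (void)signo;
}

/*
 * Wait for the peer to close 'sock', giving up after 'timeout_secs'.
 * Returns 0 if the peer closed the connection, 1 if the read was
 * interrupted (assumed here to be the alarm), -1 on any other error.
 */
static int wait_for_close(int sock, unsigned int timeout_secs)
{
  struct sigaction act;
  struct sigaction oldact;
  unsigned int atime;
  ssize_t n;
  int saved_errno;
  char x;

  act.sa_handler = alarm_noop;
  sigemptyset(&act.sa_mask);
  act.sa_flags = 0;              /* no SA_RESTART: read() fails with EINTR */
  sigaction(SIGALRM, &act, &oldact);

  atime = alarm(timeout_secs);

  for (;;)
    {
    /* discard anything the peer still sends; stop on EOF or error */
    n = read(sock, &x, sizeof(x));

    if (n < 1)
      break;
    }

  saved_errno = errno;

  /* restore the previous alarm and SIGALRM disposition */
  alarm(atime);
  sigaction(SIGALRM, &oldact, NULL);

  if (n == 0)
    return 0;

  return (saved_errno == EINTR) ? 1 : -1;
}
```

<p class=MsoNormal style='text-indent:4.5pt'>With a descriptor whose other end
stays open, wait_for_close(fd, 1) comes back after about a second instead of
hanging; once the other end is closed it returns immediately.<o:p></o:p></p>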
<p class=MsoNormal style='text-indent:4.5pt'><o:p> </o:p></p>
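<p class=MsoNormal style='text-indent:4.5pt'>A different approach, which avoids
SIGALRM altogether, would be to bound the wait with poll().  The sketch below is
only an illustration of that technique, not something taken from torque:
wait_for_close_poll is a hypothetical name, the timeout is in milliseconds, and
for simplicity the timeout restarts whenever data arrives or poll() is
interrupted.<o:p></o:p></p>

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Wait for the peer to close 'sock' without relying on signals.
 * Returns 0 if the peer closed the connection, 1 if nothing happened
 * within 'timeout_ms', -1 on error.
 */
static int wait_for_close_poll(int sock, int timeout_ms)
{
  struct pollfd pfd;
  char buf[64];

  pfd.fd = sock;
  pfd.events = POLLIN;

  for (;;)
    {
    ssize_t n;
    int rc;

    pfd.revents = 0;
    rc = poll(&pfd, 1, timeout_ms);

    if (rc == 0)
      return 1;                  /* timed out */

    if (rc < 0)
      {
      if (errno == EINTR)
        continue;                /* unrelated signal: wait again */

      return -1;
      }

    /* readable, hung up, or in error: read() tells us which */
    n = read(sock, buf, sizeof(buf));

    if (n == 0)
      return 0;                  /* end-of-file: peer closed */

    if (n < 0)
      return -1;

    /* n > 0: leftover data, discard it and keep waiting */
    }
}
```

<p class=MsoNormal style='text-indent:4.5pt'>The advantage of this shape is
that it does not touch the process-wide SIGALRM disposition or any alarm the
caller may already have pending.<o:p></o:p></p>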
<p class=MsoNormal>In any case, I’d like to solicit some feedback:<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span
style='mso-list:Ignore'>-<span style='font:7.0pt "Times New Roman"'>
</span></span><![endif]>Do the developers agree with my assessment of the
problem?<o:p></o:p></p>
<p class=MsoListParagraph style='text-indent:-.25in;mso-list:l0 level1 lfo1'><![if !supportLists]><span
style='mso-list:Ignore'>-<span style='font:7.0pt "Times New Roman"'>
</span></span><![endif]>If so, are there other spots in the code that need to
be fixed as well?<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
<p class=MsoNormal>Many thanks,<o:p></o:p></p>
<p class=MsoNormal>William<o:p></o:p></p>
<p class=MsoNormal><o:p> </o:p></p>
</div>
</body>
</html>