<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
On 04/11/2011 08:05 PM, Martin Siegert wrote:
<blockquote cite="mid:20110412020500.GA30020@stikine.sfu.ca"
type="cite">
<pre wrap="">Hi,
On Thu, Apr 07, 2011 at 05:05:04PM -0600, Ken Nielson wrote:
</pre>
<blockquote type="cite">
<pre wrap="">There is a new snapshot for 2.5.6 available. It fixes a problem with
a patch for Bugzilla 116, which added the new resource procct. If the
-l nodes option was not used in a job submission, Moab would not run the
job, because procct was added to the Resource_List attribute and Moab
treated it like a generic resource. Because no generic resource named
procct exists, Moab never scheduled the job.
This is now fixed.
You can download this snapshot at <a class="moz-txt-link-freetext" href="http://www.clusterresources.com/downloads/torque/snapshots/torque-2.5.6-snap.201104071657.tar.gz">http://www.clusterresources.com/downloads/torque/snapshots/torque-2.5.6-snap.201104071657.tar.gz</a>
Please download and let us know if you find any problems.
</pre>
</blockquote>
<pre wrap="">
I am afraid this does not work: I haven't traced this back to the
source routine, but apparently this new version presets the nodes
resource to 1, correct?
Thus, if a user requests only -l procs=N, 2.5.6-snap.201104071657
sets procct to N+1 rather than N; see
resc_def_all.c, line 1118:
  ppct->rs_value.at_val.at_long =
    count_proc(pnodesp->rs_value.at_val.at_str)
    + pprocsp->rs_value.at_val.at_long;
torque-2.5.6-snap.201104041023 actually worked flawlessly for me,
which means that I haven't figured out how to trigger the bug that
torque-2.5.6-snap.201104071657 was supposed to fix.
Regardless of whether I specified -l nodes=... or -l procs=... or
neither, Moab always started my job, i.e., the procct resource
always got removed before the job was sent to Moab; see
svr_jobfunc.c, line 1965:
  if (strcmp(pque->qu_attr->at_val.at_str, "Execution") == 0)
  {
    /* job routed to Execution queue successfully */
    /* unset job's procct resource */
    resource_def *pctdef;
    resource *pctresc;
    pctdef = find_resc_def(svr_resc_def, "procct", svr_resc_size);
    if ((pctresc = find_resc_entry(&pjob->ji_wattr[JOB_ATR_resource], pctdef)) != NULL)
      pctdef->rs_free(&pctresc->rs_value);
  }
}
If somebody can explain to me how to submit a job that is not caught in
this if block, I may be able to fix this.
Cheers,
Martin
</pre>
</blockquote>
Martin,<br>
<br>
Thanks for reporting this. I will check it out and fix it.<br>
<br>
Ken<br>
<br>
</body>
</html>