<div dir="ltr">I should add that if I use:<div><br></div><div>#PBS -l nodes=5</div><div><br></div><div>the MPI job runs on any 5 virtual processors, perhaps all on the same physical host, which is the correct behaviour. It only fails when I ask for more than 5.</div>
<div><br></div><div>Andrew<br><div class="gmail_extra"><br><br><div class="gmail_quote">On 7 February 2013 09:10, Andrew Dawson <span dir="ltr"><<a href="mailto:dawson@atm.ox.ac.uk" target="_blank">dawson@atm.ox.ac.uk</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hi all,<div><br></div><div>I'm configuring a recent Torque/Maui installation and I'm having trouble submitting MPI jobs. I would like MPI jobs to specify only the number of processors they require and have those come from any available physical machine; users shouldn't need to specify processors per node, etc.</div>
<div><br></div><div>The torque manual says that the nodes option is mapped to virtual processors, so for example:</div><div><br></div><div> #PBS -l nodes=8</div><div><br></div><div>should request 8 virtual processors. The problem I'm having is that our cluster currently has only 5 physical machines (nodes), and setting nodes to anything greater than 5 gives the error:</div>
<div><br></div><div> qsub: Job exceeds queue resource limits MSG=cannot locate feasible nodes (nodes file is empty or all systems are busy)<br></div><div><br></div><div>I'm confused by this. We have 33 virtual processors available across the 5 nodes (four 8-core machines and one single-core machine), so my interpretation of the manual is that I should be able to request 8 nodes, since these should be understood as virtual processors. Am I doing something wrong?</div>
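<div><br></div><div>As a sanity check on the figure of 33, here is a minimal sketch that sums the np= values in a Torque-style nodes file (normally server_priv/nodes under the Torque home directory). The host names node01 to node05 and the helper count_virtual_procs are made up for illustration, and the sketch assumes that a line without np= counts as one processor and that lines starting with # are comments:</div>

```shell
#!/bin/sh
# Sum the np= values in a Torque-style nodes file.
# Assumption: each line looks like "hostname np=8"; a host listed
# without np= contributes a single virtual processor.
count_virtual_procs() {
    awk '
        NF == 0    { next }                 # skip blank lines
        $1 ~ /^#/  { next }                 # skip comment lines (assumed syntax)
        {
            slots = 1                       # default when np= is absent
            for (i = 2; i <= NF; i++)
                if ($i ~ /^np=[0-9]+$/) slots = substr($i, 4)
            total += slots
        }
        END { print total + 0 }
    ' "$1"
}

# Hypothetical nodes file mirroring the cluster described above:
# four 8-core machines and one single-core machine.
sample_nodes=$(mktemp)
printf '%s\n' \
    'node01 np=8' \
    'node02 np=8' \
    'node03 np=8' \
    'node04 np=8' \
    'node05' > "$sample_nodes"

count_virtual_procs "$sample_nodes"
rm -f "$sample_nodes"
```

<div>If the real nodes file also gives 33, the server does at least know about all the virtual processors, which would suggest the limit on nodes is being applied somewhere else.</div>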
<div><br></div><div>I tried setting</div><div><br></div><div>#PBS -l procs=8</div><div><br></div><div>but that doesn't seem to have any effect: MPI stops because only 1 worker is available (a single core is allocated to the job).</div>
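<div><br></div><div>For anyone wanting to reproduce this, a minimal sketch of how the allocation can be inspected from inside the job script, before MPI is launched. It relies on $PBS_NODEFILE, which Torque sets to a file listing one host name per allocated virtual processor; the helper names total_slots and slots_per_host are made up for illustration:</div>

```shell
#!/bin/sh
# Count the virtual processors granted to the job:
# one non-empty line in the node file per allocated slot.
total_slots() {
    awk 'NF { n++ } END { print n + 0 }' "$1"
}

# Show how the slots are spread over physical hosts,
# one "host count" line per host.
slots_per_host() {
    sort "$1" | uniq -c | awk '{ print $2, $1 }'
}

# Only runs inside a Torque job, where PBS_NODEFILE is set.
if [ -n "${PBS_NODEFILE:-}" ]; then
    echo "slots granted: $(total_slots "$PBS_NODEFILE")"
    slots_per_host "$PBS_NODEFILE"
fi
```

<div>With procs=8, a total of 1 here would confirm that the request is not reaching the allocation at all, rather than MPI misreading a correct node file.</div>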
<div><br></div><div>Thanks,</div><div>Andrew</div><div><br></div><div>p.s.</div><div><br></div><div>The queue I'm submitting jobs to is defined as:</div><div><br></div><div>
<div>create queue normal</div><div>set queue normal queue_type = Execution</div><div>set queue normal resources_min.cput = 12:00:00</div><div>set queue normal resources_default.cput = 24:00:00</div><div>set queue normal disallowed_types = interactive</div>
<div>set queue normal enabled = True</div><div>set queue normal started = True</div><div><br></div></div><div>and we are using Torque version 2.5.12 with Maui 3.3.1 for scheduling.<br clear="all"><div><br></div>
</div></div>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Dr Andrew Dawson<br>Atmospheric, Oceanic & Planetary Physics<br>Clarendon Laboratory<br>Parks Road<br>Oxford OX1 3PU, UK<br>Tel: +44 (0)1865 282438<br>Email: <a href="mailto:dawson@atm.ox.ac.uk" target="_blank">dawson@atm.ox.ac.uk</a><div>
Web Site: <a href="http://www2.physics.ox.ac.uk/contacts/people/dawson" target="_blank">http://www2.physics.ox.ac.uk/contacts/people/dawson</a></div>
</div></div></div>