[torquedev] nodes, procs, tpn and ncpus
Martin Siegert
siegert at sfu.ca
Thu Jun 10 18:32:11 MDT 2010
On Thu, Jun 10, 2010 at 01:36:56PM -0600, Ken Nielson wrote:
> On 06/10/2010 12:27 PM, Martin Siegert wrote:
> >
> > That is not a solution. If we do not set EXACTNODE, then users who need
> > nodes=N:ppn=1 (in its literal meaning, namely exactly one processor per
> > node) cannot be satisfied. And if we do set EXACTNODE, there is no way
> > (other than procs) to request N processors anywhere. This is the reason
> > why procs was introduced in the first place: so that we can set EXACTNODE
> > and satisfy both types of requests.
> >
> > Cheers,
> > Martin
> >
> >
> You may have seen in this discussion that Simon Toth and Glen Beane
> were indicating that nodes=x:ppn=y allocates y processors on each of x
> separate nodes, while I was saying that it only allocates y processors
> on a single node.
>
> It turns out we were both right. It depends on what you have in your
> serverdb configuration. I have the server parameter
> resources_available.nodect set and Simon and Glen did not. Simon and
> Glen were running TORQUE's default behavior and TORQUE by default
> allocates nodes the same as if EXACTNODE were set in Moab.
>
> Moab muddies the waters by giving users the option to treat processors
> like nodes (vnodes in the case of PBS Pro). This is certainly one source
> of the confusion that exists on the meaning of different resources.
> While Moab is consistent in how it interprets the procs resource, it is
> ambiguous about the nodes resource. If JOBNODEMATCHPOLICY is not set
> (the default), Moab treats processors as nodes. So -l nodes=x, where x is
> greater than the number of physical nodes, will be treated like -l procs=x
> provided TORQUE has set the resources_available.nodect parameter. By "set"
> I mean that nodect is greater than the number of physical nodes.
>
> After all this, I just want to confirm what Martin has just written:
> procs exists so that users can allocate a job with as many processors as
> needed, independent of the number of available nodes. We now just need
> TORQUE to recognize procs as well.
>
> Ken Nielson
> Adaptive Computing
Just a comment: nodect used to be a parameter that was absolutely
essential in the pre-procs days when we did not set EXACTNODE.
In that configuration, a nodes file with, e.g.,
    n1 np=4
    n2 np=4
    ...
    n200 np=4
would only allow you to run a job with a maximum of 200 processors
(using a -l nodes=N request). You needed to set nodect=800 to allow jobs
with -l nodes=400 or so. I always regarded nodect as an ugly workaround.
If it turns out that unsetting nodect (or eliminating nodect) plus
introducing procs basically implements the EXACTNODE + procs policies
in torque, then I believe that is an excellent solution.
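
To make the arithmetic above concrete, here is a minimal Python sketch. It is
a toy model only, not TORQUE or Moab code: all function names are hypothetical,
and it assumes that without EXACTNODE the largest accepted -l nodes=N is nodect
when nodect is set and the number of nodes file entries otherwise, as described
above. It also assumes a nodes file entry without np= counts as one processor.

```python
# Toy model only: not TORQUE or Moab code. All names here are hypothetical.

def parse_nodes_file(text):
    """Return {hostname: np} from lines such as 'n1 np=4'."""
    nodes = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        np = 1  # assumption: one processor when np= is absent
        for field in fields[1:]:
            if field.startswith("np="):
                np = int(field[3:])
        nodes[fields[0]] = np
    return nodes


def max_nodes_request(nodes, nodect=None):
    """Largest N accepted for '-l nodes=N' when EXACTNODE is not set.

    Assumption: the limit is nodect when it is set, otherwise the number
    of entries in the nodes file.
    """
    return nodect if nodect is not None else len(nodes)


def max_procs_request(nodes):
    """Largest N that '-l procs=N' could ask for: every processor."""
    return sum(nodes.values())


# The 200-node, np=4 nodes file from the example above.
nodes_file = "\n".join(f"n{i} np=4" for i in range(1, 201))
nodes = parse_nodes_file(nodes_file)

print(max_nodes_request(nodes))              # 200: nodect unset
print(max_nodes_request(nodes, nodect=800))  # 800: the nodect workaround
print(max_procs_request(nodes))              # 800: procs needs no nodect
```

In this model the nodect=800 workaround and a procs request reach the same
800-processor ceiling, which is why procs makes nodect unnecessary here.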
Cheers,
Martin