<html>
  <head>

    <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <font size="+1">Hi<font size="+1"> all,<br>
        <font size="+1">I'm trying to manage GPU resources.&nbsp; My
          nodes file appears to be <font size="+1">correct, and pbsnodes
            reports that GPUs are present<font
              size="+1">.&nbsp; However, when I submit a job <font size="+1">requesting
                GPUs, the job enters a deferred state.&nbsp; The queue appears
                to allow GPU use.&nbsp; Any
                suggestions?<br>
                <br>
                <font size="+1">Many thanks,<br>
                  <font size="+1">Pete<br>
                    <br>
                    Torque version: 2.5.10<br>
                    Maui version: 3.3.1<br>
                    <br>
                  </font></font></font>Example below:<br>
              <br>
              <font size="+1"># pbsnodes <font size="+1">n10</font></font><br>
              n10<br>
              &nbsp;&nbsp;&nbsp;&nbsp; state = free<br>
              &nbsp;&nbsp;&nbsp;&nbsp; np = 16<br>
              &nbsp;&nbsp;&nbsp;&nbsp; properties = research,k20<br>
              &nbsp;&nbsp;&nbsp;&nbsp; ntype = cluster<br>
              &nbsp;&nbsp;&nbsp;&nbsp; status =
              rectime=1376676818,varattr=,jobs=,state=free,netload=50681542816,gres=,loadave=0.00,ncpus=16,physmem=132272332kb,availmem=139195740kb,totmem=140666252kb,idletime=5204925,nusers=0,nsessions=?
              0,sessions=? 0,uname=Linux n10
              2.6.32-279.2.1.el6.631g0000.x86_64 #1 SMP Sun Jul 22
              22:39:16 EDT 2012 x86_64,opsys=linux<br>
              &nbsp;&nbsp;&nbsp;&nbsp; gpus = 1<br>
              <br>
              set queue abaqus queue_type = Execution<br>
              set queue abaqus Priority = 20<br>
              set queue abaqus max_running = 2<br>
              set queue abaqus resources_max.nodes = 1:ppn=8:gpus=1<br>
              set queue abaqus resources_min.nodes = 1<br>
              set queue abaqus resources_default.nodes = 1:ppn=4:gpus=1<br>
              set queue abaqus resources_default.walltime = 02:00:00<br>
              set queue abaqus keep_completed = 300<br>
              set queue abaqus enabled = True<br>
              set queue abaqus started = True<br>
              #<br>
              <br>
              <font size="+1">When the submission
                includes:<br>
                <font size="+1">#PBS -l nodes=1:ppn=1:k20<br>
                  it runs fine.<br>
                    <br>
                    When the submission includes:<br>
                  #PBS -l nodes=1:ppn=1:gpus=1:k20<br>
                  the job is <font size="+1">deferred for lack of resources, as
                    shown below.<br>
                    <br>
                    $ checkjob 1901[1]<br>
                    checking job 1901[1]<br>
                    <br>
                    State: Idle&nbsp; EState: Deferred<br>
                    Creds:&nbsp; user:gustafson&nbsp; group:pi&nbsp; class:abaqus&nbsp;
                    qos:DEFAULT<br>
                    WallTime: 00:00:00 of 41:16:00:00<br>
                    SubmitTime: Fri Aug 16 14:17:19<br>
                    &nbsp; (Time Queued&nbsp; Total: 00:02:09&nbsp; Eligible: 00:00:22)<br>
                    <br>
                    Total Tasks: 1<br>
                    <br>
                    Req[0]&nbsp; TaskCount: 1&nbsp; Partition: ALL<br>
                    Network: [NONE]&nbsp; Memory &gt;= 0&nbsp; Disk &gt;= 0&nbsp; Swap
                    &gt;= 0<br>
                    Opsys: [NONE]&nbsp; Arch: [NONE]&nbsp; Features: [k20][gpus=1]<br>
                    Dedicated Resources Per Task: PROCS: 1&nbsp; MEM: 100G<br>
                    <br>
                    <br>
                    IWD: [NONE]&nbsp; Executable:&nbsp; [NONE]<br>
                    Bypass: 0&nbsp; StartCount: 0<br>
                    PartitionMask: [ALL]<br>
                    Flags:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; RESTARTABLE<br>
                    <br>
                    job is deferred.&nbsp; Reason:&nbsp; NoResources&nbsp; (cannot
                    create reservation for job '1901[1]' (intital
                    reservation attempt)<br>
                    )<br>
                    Holds:&nbsp;&nbsp;&nbsp; Defer&nbsp; (hold reason:&nbsp; NoResources)<br>
                    PE:&nbsp; 11.71&nbsp; StartPriority:&nbsp; 1<br>
                    cannot select job 1901[1] for partition DEFAULT (job
                    hold active)<br>
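                    <br>
                    One detail in the checkjob output above is that gpus=1 shows up in the Features list next to k20, which may mean the scheduler is reading it as a node property rather than as a GPU count.&nbsp; A minimal sketch for checking this on other jobs, assuming checkjob output in the same format as above; the function name gpus_as_feature is hypothetical and not part of Torque or Maui:<br>

```shell
#!/bin/sh
# Hypothetical helper (not part of Torque or Maui): reads checkjob output
# on stdin and reports whether a gpus=N token appears in the Features list.
gpus_as_feature() {
    if grep -E 'Features:.*\[gpus=[0-9]+\]' >/dev/null; then
        echo "gpus listed as a feature"
    else
        echo "gpus not listed as a feature"
    fi
}

# Sample line copied from the checkjob output above.
printf '%s\n' 'Opsys: [NONE]  Arch: [NONE]  Features: [k20][gpus=1]' | gpus_as_feature

# On a live system the same check would be:
#   checkjob '1901[1]' | gpus_as_feature
```

                    If the check reports a feature, comparing against the properties line from pbsnodes (research,k20 here) shows whether any node advertises a matching property.<br>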
                    <br>
                  </font></font></font></font></font></font></font></font>
  </body>
</html>