<div dir="ltr">Maui doesn&#39;t support gpu requests. We solved this issue by recompiling Maui with support for GPUs as a generic consumable resource (GRES). There&#39;s a patch for this: <div><br></div><div><a href="http://www.clusterresources.com/pipermail/mauiusers/2008-August/003486.html">http://www.clusterresources.com/pipermail/mauiusers/2008-August/003486.html</a><br>

</div><div><br></div><div><br></div><div><br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">2013/8/16 Peter A. Gustafson <span dir="ltr">&lt;<a href="mailto:peter.gustafson@wmich.edu" target="_blank">peter.gustafson@wmich.edu</a>&gt;</span><br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  

    
  
  <div bgcolor="#FFFFFF" text="#000000">
    <font size="+1">Hi<font size="+1"> all,<br>
        <font size="+1">I&#39;m trying to manage the gpu resources.  My
          nodes file appears to be <font size="+1">correct and pbsnodes
            reports t<font size="+1">hat </font>gpus are present<font size="+1">.  However, when I submit a job <font size="+1">requesting
                gpus the job enters a deferred state.  The queue appears
                to allow gpu<font size="+1"> use.  </font>Any
                suggestions?<br>
                <br>
                <font size="+1">Many thanks,<br>
                  <font size="+1">Pete<br>
                    <br>
                    Torque version: 2.5.10<br>
                    Maui version: 3.3.1<br>
                    <br>
                  </font></font></font>Example below:<br>
              <br>
              <font size="+1"># pbsnodes <font size="+1">n10</font></font><br>
              n10<br>
                   state = free<br>
                   np = 16<br>
                   properties = research,k20<br>
                   ntype = cluster<br>
                   status =
              rectime=1376676818,varattr=,jobs=,state=free,netload=50681542816,gres=,loadave=0.00,ncpus=16,physmem=132272332kb,availmem=139195740kb,totmem=140666252kb,idletime=5204925,nusers=0,nsessions=?
              0,sessions=? 0,uname=Linux n10
              2.6.32-279.2.1.el6.631g0000.x86_64 #1 SMP Sun Jul 22
              22:39:16 EDT 2012 x86_64,opsys=linux<br>
                   gpus = 1<br>
              <br>
              set queue abaqus queue_type = Execution<br>
              set queue abaqus Priority = 20<br>
              set queue abaqus max_running = 2<br>
              set queue abaqus resources_max.nodes = 1:ppn=8:gpus=1<br>
              set queue abaqus resources_min.nodes = 1<br>
              set queue abaqus resources_default.nodes = 1:ppn=4:gpus=1<br>
              set queue abaqus resources_default.walltime = 02:00:00<br>
              set queue abaqus keep_completed = 300<br>
              set queue abaqus enabled = True<br>
              set queue abaqus started = True<br>
              #<br>
              <br>
              <br>
              <br>
              <font size="+1">When <font size="+1">s</font>ubmission
                includes:<br>
                <font size="+1">#PBS -l nodes=1:ppn=1:k20<br>
                  it runs f<font size="+1">ine.<br>
                    <br>
                    <font size="+1">When submission includes:</font><br>
                  </font>#PBS -l nodes=1:ppn=1:gpus=1:k20<br>
                  I get def<font size="+1">erred for no resources as
                    below.<br>
                    <br>
                    $ checkjob 1901[1]<br>
                    checking job 1901[1]<br>
                    <br>
                    State: Idle  EState: Deferred<br>
                    Creds:  user:gustafson  group:pi  class:abaqus 
                    qos:DEFAULT<br>
                    WallTime: 00:00:00 of 41:16:00:00<br>
                    SubmitTime: Fri Aug 16 14:17:19<br>
                      (Time Queued  Total: 00:02:09  Eligible: 00:00:22)<br>
                    <br>
                    Total Tasks: 1<br>
                    <br>
                    Req[0]  TaskCount: 1  Partition: ALL<br>
                    Network: [NONE]  Memory &gt;= 0  Disk &gt;= 0  Swap
                    &gt;= 0<br>
                    Opsys: [NONE]  Arch: [NONE]  Features: [k20][gpus=1]<br>
                    Dedicated Resources Per Task: PROCS: 1  MEM: 100G<br>
                    <br>
                    <br>
                    IWD: [NONE]  Executable:  [NONE]<br>
                    Bypass: 0  StartCount: 0<br>
                    PartitionMask: [ALL]<br>
                    Flags:       RESTARTABLE<br>
                    <br>
                    job is deferred.  Reason:  NoResources  (cannot
                    create reservation for job &#39;1901[1]&#39; (intital
                    reservation attempt)<br>
                    )<br>
                    Holds:    Defer  (hold reason:  NoResources)<br>
                    PE:  11.71  StartPriority:  1<br>
                    cannot select job 1901[1] for partition DEFAULT (job
                    hold active)<br>
                    <br>
                    <br>
                    <br>
                    <br>
                  </font></font></font></font></font></font></font></font>
  </div>

<br>_______________________________________________<br>
torqueusers mailing list<br>
<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
<a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
<br></blockquote></div><br></div>