<div dir="ltr">
        
        
        


<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">Hello Guys,</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm"><span style="font-family:arial,helvetica,sans-serif">I have an SGI ICEX
cluster running Torque without problems, and I am now responsible for
deploying Torque on an SGI UV2000 system with a NUMA configuration on
SLES 11, but I&#39;m running into trouble. I hope somebody can help me.</span><br></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
   
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>* Hardware
specification:</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">The topology command
says:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">System
type: UV2000</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">System name: lanina</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Serial number:
UV2-00000003</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Partition number: 0</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>48 Blades</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>1536 CPUs</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>96 Nodes</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">2933.29 GB Memory
Total</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">31.00 GB Max Memory
on any Node</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        1 BASE I/O Riser</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        2 PCIe Slots</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        2 Fibre Channel
Controllers</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        1 InfiniBand Controller</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        2 Network Controllers</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        2 Storage Controllers</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        2 USB Controllers</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">        1 VGA GPU</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm"><font face="arial, helvetica, sans-serif"><br></font></p><p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Although the topology
command reports 1536 CPUs, that figure counts logical CPUs with
Hyper-Threading enabled; there are only 768 physical cores.</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm"><font face="arial, helvetica, sans-serif"><br></font></p><p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>* Build step:</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">        I built the RPM
packages after editing torque.spec to add the <b>--enable-numa-support</b>
and <b>--enable-cpuset</b> flags.</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">After installing the
packages on the system, I confirmed that the flags were applied with the
pbs_server --about command:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif"><br>
</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>lanina:~
# pbs_server --about</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Package:     torque
4.2.3.1</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Sourcedir:  
/usr/src/packages/BUILD/torque-4.2.3.1</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Configure:   
&#39;--host=x86_64-suse-linux-gnu&#39; &#39;--build=x86_64-suse-linux-gnu&#39;
&#39;--target=x86_64-suse-linux&#39; &#39;--program-prefix=&#39; &#39;--prefix=/usr&#39;
&#39;--exec-prefix=/usr&#39; &#39;--bindir=/usr/bin&#39; &#39;--sbindir=/usr/sbin&#39;
&#39;--sysconfdir=/etc&#39; &#39;--datadir=/usr/share&#39;
&#39;--includedir=/usr/include&#39; &#39;--libdir=/usr/lib64&#39;
&#39;--libexecdir=/usr/lib64&#39; &#39;--localstatedir=/var&#39;
&#39;--sharedstatedir=/usr/com&#39; &#39;--mandir=/usr/share/man&#39;
&#39;--infodir=/usr/share/info&#39; &#39;--includedir=/usr/include/torque&#39;
&#39;--with-default-server=lanina&#39; &#39;--with-server-home=/var/spool/torque&#39;
&#39;--without-debug&#39; &#39;CFLAGS=-O0 -g3&#39; &#39;--disable-libcpuset&#39;
&#39;--with-sendmail=/usr/sbin/sendmail&#39;<b> &#39;--enable-numa-support&#39;
</b>&#39;--enable-memacct&#39; &#39;--disable-top-tempdir-only&#39;
&#39;--disable-dependency-tracking&#39; &#39;--disable-gui&#39; &#39;--without-tcl&#39;
&#39;--with-rcp=scp&#39; &#39;--enable-syslog&#39; &#39;--disable-gcc-warnings&#39;
&#39;--disable-munge-auth&#39; &#39;--without-pam&#39; &#39;--disable-drmaa&#39;
&#39;--disable-qsub-keep-override&#39; &#39;--disable-blcr&#39; <b>&#39;--enable-cpuset&#39;
&#39;--enable-spool&#39; &#39;--with-hwloc-path=/usr/include/hwloc&#39;</b>
&#39;build_alias=x86_64-suse-linux-gnu&#39;
&#39;host_alias=x86_64-suse-linux-gnu&#39; &#39;target_alias=x86_64-suse-linux&#39;
&#39;CXXFLAGS=-O2 -g -m64 -fmessage-length=0 -D_FORTIFY_SOURCE=2
-fstack-protector -funwind-tables -fasynchronous-unwind-tables&#39; 
</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <b><font face="arial, helvetica, sans-serif">Buildcflags: -O0
-g3 -DNUMA_SUPPORT -I/usr/include/hwloc/include</font></b></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>* Configuration:</b></font></p>
<p lang="en-US" align="JUSTIFY" style="text-indent:0.98cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">Following the manual, I
used the /sys/devices/system/node directory to create the
/var/spool/mom_priv/mom.layout file on the client side:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
        
</p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=0-7      mem=0</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=8-15     mem=1</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=16-23    mem=2</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=24-31    mem=3</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=32-39    mem=4</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">...</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=744-751  mem=93</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=752-759  mem=94</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">cpus=760-767  mem=95</font></p>
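<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">For reference, a minimal sketch of how lines in this cpus=/mem= form could be generated from /sys/devices/system/node, assuming each nodeN directory exposes a cpulist file. The function names are hypothetical, and the sketch only reproduces the format shown above; it does not check whether pbs_mom accepts it. The cpulist content is emitted as-is, so the result may differ from the hand-written file.</font></p>

```python
import glob
import os
import re


def read_node_cpulists(sysfs_root="/sys/devices/system/node"):
    """Return {node_index: cpulist_string} read from the nodeN directories."""
    cpulists = {}
    for node_dir in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        match = re.search(r"node(\d+)$", node_dir)
        if match is None:
            continue  # skip anything that is not exactly nodeN
        with open(os.path.join(node_dir, "cpulist")) as handle:
            cpulists[int(match.group(1))] = handle.read().strip()
    return cpulists


def layout_lines(cpulists):
    """Format one 'cpus=... mem=...' line per NUMA node, ordered by node index."""
    return ["cpus={} mem={}".format(cpulists[i], i) for i in sorted(cpulists)]
```

<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">Printing the result of layout_lines(read_node_cpulists()) on the UV2000 would give one line per memory node.</font></p>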
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">On the server side I
created the /var/spool/torque/server_priv/nodes file with the
following content:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>lanina np=768
num_numa_nodes=96</b></font></p>
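<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">As a sanity check, the two files can be compared: np=768 over 96 NUMA nodes implies 8 CPUs per node, which matches the 8 CPUs on each mom.layout line. A small sketch of that check follows; the helper names are hypothetical and the parsing only covers the key=value forms shown in this message.</font></p>

```python
import re


def parse_nodes_line(line):
    """Split a nodes-file line into (hostname, {key: int_value})."""
    fields = line.split()
    settings = {}
    for field in fields[1:]:
        key, _, value = field.partition("=")
        if value.isdigit():  # ignore properties that are not key=integer
            settings[key] = int(value)
    return fields[0], settings


def count_layout_cpus(layout_text):
    """Count CPUs listed in 'cpus=' entries, expanding ranges such as 0-7."""
    total = 0
    for spec in re.findall(r"cpus=(\S+)", layout_text):
        for part in spec.split(","):
            first, _, last = part.partition("-")
            total += int(last or first) - int(first) + 1
    return total


def layout_matches_nodes(nodes_line, layout_text):
    """True when np equals the CPU count and the line count equals num_numa_nodes."""
    _, settings = parse_nodes_line(nodes_line)
    layout_nodes = len(re.findall(r"cpus=", layout_text))
    return (settings.get("np") == count_layout_cpus(layout_text)
            and settings.get("num_numa_nodes") == layout_nodes)
```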
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif"><b>* Results:</b></font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="text-indent:0.98cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">The server log, when
started in debug mode, shows:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;Log;Log opened</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0006;PBS_Server.588556;Svr;PBS_Server;Server &#39;lanina&#39; started, initialization type = 1</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;get_default_threads;Defaulting
min_threads to 3073 threads</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;Act;Account file
/var/spool/torque/server_priv/accounting/20130913 opened</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0040;PBS_Server.588556;Req;setup_nodes;setup_nodes()</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0040;PBS_Server.588556;Req;setup_nodes;could not create node
&quot;lanina&quot;, error = 15002</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0086;PBS_Server.588556;Svr;PBS_Server;Recovered queue batch</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0086;PBS_Server.588556;Svr;PBS_Server;Recovered queue
pesquisa</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0086;PBS_Server.588556;Svr;PBS_Server;Recovered queue
operacional</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;PBS_Server;Expected 3, recovered
3 queues</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0080;PBS_Server.588556;Svr;PBS_Server;2 total files read
from disk</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;PBS_Server;handle_job_recovery:3</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0006;PBS_Server.588556;Svr;PBS_Server;Using ports
Server:15001  Scheduler:15004  MOM:15002 (server:
&#39;lanina&#39;)</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:44;0002;PBS_Server.588556;Svr;PBS_Server;Server Ready, pid =
588556, loglevel=0</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013
10:55:55;0002;PBS_Server.588560;Svr;PBS_Server;Torque Server Version
= 4.2.3.1, loglevel = 0</font></p>
<p lang="en-US" align="LEFT" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif"><br>
</font></p>
<p lang="en-US" align="JUSTIFY" style="text-indent:0.98cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">The client log, when
started in debug mode, shows:</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
     
</p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
     
</p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013 10:55:49;0002;
  pbs_mom.588562;Svr;Log;Log opened</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013 10:55:49;0002;
  pbs_mom.588562;Svr;pbs_mom;Torque Mom Version = 4.2.3.1, loglevel =
0</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013 10:55:50;0002;
  pbs_mom.588562;Svr;setup_program_environment;machine topology
contains 96 memory nodes, 1536 cpus</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013 10:55:50;0001;
  pbs_mom.588562;Svr;pbs_mom;LOG_ERROR::read_layout_file, nodeboard 0
has no nodeset</font></p>
<p lang="en-US" align="LEFT" style="margin-left:1.25cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">09/13/2013 10:55:50;0001;
  pbs_mom.588562;Svr;pbs_mom;LOG_ERROR::setup_nodeboards, Could not
read layout file!</font></p>
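<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">To pull the failing entries out of both logs, a short sketch like the one below may help. It assumes each record is a single plain-text line with six semicolon-separated fields, as in the excerpts above; the field names and the marker strings are my own labels, not Torque terminology.</font></p>

```python
FIELD_NAMES = ("timestamp", "code", "daemon", "category", "source", "message")


def parse_log_record(line):
    """Split one log line into six semicolon-separated fields (names are assumed)."""
    parts = [part.strip() for part in line.split(";", 5)]
    if len(parts) != len(FIELD_NAMES):
        return None  # not a complete log record
    return dict(zip(FIELD_NAMES, parts))


def find_problems(lines, markers=("LOG_ERROR", "could not")):
    """Return parsed records whose message contains any of the marker strings."""
    problems = []
    for line in lines:
        record = parse_log_record(line)
        if record and any(marker in record["message"] for marker in markers):
            problems.append(record)
    return problems
```

<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">Run over the excerpts above, this would flag the setup_nodes entry on the server and the two LOG_ERROR entries on the client.</font></p>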
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
<font face="arial, helvetica, sans-serif"><br>
</font></p>
<p lang="en-US" align="JUSTIFY" style="text-indent:0.98cm;margin-bottom:0cm">
<font face="arial, helvetica, sans-serif">It looks as if Torque
cannot find the mom.layout file, but when I start the Torque client under
strace I can see it opening and reading the file.</font></p>
<p lang="en-US" align="JUSTIFY" style="margin-bottom:0cm">
    
</p>
<p lang="en-US" align="JUSTIFY" style="text-indent:0.9cm;margin-bottom:0cm">
    <font face="arial, helvetica, sans-serif">Any help? Thanks in advance.</font></p><div><font face="arial, helvetica, sans-serif"><br></font></div><font face="arial, helvetica, sans-serif">-- <br></font><div dir="ltr">
<font face="arial, helvetica, sans-serif">Att.<br>
MSc. Alison Barros da Silva</font><div><br></div></div>
</div>