<html><body><div style="color:#000; background-color:#fff; font-family:lucida console, sans-serif;font-size:14pt"><div><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;"><span><span><br>>This sounds like you are generating more IOPS than your storage system<br>>can deliver, probably because you are doing many small random<br>>requests.</span></span></span></span></div><div><span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;"><br></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">The cluster is diskless, so all I/O operations </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">are done on the server. I started <br></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent;
font-style: normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">"iostat 1" on </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">the server before starting the application on </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">the compute node.</span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;"> As you can see, the average </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">user CPU usage starts near 0%, rises to about 22%, <br></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style:
normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">and </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">then drops back to 0% when I terminate the </span></span><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">application on the node.</span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><br><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;"></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">The thing is, the read/write operations per
second are almost zero during the <br></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal; font-size: 13px; font-family: arial,helvetica,sans-serif;"><span style="font-size: 13px;"><span style="font-family: arial,helvetica,sans-serif;">application run, so I wonder why user CPU on the server is about 20%.<br></span></span></div><div style="color: rgb(0, 0, 0); background-color: transparent; font-style: normal;"><span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;"><br></span></span></div><div style="color: rgb(0, 0, 0); font-size: 18.6667px; font-family: lucida console,sans-serif; background-color: transparent; font-style: normal;"><span style="font-size: small;"><span style="font-family: arial,helvetica,sans-serif;">Device: tps kB_read/s kB_wrtn/s
kB_read kB_wrtn<br>sda 4.00 0.00 68.00 0 68<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 0.50 0.00 0.19 0.00 0.00 99.31<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 21.81 0.00 2.89 0.00 0.00 75.30<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 20.93 0.00 4.32 0.00 0.00 74.75<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 21.97 0.00 3.20 0.00 0.00 74.83<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 21.82 0.00 3.39 0.00 0.00 74.80<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 22.49 0.00 2.82 0.00 0.00 74.69<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 21.89 0.00 3.26 0.25 0.00 74.59<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 4.00 0.00 88.00 0 88<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 21.29 0.00 4.01 0.00 0.00 74.70<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 21.96 0.00 3.20 0.00 0.00 74.84<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 22.07 0.00 3.13 0.00 0.00 74.80<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 22.35 0.00 2.82 0.00 0.00 74.83<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 22.15 0.00 3.01 0.00 0.00 74.84<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 21.88 0.00 3.39 0.00 0.00 74.73<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 21.97 0.00 3.14 0.00 0.00 74.89<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 21.87 0.00 3.38 0.00 0.00 74.75<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 22.13 0.00 3.07 0.00 0.00 74.80<br><br>Device: tps kB_read/s kB_wrtn/s kB_read
kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00 0.00 0 0<br>sdc 0.00 0.00 0.00 0
0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00 0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu:
%user %nice %system %iowait %steal %idle<br> 0.63 0.00 0.69 0.00 0.00 98.68<br><br>Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn<br>sda 0.00 0.00 0.00 0 0<br>sdb 0.00 0.00
0.00 0 0<br>sdc 0.00 0.00 0.00 0 0<br>sdd 0.00 0.00 0.00 0 0<br>sde 0.00 0.00
0.00 0 0<br>dm-0 0.00 0.00 0.00 0 0<br><br>avg-cpu: %user %nice %system %iowait %steal %idle<br> 0.06 0.00 0.00 0.00 0.00 99.94<br><br></span></span><br><span></span></div><div> </div><div><div><font style="BACKGROUND-COLOR:#ffffff;" color="#0080ff" face="arial, helvetica, sans-serif" size="2"><span style="color:rgb(0, 0, 0);">Regards,</span><br style="color:rgb(0, 0, 0);"><span
style="color:rgb(0, 0, 0);">Mahmood</span><b><br></b></font></div></div><div><br></div> <div style="font-family: lucida console, sans-serif; font-size: 14pt;"> <div style="font-family: times new roman, new york, times, serif; font-size: 12pt;"> <div dir="ltr"> <font face="Arial" size="2"> <hr size="1"> <b><span style="font-weight:bold;">From:</span></b> Jonathan Barber &lt;jonathan.barber@gmail.com&gt;<br> <b><span style="font-weight: bold;">To:</span></b> Mahmood Naderan &lt;nt_mahmood@yahoo.com&gt;; Torque Users Mailing List &lt;torqueusers@supercluster.org&gt; <br> <b><span style="font-weight: bold;">Sent:</span></b> Thursday, October 18, 2012 11:09 AM<br> <b><span style="font-weight: bold;">Subject:</span></b> Re: [torqueusers] low network utilization<br> </font> </div> <br>
On 17 October 2012 20:32, Mahmood Naderan &lt;<a ymailto="mailto:nt_mahmood@yahoo.com" href="mailto:nt_mahmood@yahoo.com">nt_mahmood@yahoo.com</a>&gt; wrote:<br>> Dear all,<br>> I have noticed that when I submit a job on a working node, the network speed<br>> is about 20 Mb/s. That is quite slow because the switch speed is 1000 Mb/s. That<br>> causes the processes to be in "D" state and the CPU usage is well below<br>> 100%.<br><br>This sounds like you are generating more IOPS than your storage system<br>can deliver, probably because you are doing many small random<br>requests.<br><br>You should first check that the server NIC and the switch port are<br>both running at 1GbE (use "ethtool" on the host, and connect to<br>the switch to verify the port status).<br><br>On the NFS server (assuming Linux), check the block device that<br>backs the NFS-exported file system with "iostat -kx 1". If you see<br> ~100% in the "%util" column,
then you are limited by the storage<br>system.<br><br>You can monitor the host network throughput with "iftop" (assuming Linux).<br><br>You can get a crude idea of your baseline NFS performance by using dd<br>with large files (larger than the largest amount of memory available to the<br>server and client) and reading and writing them from the client.<br><br>For better measurements, I suggest fio:<br>http://freecode.com/projects/fio<br><br>although it is a lot more complicated to interpret the results.<br><br>Cheers<br><br>> I thought there was a problem with NFS; however, the stats show about 1.3k<br>> requests per second, which is not really high.<br><br>> Maybe Torque transfers data (from the worker to the server, which has the disks) quickly.<br>><br>> How can I investigate more?<br>><br>> Regards,<br>> Mahmood<br>><br><br><br><br>-- <br>Jonathan Barber &lt;<a ymailto="mailto:jonathan.barber@gmail.com" href="mailto:jonathan.barber@gmail.com">jonathan.barber@gmail.com</a>&gt;<br><br><br> </div> </div> </div></body></html>