<br><br><div class="gmail_quote">On Wed, Apr 4, 2012 at 9:50 AM, Gus Correa <span dir="ltr"><<a href="mailto:gus@ldeo.columbia.edu">gus@ldeo.columbia.edu</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hi David<br>
<br>
Not to hijack Steven's thread ...<br>
... but just taking a quick ride on it ... :)<br>
<br>
Does the hwloc 1.1 requirement apply only to Torque 4.0?<br>
How about the older Torque series [2.X.Y, 3.X.Y]<br>
that use cpuset?<br>
[I am in the process of building 2.4.16 with cpuset.]<br>
<br></blockquote><div><br></div><div>This only applies to 4.0 and higher.</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Thank you,<br>
Gus Correa<br>
<div class="im"><br>
On 04/04/2012 10:59 AM, David Beer wrote:<br>
> Steven,<br>
><br>
> I was supposed to add that note and I forgot - my mistake and thanks for<br>
> catching it. I have now added:<br>
><br>
> *** For admins that use cpusets in any form ***<br>
> hwloc version 1.1 or greater is now required for building TORQUE with<br>
> cpusets, as pbs_mom now uses the<br>
> hwloc API to create the cpusets instead of creating them manually.<br>
><br>
> to README.building_40.<br>
><br>
> As far as checking for the existence of the library, this does happen at<br>
> configure time once the configure script determines that the user is<br>
> going to be using cpusets in any way, which a few different configure<br>
> options can trigger.<br>
><br>
> David<br>
><br>
> On Tue, Apr 3, 2012 at 8:15 PM, DuChene, StevenX A<br>
</div><div class="im">> <<a href="mailto:stevenx.a.duchene@intel.com">stevenx.a.duchene@intel.com</a> <mailto:<a href="mailto:stevenx.a.duchene@intel.com">stevenx.a.duchene@intel.com</a>>> wrote:<br>
><br>
> I installed hwloc-1.4.1 and hwloc-devel-1.4.1 rpms on the server<br>
> where I am building torque-4.X and in looking through the output<br>
> from the configure script during the build I do not see anywhere<br>
> that the existence of any hwloc stuff is checked. In fact in<br>
> grepping through the output from the whole torque rpm build process<br>
</div>> I do not see ANY mention of hwloc at all.<br>
<div class="im">><br>
> I see compile time flags of HWLOC_CFLAGS and HWLOC_LIBS mentioned in<br>
> the --help output from configure, but according to the description<br>
> text this is just supposed to override the pkg-config results.<br>
> However, I do not see any evidence that the pkg-config system is<br>
</div>> being quizzed at all for the existence of hwloc on the build server.<br>
><br>
> Is there some step I am missing?<br>
<div class="im">><br>
> I thought someone mentioned that there would be better documentation<br>
</div>> of the hwloc business in the torque-4.0.1 release?<br>
><br>
> If so where is it?<br>
><br>
> --<br>
> Steven DuChene<br>
><br>
> *From:*<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a><br>
> <mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a>><br>
> [mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a><br>
> <mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a>>] *On Behalf Of *David Beer<br>
> *Sent:* Monday, March 19, 2012 8:54 AM<br>
> *To:* Torque Users Mailing List<br>
> *Subject:* Re: [torqueusers] TORQUE 4.0 Officially Announced<br>
><br>
> Steve,<br>
<div class="im">><br>
> Hwloc is now required for running cpusets in TORQUE, and it helps<br>
> out a lot both in immediate use and in groundwork for future<br>
</div>> features.<br>
<div class="im">><br>
> Immediately hwloc gives you a better cpuset because it gives you the<br>
> next core instead of the next indexed core. For example: many eight<br>
> core systems have processors 0, 2, 4, and 6 next to each other and<br>
> processors 1, 3, 5, and 7 next to each other. If you're running a<br>
> pre-4.0 TORQUE, and you have two jobs on the node, each with 4<br>
> cores, job 1 will have 0-3 and job 2 will have 4-7. In TORQUE 4.0,<br>
> job 1 will have 0, 2, 4, and 6, and job 2 will have 1, 3, 5, and 7.<br>
> This should help speed up processing times for jobs (NOTE: only if<br>
> you have this kind of system and a comparable job layout, I'm not<br>
> promising a general speed-up to everyone using cpusets). This should<br>
> also allow us to properly handle hyperthreading for anyone that has<br>
</div>> it turned on and wishes to use it.<br>
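A minimal sketch of the placement difference described above, assuming the eight-core layout from the example (even-numbered processors next to each other, odd-numbered processors next to each other). The names SOCKET_OF_CPU, assign_by_index, assign_by_topology, and place_jobs are hypothetical and only model the described behavior; this is not TORQUE or hwloc code.<br>
<pre>
```python
# Hypothetical model of the eight-core node from the example:
# even CPU indices sit together on one socket, odd indices on the other.
SOCKET_OF_CPU = {cpu: cpu % 2 for cpu in range(8)}


def assign_by_index(free_cpus, count):
    """Pre-4.0 behavior as described: take the next CPUs in index order."""
    return sorted(free_cpus)[:count]


def assign_by_topology(free_cpus, count, socket_of_cpu):
    """Sketch of the 4.0 behavior: prefer CPUs that are physically adjacent.

    Assumption: adjacency is modeled simply by grouping CPUs per socket.
    """
    ordered = sorted(free_cpus, key=lambda cpu: (socket_of_cpu[cpu], cpu))
    return ordered[:count]


def place_jobs(assign, job_sizes, **kwargs):
    """Place jobs one after another, removing assigned CPUs from the free set."""
    free = set(SOCKET_OF_CPU)
    placements = []
    for size in job_sizes:
        chosen = assign(free, size, **kwargs)
        free -= set(chosen)
        placements.append(chosen)
    return placements


if __name__ == "__main__":
    # Two jobs of 4 cores each, as in the example.
    print(place_jobs(assign_by_index, [4, 4]))
    print(place_jobs(assign_by_topology, [4, 4], socket_of_cpu=SOCKET_OF_CPU))
```
</pre>
Run as a script, the first line shows the index-ordered split (0-3 and 4-7) and the second shows the adjacency-ordered split (0, 2, 4, 6 and 1, 3, 5, 7) from the example.<br>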
<div class="im">><br>
> The last immediate feature is for SMT (simultaneous<br>
> multi-threading) hardware. The mom config variable $use_smt was<br>
> added. By default, the use of SMT is enabled, but you can tell your<br>
> pbs_mom to ignore the hardware threads (not place them in the cpuset) by<br>
</div>> adding<br>
><br>
> $use_smt false<br>
><br>
> to your mom config file.<br>
<div class="im">><br>
> For the future, the hwloc threads make it really easy for us to<br>
> handle hardware specific requests. One of the coming features for<br>
</div>> TORQUE is to allow requests roughly similar to:<br>
><br>
> socket=2:numa=2 --with-hyperthreads<br>
<div class="im">><br>
> which would say to spread the job over 2 sockets, and across the 2<br>
> numa nodes on each socket. This is a feature we plan to add to<br>
> improve support for Magny-Cours and Opteron type processors that<br>
> have multiple sockets and/or multiple numa nodes on the processor<br>
> chip. Using hwloc makes it so we don't have to parse system files<br>
> and map the indices to the sockets and/or numa nodes ourselves; we<br>
> can simply use easy hwloc functions<br>
> like hwloc_get_next_obj_inside_cpuset_by_type() that allow you to<br>
> just move on to the next physical core or virtual core, or skip to<br>
</div>> the next socket or numa node as the case may be.<br>
><br>
> David<br>
<div class="im">><br>
> On Mon, Mar 19, 2012 at 8:47 AM, DuChene, StevenX A<br>
</div>> <<a href="mailto:stevenx.a.duchene@intel.com">stevenx.a.duchene@intel.com</a> <mailto:<a href="mailto:stevenx.a.duchene@intel.com">stevenx.a.duchene@intel.com</a>>><br>
> wrote:<br>
<div class="im">><br>
> Also a better (more complete) explanation of what features are<br>
> enabled when hwloc is used would be helpful as well.<br>
><br>
> BTW, I built torque on my server without hwloc installed and then<br>
> installed the resulting mom packages on my nodes. The mom daemons in<br>
> that case did seem to start up just fine.<br>
> --<br>
</div>> Steven DuChene<br>
<div class="im">><br>
><br>
> -----Original Message-----<br>
> From: <a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a><br>
> <mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a>><br>
</div><div class="im">> [mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a><br>
> <mailto:<a href="mailto:torqueusers-bounces@supercluster.org">torqueusers-bounces@supercluster.org</a>>] On Behalf Of Craig West<br>
> Sent: Sunday, March 18, 2012 10:40 PM<br>
</div><div><div class="h5">> To: Torque Users mailing list; Torque Developers mailing list<br>
><br>
> Subject: Re: [torqueusers] TORQUE 4.0 Officially Announced<br>
><br>
><br>
> Hi Steven,<br>
><br>
> I have just begun testing Torque 4.0, as hwloc has been a long awaited<br>
> feature for me.<br>
><br>
> > It is unclear from this announcement text where hwloc has to be<br>
> installed.<br>
> > Is it just on the server or on the nodes only?<br>
><br>
> It needs to be available on the BUILD server and the nodes. I tried to<br>
> run pbs_mom on a node without the hwloc I had installed and it failed.<br>
><br>
> Note: I am running hwloc 1.4 from a directory in /usr/local<br>
> This was not automatically found by the TORQUE configure script, but you<br>
> can specify the location using HWLOC_CFLAGS & HWLOC_LIBS.<br>
> It embeds the locations that you specify in the pbs_mom (and other<br>
> files) but it seems you can set the LD_LIBRARY_PATH variable if it is<br>
> not in the same location on the BUILD server as the compute nodes.<br>
> For simplicity installing them in the same location makes sense.<br>
><br>
> > More documentation about this would be greatly appreciated.<br>
><br>
> I agree, clearer and more detailed documentation would be useful.<br>
><br>
> Cheers,<br>
> Craig.<br>
> _______________________________________________<br>
> torqueusers mailing list<br>
</div></div>> <a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a> <mailto:<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a>><br>
<div class="im">> <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
</div>
><br>
><br>
><br>
> -- <br>
> David Beer | Software Engineer<br>
> Adaptive Computing<br>
><br>
><br>
> _______________________________________________<br>
> torqueusers mailing list<br>
> <a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a> <mailto:<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a>><br>
<div class="im HOEnZb">> <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
><br>
><br>
><br>
><br>
> --<br>
> David Beer | Software Engineer<br>
> Adaptive Computing<br>
><br>
><br>
><br>
> _______________________________________________<br>
> torqueusers mailing list<br>
> <a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
> <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
<br>
</div><div class="HOEnZb"><div class="h5">_______________________________________________<br>
torqueusers mailing list<br>
<a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
<a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div>David Beer | Software Engineer</div><div>Adaptive Computing</div><br>