[Mauiusers] Maui and Generic Resources
mamonski at man.poznan.pl
Sun Apr 10 12:51:00 MDT 2011
I also could not get GRES working (including the GPU use case, i.e. node-locked
consumable resources). Eventually I found some time to dig into
the Maui sources. The starting point was this patch:
The aforementioned patch was already applied in 3.3.1, but even when I
used -lsoftware instead of -W x="GRES.." it did not work. So I
dug further... I realized that the "x=GRES" extension wasn't even parsed in
Maui. There was also some missing code elsewhere. I tried to add the
missing parts (see attached patch; I was using Maui 3.3.1). It seems
to work (DISCLAIMER: I'm not a Maui developer). There is at least one
caveat: GRES are requested on a per-task basis. For example, imagine that you
have 8-core machines with 2 GPUs each, and an application that uses:
1. one CPU core, one GPU:
qsub -W x='GRES:gpu@1' #works
2. one CPU core, both GPUs on one machine:
qsub -lnodes=1:ppn=1 -W x='GRES:gpu@2' #works
3. two GPUs on two hosts
qsub -lnodes=2:ppn=1 -W x='GRES:gpu@2' #works
4. you want all GPUs and all CPU cores on two hosts
qsub -lnodes=2:ppn=8 -W x='GRES:gpu@1' #does not work - because the
job requests 16 GPUs on two hosts. But if you request
exclusive access to the machines you do not need to specify GRES at all...
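
To make the per-task accounting concrete, here is a small Python sketch of
the arithmetic behind the four cases above. It is not Maui code: the function
names are made up for illustration, and it assumes one task per requested
core (tasks = nodes * ppn) and 2 GPUs per host, as in the example machines.

```python
# Illustrative only: models the per-task GRES arithmetic described above.
# Assumption: Maui creates one task per requested core (nodes * ppn tasks).

GPUS_PER_HOST = 2  # the example machines have 2 GPUs each


def gres_demand(nodes, ppn, gres_per_task):
    """Return (total, per_host) GRES units requested by a job."""
    per_host = ppn * gres_per_task  # every task on a host asks for its own GRES
    total = nodes * per_host
    return total, per_host


def fits(nodes, ppn, gres_per_task, gpus_per_host=GPUS_PER_HOST):
    """True if the per-host demand does not exceed what one host offers."""
    _, per_host = gres_demand(nodes, ppn, gres_per_task)
    return per_host <= gpus_per_host


if __name__ == "__main__":
    # (description, nodes, ppn, GRES count per task)
    cases = [
        ("1 core, 1 GPU", 1, 1, 1),
        ("1 core, 2 GPUs on one host", 1, 1, 2),
        ("nodes=2:ppn=1, gpu@2", 2, 1, 2),
        ("nodes=2:ppn=8, gpu@1", 2, 8, 1),
    ]
    for desc, nodes, ppn, per_task in cases:
        total, per_host = gres_demand(nodes, ppn, per_task)
        status = "fits" if fits(nodes, ppn, per_task) else "does not fit"
        print(f"{desc}: {total} GPUs total, {per_host} per host -> {status}")
```

Under these assumptions the last case asks for 8 GPUs per host (16 in total),
which is why it cannot be scheduled on hosts with only 2 GPUs.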
Does anyone know what the official process for submitting a patch is?
I know there is a Bugzilla at clusterresources.com, but it seems to
be dedicated to Torque only.
On 23 March 2011 17:09, Mike Mosley <jmmosley at uncc.edu> wrote:
> I’ve seen several posts regarding what seems to be an inability to get Maui
> to work with Generic Resources (GRES). Does anybody have this working and
> if so what are the steps you used to configure it?
> My environment:
> Torque 2.5.5
> Maui 3.3.1
> I have a number of compute nodes which have 3 GPUs each.
> I created the following entries in maui.cfg
> NODECFG[compute1] GRES=ngpus:3
> NODECFG[compute2] GRES=ngpus:3
> etc. etc.
> I then tried submitting a job along the lines of:
> qsub -l nodes=1 -W x="GRES:ngpus@3" my_script
> The job gets scheduled and executed on a compute node and the ngpus
> specification is ignored. By that, I mean that I can take the resource
> definition out for compute2 and the job may still get
> scheduled there even though I’ve asked for a node with that resource in my
> qsub command.
-------------- next part --------------
A non-text attachment was scrubbed...
Size: 3504 bytes
Desc: not available
Url : http://www.supercluster.org/pipermail/mauiusers/attachments/20110410/5909a0f9/attachment.obj