[Mauiusers] Re: How to configure some limits
Rob Lines
rlinesseagate at gmail.com
Tue Mar 18 06:41:39 MDT 2008
On Thu, Mar 13, 2008 at 4:46 AM, Steve Traylen <steve.traylen at cern.ch>
wrote:
>
>
> On Mar 12, 2008, at 9:44 PM, Rob Lines wrote:
>
> > I apologize to anyone who sees this twice. I somehow missed that
> > there is a separate list for Maui from the Torque list.
> >
> > Hi everyone. We are new to Torque/Maui and we are still getting a
> > feel for it. We would like to put into place some limits so that
> > the cluster is more fairly shared.
> >
> > On our old clusters we had a limit that no one person could use
> > more than 90% of the job slots. This allowed people to submit
> > thousands of jobs in a batch and let them go, but still left a
> > number of slots for other people to run jobs.
> >
> > In moving to Torque/Maui we are looking to do something similar,
> > though as we have more nodes it would be nice to adjust that a bit.
> > If only one person were running jobs at a given moment, they could
> > use all the slots, but the moment anyone else submitted a job it
> > would become the next one to run, even if the first person had many
> > more jobs waiting that had been waiting longer.
> >
>
> Have a look at the soft/hard limits here.
>
> http://www.clusterresources.com/products/maui/docs/6.2throttlingpolicies.shtml
>
> For, say, a 100-job cluster,
>
> USERCFG[DEFAULT] MAXJOB=90,110
>
> should do something similar to what you want.
>
We have 188 job slots, so I added

USERCFG[DEFAULT] MAXJOB=94,190

with the goal that, during heavy usage, one person could use at most half
of the processors available. The problem we have run into is that it
doesn't seem to allow those heavy users to take advantage of the free slots
when no one else is using them. Any suggestions on where to look to find
out why it isn't?

It worked out well when we had a couple of people all trying to use the
same cluster, as it shared the resource in a reasonable manner. One person
had a few hundred jobs that took only about 4 hours each to complete, and
another had about 100 jobs that took in the neighborhood of 12 hours each.
The cluster filled up to the limit with the longer jobs, but the shorter
jobs were able to keep rotating in and out without a problem. Today was the
first time since then that I have seen only one person with jobs queued. In
the short term I raised the first MAXJOB number to 150, and now that user
has 150 jobs running, but it leaves us with 38 empty slots. Earlier he had
94 jobs running, with the remainder in a blocked state.
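
To make the behavior we are hoping for concrete, here is a small Python
sketch of the policy described above. It is only a model of the intended
outcome, not of how Maui actually evaluates MAXJOB. The function name and
the two-pass logic are assumptions made up for illustration, and it treats
the first number as the soft limit and the second as the hard limit,
following the soft/hard wording in the reply.

```python
def intended_running(queued_by_user, total_slots, soft, hard):
    """Hypothetical model of the hoped-for sharing policy (not Maui's code).

    queued_by_user maps a user name to the number of jobs that user has
    submitted. Returns a dict of how many jobs each user would be running.
    """
    running = {user: 0 for user in queued_by_user}
    free = total_slots

    # Pass 1: every user may start jobs up to the soft limit.
    for user, queued in queued_by_user.items():
        start = min(queued, soft, free)
        running[user] += start
        free -= start

    # Pass 2: slots still idle go to users with jobs waiting,
    # up to the hard limit.
    for user, queued in queued_by_user.items():
        extra = min(queued - running[user], hard - running[user], free)
        running[user] += extra
        free -= extra

    return running


# One busy user alone on the cluster: we would like all 188 slots used.
print(intended_running({"user_a": 300}, 188, 94, 190))

# Two busy users: each should be held to the soft limit of 94.
print(intended_running({"user_a": 300, "user_b": 100}, 188, 94, 190))
```

Under this model a lone user with 300 queued jobs would fill all 188 slots,
whereas what we actually see is that user held at the first number (94, or
150 after the change) with the rest of the slots idle.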
Thank you,
Rob