[Moabusers] large number of jobs queued

Justin Bronder jsbronder at gmail.com
Wed Sep 27 09:14:57 MDT 2006


On 9/27/06, Thomas Raisor <thunder at et.byu.edu> wrote:
>
> Curious,
>
> is it possible through some other policy, or would there be interest in, a
> MAXJOBPerUser parameter?


I know you can specify the following on a per-class basis.  If you are using
a single class, or a remap class, this might be effective.

CLASSCFG[linux-spool] MAXJOB[USER]=1

-Justin.


> When I set MAXJOB to 25000 the moab process was
> consuming a GB of RAM and was pegging the CPU. Really the large number
> of jobs came from just a few users, but newly submitted jobs from other
> users weren't even considered for execution even though much of the
> cluster was idling (due to fairness policies that won't allow single
> users to have more than a certain number of running jobs). If I could
> set a MAXJOBPERUSER that would let the scheduler consider up to a
> certain number of jobs per user, with a MAXJOB cap (as it is now), then
> I think my server would not be so overburdened.
>
> Thoughts?
>
> Tom
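
As a rough illustration of the per-user cap proposed above (not how Moab is
implemented), the sketch below compares a single overall cap with an overall
cap plus a per-user cap. The function name, the user names, and the
max_job_per_user argument are hypothetical; only the overall cap corresponds
to the existing MAXJOB limit, and 4096 is the figure mentioned in this thread.

```python
from collections import defaultdict


def select_considered(queue, max_job, max_job_per_user):
    """Return the jobs a scheduler would consider, in queue order.

    queue: iterable of (job_id, user) tuples in submission order.
    max_job: overall cap on considered jobs (like the existing MAXJOB cap).
    max_job_per_user: hypothetical per-user cap (the proposed MAXJOBPERUSER).
    """
    per_user = defaultdict(int)
    considered = []
    for job_id, user in queue:
        if len(considered) >= max_job:
            break  # overall cap reached, stop scanning
        if per_user[user] >= max_job_per_user:
            continue  # skip this user's excess jobs, keep scanning
        per_user[user] += 1
        considered.append((job_id, user))
    return considered


# Two heavy users flood the queue ahead of one light user.
queue = [(i, "alice") for i in range(5000)]
queue += [(5000 + i, "bob") for i in range(5000)]
queue += [(10000, "carol")]

# Overall cap only: the per-user cap is set high enough to never trigger.
plain = select_considered(queue, max_job=4096, max_job_per_user=len(queue))
# Overall cap plus a per-user cap of 100 jobs.
capped = select_considered(queue, max_job=4096, max_job_per_user=100)
```

With only the overall cap, all 4096 considered jobs belong to the first heavy
user. With a per-user cap of 100, 201 jobs are considered and the light user's
job is among them.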
> --
>
> wightman wrote:
> > FYI, there are a few of these parameters that can be tweaked at
> > configuration time.  I'm not sure why they aren't documented but open up
> > configure and search on "max".
> >
> > For jobs you can configure with "--with-maxjobs=<number>".
> >
> > - Douglas
> >
> > On Tue, 2006-09-26 at 10:53 -0400, Justin Bronder wrote:
> >
> >> On 9/25/06, Thomas G. Raisor <thunder at et.byu.edu> wrote:
> >>         Hi,
> >>
> >>         I have about 25,000 jobs in my torque queue right now, but moab
> >>         is only seeing roughly the first 4100 (using showq).  Jobs not
> >>         shown with showq give the following error when I do a checkjob
> >>         on them.
> >>
> >>         ERROR:  cannot locate job 'jobid'
> >>
> >>         I can run the jobs with qrun with no problems. This is a vanilla
> >>         install of moab - are there default parameters I need to
> >>         increase? I was using an older patch release of moab 4.5,
> >>         updated to the latest and get the same behavior. Could it be
> >>         torque not communicating its jobs to moab? torque version is
> >>         2.1.0 - (yes, I know I am a little out of date, but I haven't
> >>         had any problems until now.)
> >>
> >>         Suggestions?
> >>
> >>         Tom
> >>
> >>         _______________________________________________
> >>         moabusers mailing list
> >>         moabusers at supercluster.org
> >>         http://www.supercluster.org/mailman/listinfo/moabusers
> >>
> >>
> >>
> >> It should be seeing the first 4096 jobs actually.  Anyways, you can
> >> adjust the default limits, which is something I had to do for
> >> partitions.  Refer to the following page and the MAXJOB variable.
> >>
> >> http://www.clusterresources.com/products/mwm/docs/a.ddevelopment.shtml
> >>
> >> -Justin.
> >>
> >
> >
> >
>
>
>
> --
> Tom Raisor
> Director - Fulton Supercomputing Lab
> Brigham Young University
> 801 422 4267
> tom_raisor at byu.edu