[Moabusers] spreading user priority

Gareth.Williams at csiro.au Gareth.Williams at csiro.au
Tue Feb 5 19:31:00 MST 2008


> -----Original Message-----
> From: moabusers-bounces at supercluster.org
> [mailto:moabusers-bounces at supercluster.org] On Behalf Of Williams,
> Gareth (HPSC, Melbourne - HPSC)
> Sent: Monday, 4 February 2008 11:02 AM
> To: Lennart.Karlsson at nsc.liu.se
> Cc: moabusers at supercluster.org
> Subject: RE: [Moabusers] spreading user priority
> 
> Hi Lennart,
> 
> Thanks for your email. I think that the suggestion to have a small
> MAXIJOB could be useful for us.  We want most of the priority to be
> based on fair-share and this use of MAXIJOB with an appropriate weight
> on QUEUETIME and/or XFACTOR will effectively spread the priority of a
> given user's jobs from this baseline.
> 
> I will try this and see if there are undesirable side-effects. I don't
> know which jobs blocked by the MAXIJOB setting will become eligible
> first.
> 
> Of course, the idea is not 'randomizing the queue' (though that is
> interesting!); it is systematically spreading a given user's priority.
> 
> Gareth Williams

The small MAXIJOB setting does spread each user's priority and tends to
interleave different users' jobs, except when fairshare dominates.
That is all good.  However, there is a side effect: the jobs blocked by
MAXIJOB get unblocked in an unpredictable order.  I would expect, and
would like, jobs to become unblocked in jobid order or according to
other priority considerations.  Has anyone experienced similar
behaviour?

The behaviour I observe allows for individual jobs to remain blocked,
potentially indefinitely, while the user has enough other jobs queued
(and starting).
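
To illustrate why the unblocking order matters, here is a toy sketch in
Python.  It is not a model of Moab's internals: the simulate function,
the one-job-per-cycle arrival and start rates, and the use of max as a
stand-in for an unfavourable unblocking order are all assumptions made
up for illustration.

```python
def simulate(pick, cycles=10, max_ijob=1):
    """Toy model of one user's queue behind a MAXIJOB-style limit.

    `pick` chooses which blocked job id becomes eligible next.
    Returns the job ids in the order they started.
    """
    blocked = [1, 2, 3]   # job ids held back by the limit
    eligible = []         # at most max_ijob jobs may be eligible at once
    started = []
    next_id = 4
    for _ in range(cycles):
        # Unblock jobs until the eligible limit is reached.
        while blocked and len(eligible) < max_ijob:
            job = pick(blocked)
            blocked.remove(job)
            eligible.append(job)
        # Assumption: exactly one eligible job starts per cycle.
        if eligible:
            started.append(eligible.pop(0))
        # Assumption: the user submits one new job per cycle.
        blocked.append(next_id)
        next_id += 1
    return started


in_order = simulate(min)  # unblock lowest jobid first
newest = simulate(max)    # hypothetical unfavourable order: newest first

print(in_order)
print(newest)
```

With jobid order every job starts in turn.  With the newest-first
stand-in, jobs 1 and 2 never start for as long as the user keeps
submitting, which is the kind of indefinite blocking described above.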

-- Gareth

> 
> > -----Original Message-----
> > From: moabusers-bounces at supercluster.org
> > [mailto:moabusers-bounces at supercluster.org] On Behalf Of Lennart Karlsson
> > Sent: Friday, 1 February 2008 9:35 AM
> > To: Williams, Gareth (HPSC, Melbourne - HPSC)
> > Cc: moabusers at supercluster.org
> > Subject: Re: [Moabusers] spreading user priority
> >
> > Hi Gareth,
> >
> > I like your idea of randomizing the queue, when you get bored with
> > fairness and defending your rightful place in the queue! :-)
> >
> > But it is much simpler to throw out the "fairness" algorithm
> > and lower the MAXIJOB value until you are satisfied that different
> > users are mixed enough within the queue.
> >
> > If some users then need a higher job starting rate,
> > give them a higher MAXIJOB value than the others.
> >
> > Just my two cents,
> > -- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
> >    National Supercomputer Centre in Linkoping, Sweden
> >    http://www.nsc.liu.se
> >
> >
> > > We have a scheduling problem on our cluster with a number of users
> > > submitting ensembles of jobs which all have similar requirements and
> > > are submitted all together (I expect this is pretty common).  Our job
> > > priority is set by a combination of fairshare and queue (+xfactor)
> > > time, and both of these change in the same way for the whole set of
> > > jobs, so one user dominates the top of the queue until the fairshare
> > > (usage) changes enough or shorter jobs get enough xfactor advantage to
> > > displace longer jobs.
> > >
> > > I think that I want a mechanism to spread the priority of a given
> > > user's jobs. I think it would have to take the order of a given user's
> > > jobs and then suppress the priority of all but the first job by a
> > > constant or per-credential configurable factor, e.g. for a set of jobs
> > > with current priority:
> > >
> > > 100 99 99 98 97 97 97
> > >
> > > and a constant factor of 2, the priorities would become
> > >
> > > 100 97 95 92 89 87 85
> > >
> > > This would have to include running jobs, otherwise as soon as one job
> > > started, subsequent jobs would just take its place (at least on the
> > > next scheduling cycle) and the user-spread factor would not
> > > effectively do anything.
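
A minimal sketch of the rule described above, in Python.  The function
name spread_priorities is hypothetical, and I am assuming the
suppression is the job's rank times the constant factor, which
reproduces the numbers in the example.  The list passed in should cover
all of the user's jobs, running ones included, for the reason just
given.

```python
def spread_priorities(priorities, factor):
    """Spread one user's job priorities.

    The highest-priority job keeps its priority; the job at rank n
    (0-based, highest first) has n * factor subtracted from it.
    """
    ranked = sorted(priorities, reverse=True)
    return [p - rank * factor for rank, p in enumerate(ranked)]


print(spread_priorities([100, 99, 99, 98, 97, 97, 97], 2))
# [100, 97, 95, 92, 89, 87, 85]
```
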
> > >
> > > Unfortunately it seems that this would require an extra pass when
> > > calculating priorities which would not be ideal for maintaining
> > > performance of moab itself.
> > >
> > > Please, do CRI and/or the community consider this as a valuable
> > > development proposal?
> > >
> > > One could do a brute-force proof-of-concept of this by periodically
> > > resetting the user priority of jobs (via torque's qalter), but that
> > > would be quite heavy-weight on both torque and moab so I've been
> > > holding off.  Perhaps using api calls or communicating with moab rather
> > > than torque to reset user priorities would be more efficient.  Is this
> > > possible?
> > >
> > > Note that we already have in place a number of features which
> > > moderate this issue, including generous MAXIJOB settings, node
> > > features for jobs with specific requirements and standing
> > > reservations accessible by only shorter jobs. Also, note that parallel
> > > jobs are most badly affected because they are rarely eligible for
> > > backfill.
> >
> >
> > _______________________________________________
> > moabusers mailing list
> > moabusers at supercluster.org
> > http://www.supercluster.org/mailman/listinfo/moabusers
> 
> 



