[Moabusers] spreading user priority
Gareth.Williams at csiro.au
Tue Jan 29 20:12:39 MST 2008
Hi,
We have a scheduling problem on our cluster: a number of users submit
ensembles of jobs which all have similar requirements and are submitted
all together (I expect this is pretty common). Our job priority is set
by a combination of fairshare and queue time (+xfactor), and both of
these change in the same way for the whole set of jobs, so one user
dominates the top of the queue until the fairshare (usage) changes
enough or shorter jobs get enough xfactor advantage to displace longer
jobs.
I think that I want a mechanism to spread the priority of a given user's
jobs. I think it would have to take the order of a given user's jobs and
then suppress the priority of all but the first job by a constant or
per-credential configurable factor, with each successive job lowered by
one more multiple of that factor. E.g. for a set of jobs with current
priority:
100 99 99 98 97 97 97
and a constant factor of 2, the priorities would become
100 97 95 92 89 87 85
This would have to include running jobs, otherwise as soon as one job
started, subsequent jobs would just take its place (at least on the
next scheduling cycle) and the user-spread factor would not effectively
do anything.
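
To make the arithmetic concrete, here is a minimal sketch of the spreading
rule in Python. It is only an illustration of the idea, not anything Moab
provides: the function name spread_priorities, the tuple layout and the
choice to rank running jobs ahead of idle ones are all assumptions made
for the example.

```python
from collections import defaultdict

def spread_priorities(jobs, factor=2):
    """Return {job_id: adjusted_priority}.

    jobs is a list of (job_id, user, priority, running) tuples
    (a hypothetical layout chosen for this sketch).
    """
    by_user = defaultdict(list)
    for job in jobs:
        by_user[job[1]].append(job)

    adjusted = {}
    for user_jobs in by_user.values():
        # Running jobs are ranked first so that they keep "occupying" the
        # top slots; idle jobs follow in descending priority order.
        # sorted() is stable, so ties keep their original order.
        ordered = sorted(user_jobs, key=lambda j: (not j[3], -j[2]))
        for rank, (job_id, _user, priority, _running) in enumerate(ordered):
            # The n-th job of a user loses n * factor priority points.
            adjusted[job_id] = priority - factor * rank
    return adjusted

# The example from the text: one user, seven idle jobs, factor 2.
example = [("j%d" % i, "alice", p, False)
           for i, p in enumerate([100, 99, 99, 98, 97, 97, 97])]
print(list(spread_priorities(example).values()))
# -> [100, 97, 95, 92, 89, 87, 85]
```

Because running jobs are counted in the ranking, a user with one job already
running has their best idle job pushed down by one step, which is the effect
described above.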
Unfortunately it seems that this would require an extra pass when
calculating priorities which would not be ideal for maintaining
performance of moab itself.
Would CRI and/or the community consider this a valuable development
proposal?
One could do a brute-force proof-of-concept of this by periodically
resetting the user priority of jobs (via torque's qalter), but that
would be quite heavy-weight on both torque and moab, so I've been
holding off. Perhaps using API calls or communicating with moab rather
than torque to reset user priorities would be more efficient. Is this
possible?
thanks,
Gareth
ps.
Note that we already have in place a number of features which moderate
this issue, including generous MAXIJOB settings, node features for jobs
with specific requirements and standing reservations accessible only by
shorter jobs. Also, note that parallel jobs are the worst affected
because they are rarely eligible for backfill.
Gareth Williams
Outreach Manager
CSIRO IM&T - eScience (HPSC)
Level 11, 700 Collins St, Docklands, VIC 3008
GPO Box 1289, Melbourne, Vic, 3001
Ph +61 3 9669 8114
http://www.hpsc.csiro.au
http://www.csiro.au