[Mauiusers] MOAB standing reservations broken ?
csamuel at vpac.org
Mon Jan 3 22:32:54 MST 2005
Can we have a Bugzilla option for Moab, please? :-)
Before Newtonmas/Xmas I noticed that our standing reservation of two nodes,
which used to work well for jobs with less than 15 minutes of walltime and
4 CPUs or fewer, was no longer being used.
SRCFG[sque] STARTTIME=08:00:00 ENDTIME=20:00:00
SRCFG[sque] PERIOD=DAY DAYS=MON,TUE,WED,THU,FRI
SRCFG[sque] PROCLIMIT<=4 DEPTH=7
SRCFG[sque] TASKCOUNT=2 FLAGS=SPACEFLEX
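As a rough illustration of the policy this reservation is meant to express (not of how Moab itself evaluates it), the intended eligibility test can be sketched in Python. The function and constant names are hypothetical, and the 15-minute walltime limit is taken from the description above rather than from the SRCFG lines shown.

```python
from datetime import datetime, time, timedelta

# Values taken from the post. The walltime limit is an assumption based on the
# prose; it does not appear in the SRCFG lines quoted above.
WINDOW_START = time(8, 0)         # STARTTIME=08:00:00
WINDOW_END = time(20, 0)          # ENDTIME=20:00:00
WEEKDAYS = {0, 1, 2, 3, 4}        # DAYS=MON..FRI (Monday is 0 in Python)
MAX_PROCS = 4                     # PROCLIMIT<=4
MAX_WALLTIME = timedelta(minutes=15)


def reservation_active(now: datetime) -> bool:
    """True if 'now' falls inside the weekday 08:00-20:00 window."""
    return now.weekday() in WEEKDAYS and WINDOW_START <= now.time() < WINDOW_END


def job_should_use_reservation(now: datetime, procs: int, walltime: timedelta) -> bool:
    """True if a job of this size would be expected to run in the reservation."""
    return (
        reservation_active(now)
        and procs <= MAX_PROCS
        and walltime < MAX_WALLTIME
    )
```

Under these assumptions, a 4-CPU job with a 10-minute walltime submitted at midday on a weekday would be expected to run on the reserved nodes.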
For some reason, although eligible jobs were being granted access (as
confirmed with checkjob -r), they would not run on those nodes unless forced
to start with runjob. These jobs would run on other nodes if any were spare,
or on those in the reservation once it expired at 8pm!
If I specifically asked for that reservation by doing:
qsub -q sque -W x=FLAGS:ADVRES:sque.0.98 -I
(the sque queue gives a default walltime of 10 minutes) then the jobs would
never start unless I did a runjob on them.
Because this effectively meant that two nodes were doing zip whilst the
reservation was in force, I removed it for the holidays. We had (and still
have) a big queue of jobs waiting to go, so there are not going to be any
nodes free for this to come back into play for a couple of days (unless we
get lucky and a job finishes early tonight). Then I may be able to increase
the logging and see whether anything there shows a problem.
This happens with: moab-4.2.0p3-snap.1103305235
It also happened with earlier versions.
Christopher Samuel - (03)9925 4751 - VPAC Systems & Network Admin
Victorian Partnership for Advanced Computing http://www.vpac.org/
Bldg 91, 110 Victoria Street, Carlton South, VIC 3053, Australia