[Mauiusers] PREEMPTPOLICY SUSPEND

Gerson Galang gerson.sapac@gawab.com
Thu, 15 Jul 2004 14:29:33 +0930


Hi Scott and Brian,

I followed your suggestion, but it didn't work for me. I also tried 
the suggestions from your earlier postings

http://supercluster.org/pipermail/mauiusers/2004-June/001208.html
http://supercluster.org/pipermail/mauiusers/2004-April/001112.html

but they were also of no help to me.

Here's my config:

# I also tried SUSPEND here, but nothing happened.
PREEMPTPOLICY  REQUEUE
 

USERCFG[gerson]          QDEF=ger
USERCFG[globus]          QDEF=glo
 

SRCFG[gerson]    PERIOD=INFINITY
SRCFG[gerson]    OWNER=USER:gerson
SRCFG[gerson]    FLAGS=OWNERPREEMPT
SRCFG[gerson]    HOSTLIST=ochre crimson brick
SRCFG[gerson]    QOSLIST=ger+,glo-
 

SRCFG[globus]    PERIOD=INFINITY
SRCFG[globus]    OWNER=USER:globus
SRCFG[globus]    FLAGS=OWNERPREEMPT
SRCFG[globus]    HOSTLIST=scarlet fire cherry
SRCFG[globus]    QOSLIST=ger-,glo+
 

QOSCFG[ger]       QFLAGS=PREEMPTOR
QOSCFG[ger]       QFLAGS=PREEMPTEE
QOSCFG[glo]       QFLAGS=PREEMPTOR
QOSCFG[glo]       QFLAGS=PREEMPTEE

I'm testing my configuration by having user globus submit 2 MPI jobs 
that use 3 nodes each, so that all 6 nodes are filled with jobs from 
user globus. Once globus' jobs are running, I submit an MPI job (which 
requests 3 nodes) from my own account, but nothing happens. My job only 
starts after all of globus' jobs have finished executing.

Can you let us know what we still need to add to the configuration file 
for the suspend-resume functionality to work? I am using mpiexec 0.76, 
mpich-1.2.5.2, torque 1.0.1p5, and maui 3.2.6p7.

Thanks,
Gerson


scott@supercluster.org wrote:
> Hello,
> 
> The SUSPENDSIG parameter indicates the signal that will be sent to the 
> program upon suspension.  23 is simply given as an example in the 
> documentation and is defined as SIGURG, an urgent condition on the 
> socket.  The default action of a program when receiving this signal is to 
> ignore it, so if you want something to happen you will have to handle the 
> signal yourself.  The following configuration worked: submitting a 
> job via "qsub -q parallel <jobname>" preempted the existing jobs and 
> returned them idle to the queue.  Note that the only significant 
> difference from your configuration is the last line, 
> QOSLIST=high+,low-, which allows low in the reservation but gives 
> preference to high.
> 
> RMCFG[base] SUSPENDSIG=23
> 
> PREEMPTPOLICY SUSPEND
> 
> QOSCFG[high] PRIORITY=1000
> QOSCFG[high] QFLAGS=PREEMPTOR
> QOSCFG[low]  PRIORITY=100
> QOSCFG[low]  QFLAGS=PREEMPTEE
> 
> CLASSCFG[parallel] QDEF=high
> CLASSCFG[batch]    QDEF=low
> CLASSCFG[day]      QDEF=low
> CLASSCFG[test]     QDEF=low
> 
> # Standing reservations for "parallel class"
> SRCFG[para] PERIOD=INFINITY
> SRCFG[para] RESOURCES=PROCS:2
> SRCFG[para] HOSTLIST=cythereal
> SRCFG[para] CLASSLIST=parallel
> SRCFG[para] OWNER=QOS:high
> SRCFG[para] FLAGS=OWNERPREEMPT
> SRCFG[para] QOSLIST=high+,low-
> 
> Hope that helps,
> 
> Scott
> 
> Cluster Resources, INC.
> 
> On Fri, 2 Jul 2004 taless@lcc.ufmg.br wrote:
> 
> 
>>
>>
>>
>>Dear Maui users,
>>I am working with a cluster configuration with Rocks Clusters Toolkit,
>>which uses Maui as scheduler and Torque as resource manager. Torque
>>is 3.2.6p6s1.
>>
>>I would like to have serial jobs as PREEMPTEE and parallel jobs
>>as PREEMPTOR. The PREEMPT configuration used is QOS based and
>>the PREEMPTPOLICY is SUSPEND. As suggested in Maui Admin, the
>>SUSPENDSIG parameter of the RMCFG was defined as 23.
>>Is this possible with Torque? I have run some tests here without
>>success: the PREEMPTOR job cannot suspend the PREEMPTEE job.
>>Can anyone help me? Thanks!
>>Tales
>>
>>Main Configuration parameters
>>
>>RMCFG[base] SUSPENDSIG=23
>>
>>PREEMPTPOLICY SUSPEND
>>
>>QOSCFG[high] PRIORITY=1000
>>QOSCFG[high] QFLAGS=PREEMPTOR
>>QOSCFG[low]  PRIORITY=100
>>QOSCFG[low]  QFLAGS=PREEMPTEE
>>
>>CLASSCFG[parallel] QDEF=high
>>CLASSCFG[large]    QDEF=low
>>CLASSCFG[day]      QDEF=low
>>CLASSCFG[test]     QDEF=low
>>
>># Standing reservations for "parallel class"
>>SRCFG[para] PERIOD=INFINITY
>>SRCFG[para] RESOURCES=PROCS:2
>>SRCFG[para] HOSTLIST=comp-pvfs-0-10.local,comp-pvfs-0-9.local,comp-pvfs-0-8.local
>>SRCFG[para] CLASSLIST=parallel
>>SRCFG[para] OWNER=QOS:high
>>SRCFG[para] FLAGS=OWNERPREEMPT
>>SRCFG[para] QOSLIST=high
>>
>>
>>
>>
>>- - - -
>>Tales J. da Silva - taless@lcc.ufmg.br
>>LCC/CENAPAD - UFMG
>>www.cenapad.ufmg.br
> 
> 
> 
> 
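
On the point in the quoted message that a program ignores SIGURG by default and has to handle the signal itself: below is a minimal sketch of what that could look like in a Python job. It assumes the signal set in SUSPENDSIG actually reaches the job's process, which I have not confirmed for Torque. The names install_suspend_handler and wait_for are made up for illustration and are not part of Maui or Torque. The signal is referred to by name (signal.SIGURG) because the number 23 is platform dependent.

```python
import os
import signal
import time


def install_suspend_handler(signum=signal.SIGURG):
    """Install a handler for signum and return the list it records into.

    Hypothetical helper for illustration; not a Maui or Torque API.
    """
    received = []

    def handler(sig, frame):
        # Keep the handler minimal: only record the signal number.
        # A real job would pause or checkpoint its work at this point.
        received.append(sig)

    # signal.signal() must be called from the main thread.
    signal.signal(signum, handler)
    return received


def wait_for(received, timeout=2.0):
    """Poll until the handler has recorded a signal or the timeout expires."""
    deadline = time.monotonic() + timeout
    while not received and time.monotonic() < deadline:
        time.sleep(0.01)
    return bool(received)


if __name__ == "__main__":
    received = install_suspend_handler()
    # Stand-in for the scheduler: send the signal to this process.
    os.kill(os.getpid(), signal.SIGURG)
    if wait_for(received):
        print("handled", signal.Signals(received[0]).name)
```

Without the handler installed, the same os.kill call has no visible effect, which matches the "nothing happens" behaviour described for SIGURG.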

-- 
Gerson Galang
Research Programmer

South Australian Partnership for Advanced Computing
School of Physics
The University of Adelaide
Adelaide 5005
SA, AUSTRALIA

Phone:  61 8 8303 3185
Email: gerson.galang@adelaide.edu.au