[torqueusers] Configuring torque/Maui on an SGI Altix
Michael Seymour
seymour at atmosp.physics.utoronto.ca
Mon Aug 8 12:32:44 MDT 2005
Hello,
I am new to Torque and job schedulers in general. Does anyone have
experience with configuring Torque and possibly Maui on an SGI Altix
machine? The pbs_server is running on an external Linux box and is in a
semi-functional state.
I am currently stuck at getting a submitted job to run.
pbsnodes -a returns:
altix01.atmosp.physics.utoronto.ca:
     state = free
     np = 1
     ntype = time-shared
     status = arch=linux,uname=Linux altix01 2.4.21-sgi302r24 #1 SMP Fri Oct 22 22:43:12 PDT 2004 ia64,sessions=4263 4577 4703 4784 5583,nsessions=5,nusers=3,idletime=6324,totmem=44998320kb,availmem=40222096kb,physmem=35782352kb,ncpus=16,loadave=8.50,netload=18446744073701380569,state=free,rectime=1123520233
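The status field above is a single comma-separated list of key=value pairs, which is hard to read by eye. As a reading aid only, here is a minimal Python sketch that splits such a string into a dictionary. The function name parse_status is hypothetical (it is not part of Torque), and the sample string is a shortened copy of the output above.

```python
def parse_status(status: str) -> dict:
    """Split a pbsnodes 'status' string into a dict of key -> value.

    Hypothetical helper, not part of Torque. Assumes fields are separated
    by commas and that each field has the form key=value.
    """
    fields = {}
    for item in status.split(","):
        key, sep, value = item.partition("=")
        if sep:  # ignore any fragment that has no '='
            fields[key.strip()] = value.strip()
    return fields


# Shortened copy of the status line reported by pbsnodes -a above.
status = (
    "arch=linux,sessions=4263 4577 4703 4784 5583,nsessions=5,nusers=3,"
    "totmem=44998320kb,availmem=40222096kb,physmem=35782352kb,"
    "ncpus=16,loadave=8.50,state=free"
)

info = parse_status(status)
ncpus = int(info["ncpus"])

# Memory values carry a 'kb' suffix; drop it before converting.
physmem = info["physmem"]
if physmem.endswith("kb"):
    physmem = physmem[:-2]
physmem_gib = int(physmem) / 1024**2

print(f"ncpus={ncpus} loadave={info['loadave']} physmem={physmem_gib:.1f} GiB")
```

This only restructures what the node already reported; it does not query the server or change any configuration.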
Submitting a job produces this email:
seymour at boreas$ echo 'sleep 10' | /usr/local/torque/bin/qsub
PBS Job Id: 21.boreas
Job Name: STDIN
File stage in failed, see below.
Job will be retried later, please investigate and correct problem.
Job deleted at request of Scheduler at boreas
Job could never run
And tracejob returns this:
seymour at boreas$ tracejob -n 10 21
Job: 21.boreas
08/08/2005 12:57:13 L Job Deleted because it would never run
08/08/2005 12:57:13 S Job Queued at request of seymour at boreas, owner = seymour at boreas, job name = STDIN, queue = batch
08/08/2005 12:57:13 S Job deleted at request of Scheduler at boreas
08/08/2005 12:57:13 L Not enough of the right type of nodes available
08/08/2005 12:57:13 S enqueuing into batch, state 1 hop 1
08/08/2005 12:57:13 S dequeuing from batch, state EXITING
There are no jobs currently running on the node, as it is listed as free.
Any suggestions?
In general, what should I know for defining multiple queues for the Altix
machine, with respect to server, scheduler and cpuset setup? We would like
one queue for large jobs and possibly two queues for smaller jobs. Does
anyone have a configuration that can be used as an example, or can anyone
point me to a web site that contains relevant information?
Thanks,
Mike S.
--
Michael D. Seymour, Computer Support
Atmospheric Physics Group, Department of Physics, University of Toronto
60 St. George Street, Toronto, ON, Canada, M5S 1A7
Tel: 416-946-3019 Fax: 416-978-8905
seymour at atmosp.physics.utoronto.ca