[Mauiusers] Requesting a given number of processors
Jan Ploski
Jan.Ploski at offis.de
Fri Sep 28 10:03:38 MDT 2007
"Toni L. Harbaugh-Blackford [Contr]" <harbaugh at ncifcrf.gov> wrote on
09/28/2007 04:14:45 PM:
> On Fri, 28 Sep 2007, Jan Ploski wrote:
>
> > "Toni L. Harbaugh-Blackford [Contr]" <harbaugh at ncifcrf.gov> wrote on
> > 09/28/2007 02:35:50 PM:
> >
> > >
> > > Jan-
> > >
> > > Depending on how you have things configured, you may be able to use
> > > '-l ncpus=100'.
> >
> > Toni,
> >
> > Thanks for the tip. Unfortunately it doesn't work in our cluster. (I am
> > the administrator, so if you happen to know any options that influence it,
> > please share.) I think ncpus=100 makes it look for a machine with 100
> > processors. I get 'rejected : CPU' lines in the output of checkjob -v.
> >
> > It's amazing that these trivial matters seem not to be documented
> > anywhere.
>
> It may be something in your configuration, although I have never tried
> this asking for 100 cpus, only 8, and was able to get it to work.
OK, I can't get it working with 9 either, so the large number is not the issue.
> Are you using "nodes=", either in your torque queue configurations or
> on the qsub command line?
I'm not using nodes= in the queue configuration. When I use nodes=9 on the
command line I get one error (see my latest message). When I use ncpus=9
on the command line, then I get another error (see my previous message).
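
For reference, the two command-line variants described above can be sketched roughly as follows. This is only an illustration: the script name jpl1.jb is an assumption (it matches the Job_Name in the qstat output in this message), and the commands are printed rather than submitted.

```shell
#!/bin/sh
# Sketch only: print the two qsub variants discussed in this thread.
# Assumption: the job script is named jpl1.jb (taken from Job_Name).
script=jpl1.jb

build_qsub() {
    # $1 is the resource request handed to qsub -l
    printf 'qsub -l %s %s\n' "$1" "$script"
}

build_qsub nodes=9   # variant 1: request expressed with nodes=
build_qsub ncpus=9   # variant 2: request expressed with ncpus=
```

Printing the commands first makes it easy to check exactly what is being requested before a job is queued.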
> If you submit a job and it stays queued, what does the qstat -f look like?
I suppose you only wish to see it for the job which is not running, not
for all jobs? Here it is, for the ncpus=9 variant:
jploski@srvgrid01:~/torque> qstat -f 346784.srvgrid01
Job Id: 346784.srvgrid01.offis.uni-oldenburg.de
    Job_Name = jpl1.jb
    Job_Owner = jploski@srvgrid01.offis.uni-oldenburg.de
    job_state = Q
    queue = verylong
    server = srvgrid01.offis.uni-oldenburg.de
    Checkpoint = u
    ctime = Fri Sep 28 18:00:18 2007
    Error_Path = srvgrid01:/home/jploski/torque/jpl1.ERR
    Hold_Types = n
    Join_Path = n
    Keep_Files = n
    Mail_Points = n
    mtime = Fri Sep 28 18:00:18 2007
    Output_Path = srvgrid01:/home/jploski/torque/jpl1.OUT
    Priority = 0
    qtime = Fri Sep 28 18:00:18 2007
    Rerunable = True
    Resource_List.ncpus = 9
    Resource_List.nodect = 176
    Variable_List = PBS_O_HOME=/home/jploski,PBS_O_LANG=en_US,
        PBS_O_LOGNAME=jploski,
        PBS_O_PATH=/home/jploski/bin:/home/jploski/bin:/usr/local/bin:/usr/bi
        n:/usr/X11R6/bin:/bin:/usr/games:/opt/gnome/bin:/opt/kde3/bin:/opt/ofe
        d-1.1/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:/opt/ofed-1.1/sbin:/opt/p
        gi/linux86-64/6.2/bin:/opt/pgi/linux86-64/6.2/mpi/mpich/bin:/opt/ncl-m
        etno:/opt/netcdf-3.6.1-pgcc/bin:/opt/ncl/bin:/opt/condor/bin:/opt/cond
        or/sbin:/opt/globus-install//bin:/opt/globus-install//sbin:/opt/mpiexe
        c/bin:/opt/ncview-1.93c/bin:/opt/nco/bin:/opt/cdo/bin:/opt/bashdb/bin,
        PBS_O_MAIL=/var/mail/jploski,PBS_O_SHELL=/bin/bash,
        PBS_O_HOST=srvgrid01.offis.uni-oldenburg.de,
        PBS_O_WORKDIR=/home/jploski/torque,PBS_O_QUEUE=verylong
    etime = Fri Sep 28 18:00:18 2007
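
Since the interesting part of that listing is the job state and the Resource_List entries, a small filter can pull out just those lines. The following is a minimal sketch, not part of Torque or Maui; the function name resource_summary is hypothetical, and it assumes the "name = value" layout shown above.

```shell
#!/bin/sh
# Sketch: keep only job_state and Resource_List.* attributes from
# "qstat -f" output read on standard input.
# resource_summary is a hypothetical helper name, not a Torque command.
resource_summary() {
    awk -F' = ' '
        /^[ \t]*(job_state|Resource_List\.[A-Za-z_]+) = / {
            sub(/^[ \t]+/, "", $1)   # drop leading indentation
            print $1 "=" $2
        }'
}

# Usage with the job ID from this message:
#   qstat -f 346784.srvgrid01 | resource_summary
```

This puts the requested ncpus value and the nodect value next to each other without the long Variable_List in between.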
Best regards,
Jan Ploski