Hi everybody,<br><br>I'm administering a small cluster and recently installed Torque and Maui. Jobs are submitted from a head node, and NIS, NFS and SSH work fine on all slave nodes. I have just discovered a strange problem: some users can submit jobs, but for others submission fails with:
<br><br>************* Failure *************<br>qsub -I<br>qsub: waiting for job 4772.wap.physik.uni-kl.de to start<br>qsub: job 4772.wap.physik.uni-kl.de apparently deleted<br>************* Failure *************<br><br>There are still enough nodes free...<br><br>************* maui.cfg *************<br>SERVERHOST wap<br>ADMIN1 root<br>RMCFG[base] TYPE=PBS
<br>RMPOLLINTERVAL 00:00:30<br>SERVERPORT 42559<br>SERVERMODE NORMAL<br>LOGFILE maui.log<br>LOGFILEMAXSIZE 10000000<br>LOGLEVEL 3<br>QUEUETIMEWEIGHT 1 <br>
BACKFILLPOLICY FIRSTFIT<br>RESERVATIONPOLICY CURRENTHIGHEST<br>NODEALLOCATIONPOLICY MINRESOURCE<br>GROUPCFG[DEFAULT] MAXPROC=56 MAXMEM=4096 MAXNODES=14<br>USERCFG[DEFAULT] MAXPROC=50<br>*************
maui.cfg *************<br><br>Several users and groups are configured in NIS, and all of them can log in on every machine. Only some _specific_ users are unable to submit jobs. <br><br>I have already tried restarting the head node and the slave nodes (NIS update etc.) and looked through /var/log/... without finding anything unusual.
<br><br>Where else should I look?<br><br>Best regards, <br> Gerolf<br clear="all"><br>-- <br>Dipl. Phys. Gerolf Ziegenhain <br>Office: Room 46-332 - Erwin-Schrödinger-Str.46 - TU Kaiserslautern - Germany<br>Web:
<a href="http://gerolf.ziegenhain.com">gerolf.ziegenhain.com</a><br><br>