<br><font size=2 face="sans-serif">Dear Torque/Maui Users,</font>
<br>
<br><font size=2 face="sans-serif">I have been running Torque for only
a few weeks, and not everything works properly yet. </font>
<br>
<br><font size=2 face="sans-serif">All jobs seem to think they are run
by installer@localhost.localdomain. "installer" is the login ID
that submits the jobs, but they are submitted to a queue on the master node, which is called
"silvio", and I do not use localdomain anywhere on the Beowulf cluster.</font>
<br>
<br><font size=2 face="sans-serif">The question is where localhost.localdomain
gets defined for the Torque queues, and how to change it. hostname returns silvio, and dnsdomainname returns
nothing, as it should.</font>
<br>
<br><font size=2 face="sans-serif">Jobs do run, but only on the
first node in the nodes list. Admittedly, I submit only one or two jobs
at a time, and that node can run at least 4.</font>
<br><font size=2 face="sans-serif">If I set up Maui, it fails immediately
because localhost.localdomain is not an authorized node. I'd like
to move up to Maui, but I have to fix the localdomain issue first.</font>
<br>
<br><font size=2 face="sans-serif">pbsnodes -a </font>
<br><font size=2 face="sans-serif">node07</font>
<br><font size=2 face="sans-serif"> state = free</font>
<br><font size=2 face="sans-serif"> np = 4</font>
<br><font size=2 face="sans-serif"> properties = d1950</font>
<br><font size=2 face="sans-serif"> ntype = cluster</font>
<br><font size=2 face="sans-serif"> jobs = 0/52.localhost.localdomain
&lt;------- ???? localhost.localdomain:
should this be " jobs = 0/52.silvio "?</font>
<br><font size=2 face="sans-serif"> status = opsys=linux,uname=Linux
node07 2.6.9-42.ELsmp #1...</font>
<br><font size=2 face="sans-serif">...</font>
<br><font size=2 face="sans-serif">silvio</font>
<br><font size=2 face="sans-serif"> state = free</font>
<br><font size=2 face="sans-serif"> np = 2</font>
<br><font size=2 face="sans-serif"> ntype = time-shared</font>
<br><font size=2 face="sans-serif"> status = opsys=linux,uname=Linux
silvio 2.6.9-42.0.3.ELsmp #1..</font>
<br>
<br><font size=2 face="sans-serif">The master node seems to know
that its name is "silvio", and the Beowulf cluster has
no DNS domain definition, just good ol' fashioned /etc/hosts names:</font>
<br><font size=2 face="sans-serif">#</font>
<br><font size=2 face="sans-serif">127.0.0.1 localhost.localdomain
localhost</font>
<br><font size=2 face="sans-serif">192.168.5.11 kickstart</font>
<br><font size=2 face="sans-serif">192.168.5.99 silvio silvio.sh.rohmhaas.com</font>
<br><font size=2 face="sans-serif">192.168.5.107 node07 node7</font>
<br><font size=2 face="sans-serif">192.168.5.108 node08 node8</font>
<br><font size=2 face="sans-serif">...</font>
<br>
<br><font size=2 face="sans-serif">NB: silvio.sh.rohmhaas.com is for
the other Ethernet card, which allows remote access to the cluster master.</font>
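<br>
<br><font size=2 face="sans-serif">As a sanity check on the hosts entries above, here is a minimal Python sketch that shows which address and canonical name each listed name maps to. The function name parse_hosts is purely illustrative, and the code only mimics a simple top-to-bottom file lookup. It is not the real resolver and says nothing about how Torque itself picks its server name.</font>
<pre>
```python
def parse_hosts(text):
    """Map each name in /etc/hosts-style text to (address, canonical_name)."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        if not names:
            continue  # an address with no names carries no mapping
        for name in names:
            # First match wins, as in a top-to-bottom file lookup.
            table.setdefault(name, (address, names[0]))
    return table


# The entries quoted in this message (elided lines left out).
HOSTS = """\
127.0.0.1 localhost.localdomain localhost
192.168.5.11 kickstart
192.168.5.99 silvio silvio.sh.rohmhaas.com
192.168.5.107 node07 node7
192.168.5.108 node08 node8
"""

table = parse_hosts(HOSTS)
print(table["silvio"])     # ('192.168.5.99', 'silvio')
print(table["localhost"])  # ('127.0.0.1', 'localhost.localdomain')
```
</pre>
<font size=2 face="sans-serif">With these entries, silvio maps to 192.168.5.99 and only the localhost names sit on 127.0.0.1, so the hosts file by itself does not appear to tie silvio to localhost.localdomain.</font>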
<br><font size=2 face="sans-serif">------<br>
Sincerely,<br>
<br>
Tom Pierce<br>
</font>