OK, Gus and everyone. Thanks again for your answers.<br><br><div class="gmail_quote">There is no pbs_sched in /etc/init.d, but I found it in these locations:<br><br>/usr/local/src/torque-2.3.6/contrib/init.d/pbs_sched<br>/usr/local/src/torque-2.3.6/tpackages/server/opt/pbs/sbin/pbs_sched<br>
/usr/local/src/torque-2.3.6/src/scheduler.cc/.libs/pbs_sched<br>/usr/local/src/torque-2.3.6/src/scheduler.cc/pbs_sched<br>/opt/pbs/sbin/pbs_sched<br><br>I was thinking of copying /opt/pbs/sbin/pbs_sched to /etc/init.d. Is that the right thing to do?<br>
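<br>If I follow Gus&#39;s suggestion literally instead, I think the sequence would be roughly the one below. This is only a sketch: the source path is the one from my listing above, plan_pbs_sched_install is a name I made up, and it only prints the commands so they can be reviewed before running anything. I have not confirmed that chkconfig or service exist on Yellow Dog.<br>

```shell
# Sketch only: print (do not run) the steps suggested in this thread.
# SRC and INITDIR are assumptions taken from the paths mentioned here.
SRC=/usr/local/src/torque-2.3.6
INITDIR=/etc/init.d

# Hypothetical helper: emits one command per line, in the order to run them.
plan_pbs_sched_install() {
    echo "cp $SRC/contrib/init.d/pbs_sched $INITDIR/pbs_sched"
    echo "chkconfig --add pbs_sched"
    echo "chkconfig --list pbs_sched"
    # If there is no 'service' command, the fallback mentioned in the
    # thread is to call the init script directly with 'start'.
    echo "service pbs_sched start"
}

plan_pbs_sched_install
```

Note that this copies the init script from contrib/init.d, not the /opt/pbs/sbin/pbs_sched binary.<br>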
<br>Sorry about the word &quot;manually&quot;; it is local slang, I guess. What I mean is that I went to the /opt/pbs/sbin/ folder and executed ./pbs_sched<br><br>The hostname output is:<br><br>rufian.perrera.local<br><br>The hosts file contains:<br>
<br># Do not remove the following line, or various programs<br># that require network functionality will fail.<br>#127.0.0.1              localhost.localdomain localhost    &lt;--------------------------Is this wrong?<br>
::1             localhost6.localdomain6 localhost6<br>127.0.0.1 rufian.perrera.local rufian<br>192.168.2.6 auyin.perrera.local auyin<br>192.168.2.4 pelusa.perrera.local pelusa<br>192.168.2.2 lamparita.perrera.local lamparita<br>
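<br>A quick way to test which names are actually active in that file is sketched below. hosts_has_name is a helper name I made up, and it only reads a hosts-format file; it does not query the resolver, so it is an assumption that this is what pbs_sched ends up consulting.<br>

```shell
# Hypothetical helper: succeed if NAME appears as a hostname or alias on an
# uncommented line of a hosts-format FILE.
# Usage: hosts_has_name NAME FILE
hosts_has_name() {
    awk -v n="$1" '
        { sub(/#.*/, "") }   # strip comments, including fully commented lines
        { for (i = NF; i > 1; i--) if ($i == n) found = 1 }   # field 1 is the address
        END { exit !found }  # exit status 0 only when the name was seen
    ' "$2"
}

# HOSTS_FILE is an assumption; it defaults to the usual location.
if hosts_has_name localhost "${HOSTS_FILE:-/etc/hosts}"; then
    echo "localhost is listed"
else
    echo "localhost is NOT listed"
fi
```

Run against the file above, this should report that localhost is not listed, because the only line naming it is commented out, while rufian is found.<br>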
<br><br>The /etc/sysconfig/network content is:<br><br>NETWORKING=yes<br>HOSTNAME=rufian.perrera.local<br>DOMAINNAME=perrera.local<br><br>I don&#39;t have /etc/sysconfig/pbs_server or /etc/sysconfig/pbs_sched either.<br><br><br>2009/5/21 Gus Correa <span dir="ltr">&lt;<a href="mailto:gus@ldeo.columbia.edu">gus@ldeo.columbia.edu</a>&gt;</span><br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="im">Samir Gartner wrote:<br>
&gt; Ok, scheduling wasn&#39;t enabled,now it is,<br>
<br>
</div>It happens very often.<br>
Fixing it is a good first step.<br>
<div class="im"><br>
&gt; but pbs_sched service was not<br>
&gt; found.<br>
<br>
</div>Starting up daemons in YDog may be different from RHEL, CentOS, Fedora,<br>
so I am just guessing based on the latter. I am not familiar with YDog.<br>
Anyway ...<br>
<br>
I don&#39;t know if you got Torque from ClusterResources or elsewhere.<br>
In any case, there should be a pbs_sched script in /etc/init.d.<br>
If it is there, do &quot;chkconfig --add pbs_sched&quot; (or YDog equivalent),<br>
then do &quot;chkconfig --list pbs_sched&quot; to see which runlevels it will be<br>
on, then &quot;service pbs_sched start&quot; to start it, or if YDog doesn&#39;t have<br>
&quot;service&quot;, run it with &quot;/etc/init.d/pbs_sched start&quot;.<br>
<br>
If you don&#39;t have the pbs_sched script in /etc/init.d, you may find one<br>
in the contrib subdirectory of the Torque source tree.<br>
Copy it over to /etc/init.d, and do the above.<br>
(The location may be other than /etc/init.d in YDog.)<br>
<div class="im"><br>
<br>
&gt; I didn&#39;t install maui, it is a default installation. About hosts<br>
&gt; file, it is properly configured as well as nodes and mom&#39;s config files.<br>
&gt;<br>
<br>
</div>You only need Maui if you want a complex scheduling policy.<br>
pbs_sched is FIFO, very simple, but works fine.<br>
I&#39;ve used it for a long time without problems.<br>
<div class="im"><br>
&gt; when I manually start pbs_sched it says<br>
&gt;<br>
&gt; pbs_sched: addclient, host localhost not found<br>
&gt;<br>
<br>
</div>Hmm ... never got this one, not that I remember.<br>
Not sure what you mean by &quot;manually start pbs_sched&quot;.<br>
Anyway, it sounds like another, different problem.<br>
<br>
<br>
Is it possible that your &quot;hostname&quot; command<br>
is not resolving your server name to rufian.perrera.local but to<br>
localhost?<br>
What is the output of &quot;hostname&quot;?<br>
What do you have in /etc/hosts?<br>
What do you have in /etc/sysconfig/network?<br>
<br>
Just in case you have /etc/sysconfig/pbs_server and<br>
/etc/sysconfig/pbs_sched, what are their contents?<br>
(I don&#39;t have them.)<br>
<br>
(Again just guessing, YDog may have different files to startup things.)<br>
<div class="im"><br>
I hope this helps,<br>
Gus Correa<br>
---------------------------------------------------------------------<br>
Gustavo Correa<br>
Lamont-Doherty Earth Observatory - Columbia University<br>
Palisades, NY, 10964-8000 - USA<br>
---------------------------------------------------------------------<br>
<br>
&gt;<br>
</div>&gt; 2009/5/21 Samir Gartner &lt;<a href="mailto:jigzat@gmail.com">jigzat@gmail.com</a>&gt;<br>
<div class="im">&gt;<br>
&gt;     I think I&#39;m gonna cry.... I love you guys!! No, seriously, it worked<br>
&gt;     but only if executed under root user, now the question is what did I<br>
&gt;     do wrong? Jobs should start automatically, right?<br>
&gt;<br>
&gt;     I was first following the Globus Toolkit tutorial, but it is kind of<br>
&gt;     outdated, so I guess I issued some wrong instructions.<br>
&gt;<br>
&gt;     One of the weird things was that the tutorial suggested using the<br>
&gt;     /opt/pbs prefix when executing configure, and now I have under<br>
&gt;     /opt/pbs another /opt/pbs folder with repeated bin and sbin folders<br>
&gt;     and executables. Is this wrong, or is this how it is supposed to be?<br>
&gt;<br>
</div>&gt;     2009/5/21 Ling C. Ho &lt;<a href="mailto:ling@fnal.gov">ling@fnal.gov</a>&gt;<br>
<div class="im">&gt;<br>
&gt;         Have you configured a scheduler?<br>
&gt;<br>
&gt;         What if you use qrun? Would any job start?<br>
&gt;<br>
&gt;         ...<br>
&gt;         ling<br>
&gt;<br>
&gt;         Samir Gartner wrote:<br>
&gt;<br>
&gt;             Ok, I don&#39;t see any file named default_server but<br>
&gt;             server_name has the right server name rufian.perrera.local<br>
&gt;             and there is another file with the same content named<br>
&gt;             server_name.new.<br>
&gt;<br>
&gt;             Right now the PBS server name appears to be correct (after<br>
&gt;             stopping the server and manually deleting the zombie jobs),<br>
&gt;             but the jobs still won&#39;t start.<br>
&gt;<br>
&gt;<br>
&gt;             [samir@rufian ~]$ echo &quot;sleep 30;date&quot; | /opt/pbs/bin/qsub<br>
&gt;             [samir@rufian ~]$ /opt/pbs/bin/qstat -a<br>
&gt;<br>
&gt;             rufian.perrera.local:<br>
&gt;<br>
&gt;                                                                                      Req&#39;d  Req&#39;d   Elap<br>
&gt;             Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time<br>
&gt;             -------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----<br>
&gt;             13.rufian.perrer     samir    batch    STDIN                --     1  --     -- 01:00 Q    --<br>
&gt;             [samir@rufian ~]$<br>
&gt;<br>
&gt;<br>
&gt;             By the way, is top posting allowed?<br>
&gt;<br>
&gt;             2009/5/21 Jerry Smith &lt;<a href="mailto:jdsmit@sandia.gov">jdsmit@sandia.gov</a>&gt;<br>
</div><div><div></div><div class="h5">
&gt;<br>
&gt;<br>
&gt;                Samir,<br>
&gt;<br>
&gt;                What do you have in $PBS_HOME/{server_name,default_server}?<br>
&gt;<br>
&gt;                It should be whatever resolves to the Ethernet address that<br>
&gt;                pbs should be listening on.<br>
&gt;<br>
&gt;                --Jerry<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;                Samir Gartner wrote:<br>
&gt;<br>
&gt;                    OK, I finally installed Torque under Yellow Dog/ppc, but now I have<br>
&gt;                    another problem. I set up my PBS server as rufian.perrera.local,<br>
&gt;                    but when I submit a job it shows up under localhost.localdomain<br>
&gt;                    and stays in the queued state forever. If I try to qdel the<br>
&gt;                    job, it can&#39;t reach the server and the connection times out. Any<br>
&gt;                    ideas what could be wrong?<br>
&gt;                    I&#39;m not trying to set up anything complicated; it is just one<br>
&gt;                    machine that works as both server and client.<br>
&gt;<br>
&gt;                    this is the shell output<br>
&gt;<br>
&gt;                    [root@rufian bin]# /opt/pbs/bin/qstat -a<br>
&gt;<br>
&gt;                    rufian.perrera.local:<br>
&gt;<br>
&gt;                                                                                             Req&#39;d  Req&#39;d   Elap<br>
&gt;                    Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time<br>
&gt;                    -------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----<br>
&gt;                    7.localhost.loca     samir    batch    STDIN                --     1  --     -- 01:00 Q    --<br>
&gt;                    8.localhost.loca     samir    batch    STDIN                --     1  --     -- 01:00 Q    --<br>
&gt;                    9.localhost.loca     samir    batch    STDIN                --     1  --     -- 01:00 Q    --<br>
&gt;                    10.localhost.loc     samir    batch    STDIN                --     1  --     -- 01:00 Q    --<br>
&gt;                    [root@rufian bin]# /opt/pbs/bin/qdel 7.localhost.localdomain<br>
&gt;                    Connection timed out<br>
&gt;                    qdel: cannot connect to server localhost.localdomain (errno=110)<br>
&gt;                    Connection timed out<br>
&gt;                    You have new mail in /var/spool/mail/root<br>
&gt;                    [root@rufian bin]# /opt/pbs/bin/qdel 7.rufian.perrera.local<br>
&gt;                    qdel: Unknown Job Id 7.rufian.perrera.local<br>
&gt;                    [root@rufian bin]# su - samir<br>
&gt;                    [samir@rufian ~]$ /opt/pbs/bin/qdel 7.localhost.localdomain<br>
&gt;                    Connection timed out<br>
&gt;                    qdel: cannot connect to server localhost.localdomain (errno=110)<br>
&gt;                    Connection timed out<br>
&gt;                    [samir@rufian ~]$<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;<br>
&gt;             ------------------------------------------------------------------------<br>
&gt;<br>
&gt;             _______________________________________________<br>
&gt;             torqueusers mailing list<br>
&gt;             <a href="mailto:torqueusers@supercluster.org">torqueusers@supercluster.org</a><br>
</div></div>
<div><div></div><div class="h5">&gt;             <a href="http://www.supercluster.org/mailman/listinfo/torqueusers" target="_blank">http://www.supercluster.org/mailman/listinfo/torqueusers</a><br>
&gt;<br>
<br>
</div></div></blockquote></div><br>