Hi,<br>Thanks for the reply.<br>Actually, /etc/pbs.conf contains the following:<br>----------------------------------------------------------------------------------------------------<br>velan@galaxy:/etc$ cat pbs.conf<br>PBS_EXEC=/usr/pbs<br>PBS_HOME=/var/spool/torque<br>PBS_START_SERVER=1<br>PBS_START_MOM=0<br>PBS_START_SCHED=1<br>velan@galaxy:/etc$<br>----------------------------------------------------------------------------------------------------<br>Torque was installed in /usr/local/torque, and there are PBS_HOME directories in two places:<br>1. /var/spool/torque (containing sched_*, server_* and mom_*)<br>2. /usr/spool/PBS/ (containing maui, sched_*, server_*...). There is no mom directory here.<br>I changed the /etc/pbs.conf file to
<br>----------------------------------------------------------------------------------------------------<br>PBS_EXEC=/usr/local<br>PBS_HOME=/usr/spool/PBS<br>PBS_START_SERVER=1<br>PBS_START_MOM=0<br>PBS_START_SCHED=1<br>----------------------------------------------------------------------------------------------------<br>Then I ran /sbin/service pbs restart:<br>----------------------------------------------------------------------------------------------------<br>root@galaxy:/etc# /sbin/service pbs restart<br>Restarting PBS<br>Stopping PBS<br>This is secondary server, killing process.<br>PBS server - was pid: 7925<br>PBS sched - was pid: 7941<br>Starting PBS<br>PBS server<br>PBS sched<br>root@galaxy:/etc# /sbin/service pbs status<br>pbs_server is pid 8210<br>pbs_sched is pid 8226<br><br>----------------------------------------------------------------------------------------------------<br><br>Now pbs_server is running. I don't know whether I did this correctly. <br>Then I tried to
start maui, but it gave the same error as before:<br>----------------------------------------------------------------------------------------------------<br>root@galaxy:/etc# /usr/local/maui-3.2.6p16/sbin/maui restart<br>ERROR: cannot open user interface socket on port 42559<br>----------------------------------------------------------------------------------------------------<br><br>I ran ps aux | grep maui:<br>----------------------------------------------------------------------------------------------------<br>root@galaxy:/etc# ps aux| grep maui<br>root 6759 0.0 0.4 31712 20200 ? S 03:19 0:00 /usr/local/maui-3.2.6p16/sbin/maui start<br>root 8310 0.0 0.0 4812 648 pts/1 S+ 04:26 0:00 grep
maui<br>----------------------------------------------------------------------------------------------------<br><br>Parallel jobs are running now, but I am not able to start maui. Is it because the queue is disabled (I disabled it)? Thank you for your valuable help.<br><br>Thanks<br>Velan<br><br><br><b><i>David Chin <david.w.h.chin@gmail.com></i></b> wrote:<blockquote class="replbq" style="border-left: 2px solid rgb(16, 16, 255); margin-left: 5px; padding-left: 5px;"> Looks like a few things possibly happening:<br><br>1. there may be an old pbs_server process running.<br> Check with "ps -elf | grep pbs_"<br>2. also a pbs_sched process.<br>3. does the directory /usr/spool/PBS/mom_priv<br> exist? Or did the new version put the PBS<br> directories elsewhere?<br><br>Cheers,<br> Dave<br><br>On 1/18/07, Vadivelan Ranjith <achillesvelan@yahoo.co.in> wrote:<br>> Hi Friends<br>> We used PBS and upgrade to torque-2.1.0p0. Jobs and queue everything was<br>> fine.
Two days before i stop the queue and shutdown the machine for<br>> renovation. Today i booted frondend and all compute nodes. I was in shock by<br>> seeing the error msg. I started the pbs in frondend. It gave the following<br>> msg.<br>> ----------------------------------------------------------------------------------------------------------------------<br>> root@galaxy:/usr/local# /sbin/service pbs restart<br>> Restarting PBS<br>> Stopping PBS<br>> Starting PBS<br>> PBS_Server: Resource temporarily unavailable (11) in PBS_Server, pbs_server:<br>> another server running<br>><br>> pbs_server: another server running<br>> PBS server<br>> cannot change directory to home '/usr/spool/PBS/mom_priv': No such file or<br>> directory<br>> PBS mom<br>> pbs_sched: Address already in use (98) in main, bind<br>> PBS sched<br>> root@galaxy:/usr/local#<br>>
----------------------------------------------------------------------------------------------------------------------<br>><br>><br>> And i not able to start maui also. It gave the following error.<br>> ----------------------------------------------------------------------------------------------------------------------<br>> root@galaxy:/usr/local# /usr/local/maui-3.2.6p16/sbin/maui start<br>> ERROR: cannot open user interface socket on port 42559<br>> ----------------------------------------------------------------------------------------------------------------------<br>><br>><br>><br>> I submitted the jobs(I forcely ran it). Jobs with one processor is running<br>> fine. If i give two processor it gave the following error<br>><br>> ----------------------------------------------------------------------------------------------------------------------<br>><br>> mpdboot_node02.cluster2.iitb.ac.in (handle_mpd_output
359):<br>> failed to ping mpd on node01; recvd output={}<br>><br>> mpiexec_node02.cluster2.iitb.ac.in: cannot connect to local<br>> mpd (/tmp/mpd2.console_velan); possible causes:<br>> 1. no mpd is running on this host<br>> 2. an mpd is running but was started without a "console" (-n option)<br>> mpdallexit: cannot connect to local mpd (/tmp/mpd2.console_velan); possible<br>> causes:<br>> 1. no mpd is running on this host<br>> 2. an mpd is running but was started without a "console" (-n option)<br>> ----------------------------------------------------------------------------------------------------------------------<br>><br>> my job file having the following details<br>> ----------------------------------------------------------------------------------------------------------------------<br>> #!/bin/bash<br>><br>> #PBS -l nodes=2:ppn=1<br>><br>> cd $HOME/2DSIM/1proc<br>><br>>
n=`/usr/local/bin/pbs2mpich2hosts.py $PBS_NODEFILE hosts`<br>><br>> /usr/local/bin/mpdboot -n $n -f hosts -r rsh --mpd=/usr/local/bin/mpd<br>> /usr/local/bin/mpiexec -n 1<br>> /home/aero/velan/2DSIM/1proc/pg170x91.exe<br>> /usr/local/bin/mpdallexit<br>> rm -f hosts<br>> ----------------------------------------------------------------------------------------------------------------------<br>><br>> can anybody please help me. Actually i not configured this machine and i am<br>> new to this. I thankyou verymuch for your kind help<br>><br>> Regards<br>> Velan<br>><br>><br>><br>> ________________________________<br>> Here's a new way to find what you're looking for - Yahoo! Answers<br>><br>><br>> _______________________________________________<br>> torqueusers mailing list<br>> torqueusers@supercluster.org<br>> http://www.supercluster.org/mailman/listinfo/torqueusers<br>><br>><br>><br><br><br>--
<br>Email: david.w.h.chin@gmail.com dwchin@lroc.harvard.edu<br>Public key: http://gallatin.physics.lsa.umich.edu/~dwchin/crypto.html<br> pub 1024D/1C557DDF 2006-07-21 [expires: 2007-07-21]<br> Key fingerprint = 4EEB A409 5010 3679 4EA7 D420 4E52 202A 1C55 7DDF<br></achillesvelan@yahoo.co.in></blockquote><br><p> 
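The directory question raised in the quoted reply (does mom_priv exist under PBS_HOME?) can be checked mechanically. The following is a minimal sketch and not part of the original thread: it assumes a pbs.conf in the KEY=VALUE form shown above, and it assumes that PBS_START_SERVER, PBS_START_MOM and PBS_START_SCHED correspond to the server_priv, mom_priv and sched_priv directories under PBS_HOME. The function name check_pbs_home is hypothetical.

```shell
# Sketch: for every daemon enabled in a pbs.conf-style file, check that the
# matching *_priv directory exists under the configured PBS_HOME.
# Assumption: PBS_START_<X>=1 implies that $PBS_HOME/<x>_priv must exist.
check_pbs_home() {
    conf="$1"
    # Use the last PBS_HOME= line found in the file.
    home=$(sed -n 's/^PBS_HOME=//p' "$conf" | tail -n 1)
    if [ -z "$home" ]; then
        echo "PBS_HOME not set in $conf"
        return 2
    fi
    status=0
    for pair in SERVER:server_priv MOM:mom_priv SCHED:sched_priv; do
        key="PBS_START_${pair%%:*}"   # e.g. PBS_START_MOM
        dir="$home/${pair##*:}"       # e.g. /usr/spool/PBS/mom_priv
        if grep -q "^${key}=1\$" "$conf"; then
            if [ -d "$dir" ]; then
                echo "ok       $dir"
            else
                echo "MISSING  $dir"
                status=1
            fi
        else
            echo "skipped  $dir ($key is not 1)"
        fi
    done
    return $status
}
```

Calling `check_pbs_home /etc/pbs.conf` once for each candidate PBS_HOME value would show which of the two locations actually holds the directories the enabled daemons need. The function returns 0 when nothing is missing, 1 when a required directory is absent, and 2 when PBS_HOME is not set in the file.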
        
        