[Moabusers] configuration issues on diskless cluster
rishi pathak
mailmaverick666 at gmail.com
Mon Mar 17 23:29:46 MDT 2008
Hi,
We first have to check whether all the moms and the server can talk to
each other.
Once the resource manager is working, you can move on to the scheduler.
Do you have multiple interfaces on the nodes?
If yes, are the interfaces on the same network or on different ones?
What does the output of pbsnodes -a show?
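If the pbsnodes -a output is long (you have over a hundred nodes), a small script can pull out just the nodes the server considers down. This is only a rough sketch: it assumes the usual layout where each node name sits on an unindented line followed by indented "key = value" lines, and the function names parse_pbsnodes and down_nodes are my own, not part of TORQUE.

```python
import subprocess


def parse_pbsnodes(text):
    """Parse `pbsnodes -a` style output into {node_name: {attribute: value}}."""
    nodes = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            current = None              # a blank line ends a node record
        elif not line[0].isspace():
            current = line.strip()      # an unindented line starts a new node
            nodes[current] = {}
        elif current is not None and "=" in line:
            # split on the first "=" only, since values may contain "="
            key, _, value = line.partition("=")
            nodes[current][key.strip()] = value.strip()
    return nodes


def down_nodes(nodes):
    """Return sorted names of nodes whose state string mentions 'down'."""
    return sorted(
        name for name, attrs in nodes.items()
        if "down" in attrs.get("state", "")
    )


if __name__ == "__main__":
    # Runs the real command; requires pbsnodes to be on PATH.
    result = subprocess.run(
        ["pbsnodes", "-a"], capture_output=True, text=True, check=True
    )
    print("\n".join(down_nodes(parse_pbsnodes(result.stdout))))
```

Comparing that list with what mdiag -n reports should tell you whether the problem is already visible at the resource manager level.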
Check the hostnames of the moms in the $PBSHOME/server_priv/nodes file.
All the hostnames should correspond to IPs on the same network.
Check the server_name for each mom. Its IP should also be on the same
network.
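To make the "same network" check concrete, something like the following could be run on the server. It is a minimal sketch under assumptions: the nodes file is taken to have one node per line with the hostname as the first field, the 10.0.0.0/24 network in the usage note is a made-up example, and node_hostnames and off_network are hypothetical helper names.

```python
import ipaddress
import socket


def node_hostnames(nodes_file_text):
    """Return the first field of each non-empty, non-comment line."""
    names = []
    for line in nodes_file_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            names.append(line.split()[0])
    return names


def off_network(hostnames, network, resolve=socket.gethostbyname):
    """Return {hostname: ip} for hosts whose resolved IP is outside `network`."""
    net = ipaddress.ip_network(network)
    outside = {}
    for name in hostnames:
        ip = resolve(name)  # may raise socket.gaierror if the name is unknown
        if ipaddress.ip_address(ip) not in net:
            outside[name] = ip
    return outside
```

For example, off_network(node_hostnames(open(path).read()), "10.0.0.0/24"), with path pointing at your nodes file and the network replaced by your cluster's own, returns the hosts that resolve to some other interface. An empty result means every mom hostname resolves inside that network.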
After you start a mom on a node, it sends a hello to the server. Please post
that portion of the log.
P.S. Please send a copy of your mail to the mailing list. By not doing so you
are depriving yourself of an ocean of help.
On Mon, Mar 17, 2008 at 9:12 PM, Gelonia L Dent <gdent at amnh.org> wrote:
> Thanks, for your response. I've recently inherited this cluster and am
> trying to get a scheduler to work on it. SGE is installed and sge_master
> is running. bproc is installed but apparently was never configured
> properly. Should I strip all of these from the system?
>
> --
> Gelonia Dent, PhD
> Manager of Scientific Computing
> Invertebrate Zoology
> The American Museum of Natural History
> (212) 313-7911
>
>
>
>
> > Do your nodes have multiple interfaces?
> > Sometimes the problem is with the mom using some other interface to talk
> > to the server.
> >
> >
> > On Wed, Mar 12, 2008 at 3:50 AM, Gelonia L Dent <gdent at amnh.org> wrote:
> >
> >> Dear All,
> >>
> >> I am having trouble establishing a connection with the nodes on my
> >> cluster. They are reporting as down, but I've launched an mpirun on all
> >> the nodes and they work.
> >>
> >> Issuing the mdiag command gives the following output:
> >>
> >> demeter:~# mdiag -n
> >> compute node summary
> >> Name     State   Procs   Memory      Opsys
> >>
> >> node0    Idle    2:2     3932:3932   DEFAULT
> >> node1    Idle    2:2     3932:3932   DEFAULT
> >> node2    Down    0:2     3932:3932   DEFAULT
> >> node3    Idle    2:2     3932:3932   Linux-2.6
> >> node4    Idle    2:2     3932:3932   DEFAULT
> >> node5    Idle    2:2     3932:3932   DEFAULT
> >> node6    Down    0:2     1:1         DEFAULT
> >> node7    Down    0:2     1:1         DEFAULT
> >> node8    Down    0:2     1:1         DEFAULT
> >> node9    Down    0:2     1:1         DEFAULT
> >>
> >> the remaining nodes up to 126 are all down as well.
> >>
> >> Why would all nodes have the default OS instead of Linux?
> >>
> >> Any insight would be helpful.
> >>
> >>
> >
> >
> >
> > --
> > Regards--
> > Rishi Pathak
> > National PARAM Supercomputing Facility
> > Center for Development of Advanced Computing(C-DAC)
> > Pune University Campus,Ganesh Khind Road
> > Pune-Maharastra
> >
>
>
>
--
Regards--
Rishi Pathak