[Moabusers] configuration issues on diskless cluster

rishi pathak mailmaverick666 at gmail.com
Mon Mar 17 23:29:46 MDT 2008


Hi,
   First we have to check whether all the moms and the server can talk to
each other. Once the resource manager is working, you can move on to the
scheduler.

Do you have multiple interfaces on the nodes?
If yes, are the interfaces on the same network or on different ones?
What does the output of pbsnodes -a show?
Check the hostnames of the moms in the $PBSHOME/server_priv/nodes file.
All the hostnames should correspond to IPs on the same network.
Check the server_name for each mom. The IP it resolves to should also be on
the same network.
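
As a rough sketch of the same-network check above, something like the
following Python could be run on the head node. It is not part of TORQUE or
Moab. The function names are made up for illustration, the nodes file is
assumed to have one node per line with the hostname as the first token
(e.g. "node0 np=2"), and the network you pass in (10.0.0.0/24 in the usage
note) is a placeholder for whatever your cluster network actually is.

```python
import ipaddress
import socket


def parse_nodes_file(text):
    """Return the hostnames in a nodes file (first token of each line)."""
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if line:
            hosts.append(line.split()[0])
    return hosts


def resolve(hosts):
    """Resolve each hostname as this machine sees it; None if lookup fails."""
    result = {}
    for host in hosts:
        try:
            result[host] = socket.gethostbyname(host)
        except OSError:
            result[host] = None
    return result


def off_network(host_to_ip, network):
    """Return hostnames that are unresolved or outside the given network."""
    net = ipaddress.ip_network(network)
    return sorted(
        host
        for host, ip in host_to_ip.items()
        if ip is None or ipaddress.ip_address(ip) not in net
    )
```

To use it, read the nodes file, pass its text to parse_nodes_file(), feed
the result to resolve(), and then call off_network(addresses, "10.0.0.0/24")
with your own network. Any hostname it returns is worth a closer look, and
the same check can be applied to the name stored in server_name.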

After you start a mom on a node, it sends a hello to the server. Please post
that portion of the log.
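
If the mom log is long, a small filter can pull out just that portion. This
is only a sketch: it assumes the relevant entries contain the word "hello",
which I have not verified for every version, and the function name and the
number of context lines are arbitrary choices of mine.

```python
def matching_lines(log_text, keyword="hello", context=2):
    """Return lines containing keyword, plus `context` lines after each."""
    lines = log_text.splitlines()
    keep = set()
    for i, line in enumerate(lines):
        if keyword.lower() in line.lower():  # case-insensitive match
            keep.update(range(i, min(i + context + 1, len(lines))))
    return [lines[i] for i in sorted(keep)]
```

Read the mom's log file into a string, pass it to matching_lines(), and
paste the returned lines into your reply. If nothing comes back, try a
different keyword such as the server's hostname.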

P.S. Please copy the mailing list on your replies. By not doing so you are
depriving yourself of an ocean of help.

On Mon, Mar 17, 2008 at 9:12 PM, Gelonia L Dent <gdent at amnh.org> wrote:

> Thanks, for your response. I've recently inherited this cluster and am
> trying to get a scheduler to work on it. SGE is installed and sge_master
> is running. bproc is installed but apparently was never configured
> properly. Should I strip all of these from the system?
>
> --
> Gelonia Dent, PhD
> Manager of Scientific Computing
> Invertebrate Zoology
> The American Museum of Natural History
> (212) 313-7911
>
>
>
>
> > Do your nodes have multiple interfaces?
> > Sometimes the problem is with the mom using some other interface to talk
> > to the server.
> >
> >
> > On Wed, Mar 12, 2008 at 3:50 AM, Gelonia L Dent <gdent at amnh.org> wrote:
> >
> >> Dear All,
> >>
> >> I am having trouble establishing a connection with the nodes on my
> >> cluster. They are reporting as down, but I've launched an mpirun on all
> >> the nodes and they work.
> >>
> >> Issuing the mdiag command gives the following output:
> >>
> >> demeter:~# mdiag -n
> >> compute node summary
> >> Name                    State   Procs      Memory         Opsys
> >>
> >> node0                    Idle    2:2      3932:3932     DEFAULT
> >> node1                    Idle    2:2      3932:3932     DEFAULT
> >> node2                    Down    0:2      3932:3932     DEFAULT
> >> node3                    Idle    2:2      3932:3932   Linux-2.6
> >> node4                    Idle    2:2      3932:3932     DEFAULT
> >> node5                    Idle    2:2      3932:3932     DEFAULT
> >> node6                    Down    0:2         1:1        DEFAULT
> >> node7                    Down    0:2         1:1        DEFAULT
> >> node8                    Down    0:2         1:1        DEFAULT
> >> node9                    Down    0:2         1:1        DEFAULT
> >>
> >> the remaining nodes up to 126 are all down as well.
> >>
> >> Why would all nodes have the default OS instead of Linux?
> >>
> >> Any insight would be helpful.
> >>
> >> --
> >> Gelonia Dent, PhD
> >> Manager of Scientific Computing
> >> Invertebrate Zoology
> >> The American Museum of Natural History
> >> (212) 313-7911
> >>
> >>
> >>
> >> _______________________________________________
> >> moabusers mailing list
> >> moabusers at supercluster.org
> >> http://www.supercluster.org/mailman/listinfo/moabusers
> >>
> >
> >
> >
> > --
> > Regards--
> > Rishi Pathak
> > National PARAM Supercomputing Facility
> > Center for Development of Advanced Computing(C-DAC)
> > Pune University Campus,Ganesh Khind Road
> > Pune-Maharastra
> >
>
>
>


-- 
Regards--
Rishi Pathak