[torqueusers] Jobs don't run unless forced with qrun
David Beer
dbeer at adaptivecomputing.com
Thu Apr 7 12:05:58 MDT 2011
----- Original Message -----
> Hello Everyone,
>
> Just today, I installed TORQUE version 2.5.6 on one of our
> clusters. Everything went well. For example, pbsnodes -a shows the
> nodes in the free state. But when I submit jobs, they just sit in the
> queue forever. To make them run I need to qrun them as root. Then
> they run fine without any problems. We are using the Moab scheduler.
>
> I have checked the server_logs and everything looks OK. No complaints
> at all. I have a nodes file in server_priv listing the nodes present on
> our cluster. It would be great if someone could give me some
> suggestions on fixing this problem.
>
> The other problem I have is with pbstop. Running plain pbstop doesn't
> show anything unless I run pbstop @cuda (cuda is our server name). Is
> there something I should do to get it working with just the command pbstop?
>
> I would greatly appreciate it if someone could help me with this.
>
> Thanks,
> Sreedhar.
Sreedhar,
It sounds like Moab and TORQUE aren't talking to each other. I think the best way to resolve this would be to open a ticket with Adaptive Computing; the support people can give you more dedicated service than you would get from emailing the TORQUE users list. I have some basic familiarity with Moab (since I work for Adaptive), and my suggestion would be to check what
mdiag -R -v
tells you. That output would probably be best included in the ticket you create with Adaptive, though.
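As a rough sketch of what could be gathered for that ticket, the commands below collect the Moab resource manager diagnostics along with the TORQUE server configuration and node states. The run_if_present helper is hypothetical (it is not part of TORQUE or Moab) and only skips tools that are not installed on the host where you run it. Which settings matter is an assumption on my part; one thing that may be worth reviewing in the qmgr output is whether the account Moab runs as appears in the server's managers or operators list.

```shell
#!/bin/sh
# Hypothetical helper: run a diagnostic command only if it is installed,
# otherwise report that it was skipped.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@"
    else
        echo "== $1 not found in PATH, skipped"
    fi
}

# Moab's view of its resource managers (the command suggested above).
run_if_present mdiag -R -v

# TORQUE server configuration as pbs_server sees it.
run_if_present qmgr -c 'print server'

# Node states as reported by pbs_server (shown as free in the original report).
run_if_present pbsnodes -a
```

Redirecting the output of this script to a file gives you a single attachment for the ticket.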
--
David Beer
Direct Line: 801-717-3386 | Fax: 801-717-3738
Adaptive Computing
1656 S. East Bay Blvd. Suite #300
Provo, UT 84606