>Hi,<br>
<br>
>We have Torque Server Version 2.5.8 and maui version 3.2.6p1 installed on<br>
>a RHEL 5.2 server. "showstart" for one of the jobs says that the job should start<br>
>now, i.e.<br>
<br>
>Earliest start in 00:00:00 on current time.<br>
>########################<br>
>checkjob -vv says that<br>
<br>
>checkjob -vv 62235<br>
>checking job 62235 (RM job '62235.yc9.cn.yuva.param')<br>
>State: Idle<br>
>Creds: user:abcd group:pqr account:PQR-PR class:q1 qos:q1-qos<br>
>WallTime: 00:00:00 of 2:05:00:00<br>
>SubmitTime: Thu Feb 23 18:56:26<br>
>(Time Queued Total: 1:21:27:05 Eligible: 1:21:27:05)<br>
<br>
>Total Tasks: 2<br>
<br>
>Req[0] TaskCount: 2 Partition: ALL<br>
>Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0<br>
>Opsys: [NONE] Arch: [NONE] Features: [NONE]<br>
>Exec: '' ExecSize: 0 ImageSize: 0<br>
>Dedicated Resources Per Task: PROCS: 1<br>
>NodeAccess: SHARED<br>
>NodeCount: 0<br>
>IWD: [NONE] Executable: [NONE]<br>
>Bypass: 51 StartCount: 0<br>
>PartitionMask: [ALL]<br>
>Reservation '62235' (00:00:00 -> 2:05:00:00 Duration: 2:05:00:00)<br>
>PE: 2.00 StartPriority: 2727<br>
>job cannot run in partition DEFAULT (insufficient idle procs available: 0 < 2)<br>
>job can run in partition P1 (32 procs available. 2 procs required)<br>
>job can run in partition P2 (48 procs available. 2 procs required)<br>
>########################<br>
>showres -n 62235 says that<br>
<br>
>reservations on Sat Feb 25 16:28:10<br>
<br>
> NodeName Type ReservationID JobState Task Start Duration StartTime<br>
<br>
> node16.clusternode Job 62235 Idle 2 00:00:00 2:05:00:00 Sat Feb 25 16:28:10<br>
>1 nodes reserved<br>
>############################<br>
>checknode node16.clusternode says that the node is available to run the job.<br>
<br>
>but somehow the job is not starting, and there are no errors in the maui,<br>
>pbs_server, or pbs_mom logs either.<br>
<br>
>What can be the issue?<br>
<br>
Do you see Maui attempting to start the job in maui.log? If so, there may be a communication problem between Maui and TORQUE.<br>
<br>
>What can be done to make the job run and avoid the same in future?<br>
<br>
How many partitions do you have in your cluster?<br>
<br>
Can you try submitting the job with the partition specified explicitly, as follows:<br>
<br>
qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:<partition name><br>
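<br>
The checkjob output you posted already names the partitions where the job can run (P1 and P2), so one way to pick a value for PARTITION is to filter those lines out. This is only a sketch: `runnable_partitions` is a hypothetical helper name, and it assumes the "job can run in partition ..." wording shown in your checkjob output.<br>

```shell
# runnable_partitions: hypothetical helper, not part of Maui or TORQUE.
# Reads checkjob output on stdin and prints the name of each partition
# reported as "job can run in partition <name> ...", one per line.
# Assumes the message wording shown in the checkjob output quoted above.
runnable_partitions() {
    sed -n 's/^[[:space:]]*job can run in partition \([^[:space:]]*\).*/\1/p'
}

# Example (needs Maui installed, so it is left commented out):
#   checkjob -vv 62235 | runnable_partitions
# Then resubmit with one of the printed names, using the qsub form above:
#   qsub -q <queue_name> -l nodes=<requirement> -W x=PARTITION:P1
```

Lines saying "job cannot run in partition ..." are skipped, so only usable partitions are printed.<br>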
<br>
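On the maui.log question earlier in this reply: a quick way to see whether Maui has logged anything about the job is to search the log for the job ID. A minimal sketch, where `job_log_lines` is a hypothetical helper name and the log path in the usage comment is an assumption (it depends on how Maui was installed on your server):<br>

```shell
# job_log_lines: hypothetical helper, not part of Maui or TORQUE.
# Prints every line of the given log file that contains the job ID as a
# whole word, so 62235 matches "62235.yc9..." but not "162235".
job_log_lines() {
    job_id=$1
    log_file=$2
    grep -w -F -- "$job_id" "$log_file"
}

# Example (the path below is an assumption; use your site's maui.log):
#   job_log_lines 62235 /usr/local/maui/log/maui.log | tail -n 20
```

If this prints nothing, Maui has not logged anything for that job ID in that file.<br>
<br>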
>thank you<br>
<br>
>-pankakjd<br clear="all"><br>-- <br><br>Thanks & Regards,<br>Jayavant Ningoji Patil<br>+91 9923536030.<br><br>