<div dir="ltr"><div>Hello,</div>
<div>We have a Torque/PBS/MAUI system with 30 compute nodes.</div>
<div>Over the last few days, we have been experiencing a strange phenomenon: certain compute nodes go to 'State: Drained', even though the cluster admins did not ask for these nodes to be drained.</div>
<div> </div>
<div>We found this by running the following command, which reports:</div>
<div> </div>
<div><br>===============================</div>
<div>[root@cluster sbin]$ checknode node23</div>
<div>checking node <a href="http://bioc23.tau.ac.il">bioc23.tau.ac.il</a><br>State: Drained (in current state for 1:02:30:43)</div>
<div>===============================</div>
<div> </div>
<div>We can set the node back to Idle by using:</div>
<div> </div>
<div>=============================== </div>
<div>mnodectl host=node23 modify state=Idle</div>
<div>===============================</div>
<div> </div>
<div>This works fine. But why do nodes go into the Drained state without being asked to, and how can we troubleshoot this or prevent it from happening?</div>
<div> </div>
<div>Thanks,</div>
<div>Itay.</div>
<div> </div></div>