[torqueusers] Re: [torquedev] Some jobs not starting with Torque 2.3.1 and Moab
Josh Butikofer
josh at clusterresources.com
Fri Jul 18 08:07:39 MDT 2008
Lennart,
This was fixed in the latest TORQUE 2.3.2 snapshot available at
http://www.clusterresources.com/downloads/torque/torque-2.3.2-snap.200807092141.tar.gz.
Alternatively, if you are using Moab 5.2.3 revision 9927 or higher, you
can set the "NONEEDNODES=TRUE" parameter on your RMCFG[] line that
describes your TORQUE resource manager:
Example (moab.cfg):
    RMCFG[base] TYPE=PBS NONEEDNODES=TRUE
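
If you want to confirm that the setting is actually in place, something like the following could be used. This is only a rough sketch in Python, not a Moab or TORQUE tool: the function name rms_missing_noneednodes is made up for illustration, and it assumes that '#' starts a comment in moab.cfg and that attributes for one resource manager may be spread over several RMCFG[] lines. It takes the text of a moab.cfg and lists the TYPE=PBS resource managers that do not have NONEEDNODES=TRUE.

```python
import re

# Matches lines such as: RMCFG[base] TYPE=PBS NONEEDNODES=TRUE
RMCFG_LINE = re.compile(
    r"^\s*RMCFG\[(?P<name>[^\]]+)\]\s*(?P<attrs>.*)$", re.IGNORECASE
)


def rms_missing_noneednodes(cfg_text):
    """Return names of TYPE=PBS resource managers lacking NONEEDNODES=TRUE.

    Hypothetical helper for illustration; not part of Moab or TORQUE.
    """
    attrs_by_rm = {}
    for raw in cfg_text.splitlines():
        # Assumption: '#' introduces a comment in moab.cfg.
        line = raw.split("#", 1)[0]
        match = RMCFG_LINE.match(line)
        if not match:
            continue
        # Assumption: several RMCFG[] lines with the same name are merged.
        attrs = attrs_by_rm.setdefault(match.group("name"), {})
        for token in match.group("attrs").split():
            if "=" in token:
                key, value = token.split("=", 1)
                attrs[key.upper()] = value.upper()
    return [
        name
        for name, attrs in attrs_by_rm.items()
        if attrs.get("TYPE") == "PBS" and attrs.get("NONEEDNODES") != "TRUE"
    ]


if __name__ == "__main__":
    # Sample text only; read your own moab.cfg instead.
    sample = "RMCFG[base] TYPE=PBS\n"
    print(rms_missing_noneednodes(sample))
```

An empty list means every TYPE=PBS resource manager found in the text already carries the parameter.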
--Josh Butikofer
Lennart Karlsson wrote:
> Chris Samuel wrote the 5th of July:
>> I'm not sure if this is a Torque or Moab bug or just the result
>> of a change in interaction between the two, so I'm reporting this
>> to both. :-)
>>
>> Torque 2.3.1 official release.
>>
>> # moab --version
>> moab server version 5.2.3 (revision 10590)
>>
>> We have a number of jobs that are not starting and are ending
>> up in BatchHold due to repeated failures. They are all logging
>> similar information:
>>
>> Message[30] cannot start job on reserved resources - job cannot be started on RM base - cannot set hostlist: cannot set job '472817.tango-m.vpac.org' attr 'Resource_List:neednodes' to 'tango048' - job may have been removed externally (rc: 15001 'Unknown Job Id')
>
>
> Hi,
>
> We installed Torque 2.3.1 today, are running Moab version 5.2.3.s10693,
> and see the same problem with some jobs.
>
> Was there ever a solution to this problem?
>
> Best regards,
> -- Lennart Karlsson <Lennart.Karlsson at nsc.liu.se>
> National Supercomputer Centre in Linkoping, Sweden
> http://www.nsc.liu.se