[Moabusers] RE table is corrupt
Brock Palen
brockp at umich.edu
Thu Jan 17 09:03:29 MST 2008
Dave,
Here is the output:
Moab Server 'Moab' running on nyx.engin.umich.edu:42559 (Mode: NORMAL)
Build Info: 64,MCOMMTHREAD
Process Info: pid:14533 uid:0 euid:0 gid:0 egid:0
RM MODULES: SSS,WIKI,NATIVE,PBS
Load(5m) Sched: 9.99% RMAction: 8.39% RMQuery: 25.17% User: 0.00% Idle: 56.45%
Load(24h) Sched: 10.44% RMAction: 9.14% RMQuery: 26.64% User: 0.00% Idle: 53.78%
Total Memory Size: 544 MB
WARNING: excessive memory in use (544 MB) - restart Moab?
PollInterval: 00:01:30 (Avg Sched Interval: 00:00:35 Iterations: 1020)
JobStarts: 1769 (Avg Starts/Iteration: 1.73 Last Iteration: 0)
Object Specs: Class=50 GRes=512/512 Job=20480/1024
Node=5120 Par=31 Range=256 RM=16 Rsv=4096 UIBuffer=2MB User=1792
Message: profiling enabled (22 of 50 samples/00:30:00 interval)
The problem has since passed, though. The only thing that coincided
with it was a user bulk-submitting about 1800 jobs while torque was
running very slowly. We had noticed qstat, qsub, and qmgr being slow
to respond. We are now testing some of the advice from:
http://www.clusterresources.com/wiki/doku.php?id=torque:appendix:f_large_cluster_considerations
That cluster currently has 608 nodes installed.
Also, 544MB isn't right: according to top, Moab is using 715MB (1326MB
allocated).
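For anyone wanting to repeat the comparison, here is a rough sketch,
not something Moab ships. It pulls the "Total Memory Size" figure out
of saved mdiag -S output and the VmRSS/VmSize fields out of
/proc/<pid>/status on Linux. The function names are made up for
illustration, and treating VmRSS as top's resident figure and VmSize
as the "allocated" figure is an assumption.

```python
import re
import sys


def parse_moab_reported_mb(mdiag_text):
    """Return the MB value from a 'Total Memory Size: N MB' line, or None."""
    m = re.search(r"Total Memory Size:\s+(\d+)\s+MB", mdiag_text)
    return int(m.group(1)) if m else None


def parse_vm_fields(status_text):
    """Return VmRSS and VmSize (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        m = re.match(r"(VmRSS|VmSize):\s+(\d+)\s+kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


if __name__ == "__main__":
    # Hypothetical usage: script.py <moab pid> <file holding mdiag -S output>
    pid, mdiag_path = sys.argv[1], sys.argv[2]
    with open(mdiag_path) as f:
        reported_mb = parse_moab_reported_mb(f.read())
    with open(f"/proc/{pid}/status") as f:
        vm = parse_vm_fields(f.read())
    print("Moab reports (MB):", reported_mb)
    print("Resident (MB):", vm.get("VmRSS", 0) // 1024)
    print("Virtual (MB):", vm.get("VmSize", 0) // 1024)
```

The two parsers only read text, so they can be run against saved
output on any machine; only the part under __main__ needs the live
process.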
Thanks
Brock Palen
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
On Jan 17, 2008, at 9:03 AM, Dave Jackson wrote:
> Brock,
>
> Do you get any warnings of interest from 'mdiag -S -v'?
>
> Dave
>
> On Thu, 2008-01-17 at 01:02 -0500, Brock Palen wrote:
>> A bunch of our nodes were just marked down and we are not sure why.
>> Torque (qmgr) still thinks they are up and the machines are up, but
>> Moab thinks they are down, and I see lots of:
>>
>> 01/17 00:31:55 ALERT: node nyx539 RE table is corrupt. RE[6]
>> 'rmfailure.3586' at -00:00:57 is out of time order
>>
>> messages. Is there a way to fix this?
>>
>>
>> Brock Palen
>> Center for Advanced Computing
>> brockp at umich.edu
>> (734)936-1985
>>
>>
>> _______________________________________________
>> moabusers mailing list
>> moabusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/moabusers
>
>
>