[Moabusers] trouble setting node based triggers
Martins, Flavio
flavio.martins at fttinc.com
Mon Aug 20 13:58:12 MDT 2007
Thanks for the info, Douglas; that helped partially. I see the triggers
listed with mdiag -T now, but they are still not working right. I still
don't see any gmetric entry in the mdiag -n output and my GLOBAL node
trigger is not firing.
Here is the output from "checknode -v GLOBAL"
node GLOBAL
State: Idle (in current state for 1:53:28)
Configured Resources:<license data>
Utilized Resources:<license data>
Dedicated Resources: ---
Generic Metrics: diskuse=0.55
MTBF(longterm): INFINITY MTBF(24h): INFINITY
Partition: ALL Rack/Slot: ---
Flags: havegresavailinfo,rmdetected
RM[ANSYSFlx TYPE=NATIVE:AGFULL
NodeAccessPolicy: SHARED
Total Time: 2:14:02 Up: 2:14:02 (100.00%) Active: 00:00:00 (0.00%)
Reservations: ---
TrigID  Object ID    Event     AType  ActionDate  State
------  -----------  --------  -----  ----------  ----------
33*     node:GLOBAL  threshol  mail   -           Successful
Launch Time: -1:53:28
BlockTime: INFINITY ActiveTime: 00:00:00
Threshold: GMetric[diskuse] > 0.50
Action Data: Master node exceeded 0.5 diskusage
Variables=
* indicates trigger has completed
As you can see, checknode correctly shows gmetric[diskuse]=0.55, and yet
the trigger for node GLOBAL does not fire for the condition gmetric > 0.5.
Perhaps there is something unique about the "GLOBAL" node that is
interfering here. All other nodes on the cluster have diskusage well
below the 0.5 value.
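
As a sanity check on the numbers above, the comparison can be reproduced outside Moab. This is a minimal Python sketch, not Moab's own evaluation logic: parse_gmetrics and threshold_met are hypothetical helper names, and the parser assumes the "Generic Metrics:" line has the name=value shape shown in the checknode output.

```python
import operator
import re

# Comparison operators this sketch understands (an assumption, not Moab's full set).
OPS = {">": operator.gt, "<": operator.lt}

def parse_gmetrics(checknode_text):
    """Return {name: value} from the 'Generic Metrics:' line, or {} if absent."""
    for line in checknode_text.splitlines():
        if line.strip().startswith("Generic Metrics:"):
            pairs = line.split(":", 1)[1]
            return {k: float(v) for k, v in re.findall(r"(\w+)=([0-9.]+)", pairs)}
    return {}

def threshold_met(metrics, name, op, limit):
    """Evaluate 'name op limit' against the parsed metrics."""
    if name not in metrics:
        return False
    return OPS[op](metrics[name], limit)

sample = "node GLOBAL\nGeneric Metrics: diskuse=0.55\n"
metrics = parse_gmetrics(sample)
print(threshold_met(metrics, "diskuse", ">", 0.50))
```

With the value reported for GLOBAL, the condition evaluates as true, which matches the expectation that the trigger should have fired.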
Flavio Martins
Senior Engineer
Aerodynamics / CFD
Florida Turbine Technologies Inc.
100 Marquette Road
Suite 110
Jupiter, FL 33458-7101
Phone: (561) 427-6261
Fax: (561) 427-6191
-----Original Message-----
From: Douglas Wightman [mailto:wightman at clusterresources.com]
Sent: Monday, August 20, 2007 12:50 PM
To: Martins, Flavio
Cc: moabusers at supercluster.org
Subject: Re: [Moabusers] trouble setting node based triggers
If you don't capitalize GMETRIC then this will work:
Change:
NODECFG[DEFAULT] TRIGGER=atype=mail,action='Node $OID exceeded 0.5
diskusage',etype=threshold,threshold=GMETRIC[diskuse]>0.5
to
NODECFG[DEFAULT] TRIGGER=atype=mail,action='Node $OID exceeded 0.5
diskusage',etype=threshold,threshold=gmetric[diskuse]>0.5
And the triggers will show up in mdiag -T.
- Douglas
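
The change described above can be applied mechanically if a configuration file contains many such trigger lines. The following Python sketch is only an illustration: lowercase_gmetric is a hypothetical helper, and it assumes each trigger definition sits on a single line. It touches only the threshold= attribute, so GMETRIC keywords elsewhere (for example in native RM output) are left alone.

```python
import re

def lowercase_gmetric(cfg_line):
    """Rewrite 'threshold=GMETRIC[' to 'threshold=gmetric[' in one config line."""
    return re.sub(r"(threshold=)GMETRIC(\[)", r"\g<1>gmetric\g<2>", cfg_line)

line = ("NODECFG[DEFAULT] TRIGGER=atype=mail,"
        "action='Node $OID exceeded 0.5 diskusage',"
        "etype=threshold,threshold=GMETRIC[diskuse]>0.5")
print(lowercase_gmetric(line))
```

Whether the keyword is case-sensitive in other Moab versions is not stated in this thread, so the rewrite should be treated as specific to the behavior reported here.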
On Fri, 2007-08-17 at 18:51 -0400, Martins, Flavio wrote:
> I have been trying to set a node based trigger to e-mail me if disk
> space runs low. I set up a native RM to query disk usage on my master
> node and my compute nodes and report the usage number as a GMETRIC. I
> then set a mail trigger to fire if disk usage surpassed a certain
> percentage. The problem is that so far I have not been able to get
> this to work.
>
>
>
> Some general observations:
>
> The gmetric values show up if I run checknode, but not in mdiag -n.
>
> I have no listings from the mdiag -T command, so the triggers don't
> seem to be picked up.
>
> I don't see any errors or alerts in the moab.log file about the
> triggers.
>
>
>
> Here is how I tried to set it up:
>
>
>
> Here is the native RM to get disk usage data - moab.cfg
>
> RMCFG[disk] TYPE=NATIVE RESOURCETYPE=FS
>
> RMCFG[disk] CLUSTERQUERYURL=exec:///opt/moab/tools/disk_check.pl
>
>
>
> disk_check.pl produces the following output:
>
> GLOBAL GMETRIC[diskuse]=0.553221342092917
> 0 GMETRIC[diskuse]=0.000442725244643413
> 1 GMETRIC[diskuse]=0.000442725244643413
> 2 GMETRIC[diskuse]=0.000499726374230815
> 3 GMETRIC[diskuse]=0.000442725244643413
> 4 GMETRIC[diskuse]=0.000442725244643413
> 5 GMETRIC[diskuse]=0.334928463873042
> 6 GMETRIC[diskuse]=0.228136833057353
>
> (My master node is not available for running jobs, so I assigned its
> disk usage to the global node for trigger setting purposes)
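
The disk_check.pl script itself was not posted, so the following Python sketch is only a hypothetical stand-in showing how output in the format above could be produced. The function names disk_fraction and format_line are assumptions for illustration, and the sketch reports only the local filesystem rather than querying the compute nodes.

```python
import shutil

def disk_fraction(path):
    """Return used/total (0.0 to 1.0) for the filesystem containing path."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total

def format_line(node_id, fraction):
    """Format one '<node> GMETRIC[diskuse]=<value>' line as shown in the thread."""
    return f"{node_id} GMETRIC[diskuse]={fraction}"

if __name__ == "__main__":
    # Report the local filesystem under the GLOBAL node, mirroring the setup described.
    print(format_line("GLOBAL", disk_fraction(".")))
```

A real cluster query script would also need to collect the value from each compute node and emit one line per node ID.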
>
>
>
> Then I set up my triggers based on the moab documentation found here:
>
> http://www.clusterresources.com/products/mwm/docs/9.2accounting.shtml#gmetric
>
> The example on this page is nearly identical to what I am trying to
> do.
>
>
>
> NODECFG[DEFAULT] TRIGGER=atype=mail,action='Node $OID exceeded 0.5
> diskusage',etype=threshold,threshold=GMETRIC[diskuse]>0.5
>
> NODECFG[GLOBAL] TRIGGER=atype=mail,action='Master node exceeded 0.5
> diskusage',etype=threshold,threshold=GMETRIC[diskuse]>0.5
>
>
>
> The disk usage on the GLOBAL node is greater than 0.5 so the trigger
> should fire.
>
>
>
> Can anyone see anything wrong with this setup?
>
>
>
> Flavio Martins
>
> Senior Engineer - Aerodynamics / CFD
>
> Florida Turbine Technologies Inc.
>
> 100 Marquette Road, Suite 110
>
> Jupiter, FL 33458-7101
>
> -----------------------------------------------------
>
> Phone: (561) 427-6261
>
> Fax: (561) 427-6191
>