Clustered Data ONTAP MetroCluster: MetroCluster Monitoring Rule

DataONTAP.Cluster.MetroCluster.Monitoring.Monitoring.Rule (Rule)

The MetroCluster Monitoring Rule checks for MetroCluster events: whether a surviving partner cluster is in switchover mode; whether a restored partner cluster has suffered switchback effects (in which an aggregate or disk is not given back to the partner cluster); and whether a back-hoe event has severed all communications between partner clusters. It also checks storage bridge status; whether an FC initiator adapter port is online; and whether a broken connection has caused an intercluster LIF to fail to ping the partner cluster.

Knowledge Base article:

Summary

The MetroCluster Monitoring Rule checks for MetroCluster events: whether a surviving partner cluster is in switchover mode; whether a restored partner cluster has suffered switchback effects (in which an aggregate or disk is not given back to the partner cluster); and whether a back-hoe event has severed all communications between partner clusters. It also checks storage bridge status; whether an FC initiator adapter port is online; and whether a broken connection has caused an intercluster LIF to fail to ping the partner cluster.

Configuration

Three overrides are available for this rule. Sync Time and Interval Seconds determine when and how often the rule runs. Timeout Seconds determines how long System Center Operations Manager waits for the rule to complete.
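
As an illustration of how these overrides might look when stored in an unsealed override management pack, the fragment below is a minimal sketch, not the plug-in's shipped configuration. The reference alias OCPM, the override IDs, and the values 1800 and 900 are hypothetical. The parameter names IntervalSeconds and TimeoutSeconds are assumed to match the element names used in the rule's source. A SyncTime override would follow the same pattern against the SimpleSchedulerDataSource module.

```xml
<Overrides>
  <!-- Hypothetical: run the rule every 30 minutes instead of every hour. -->
  <RuleConfigurationOverride ID="Example.MetroCluster.Monitoring.IntervalSeconds.Override"
      Context="OCPM!DataONTAP.Cluster.MetroCluster.MetroCluster"
      Enforced="false"
      Rule="OCPM!DataONTAP.Cluster.MetroCluster.Monitoring.Monitoring.Rule"
      Module="SimpleSchedulerDataSource"
      Parameter="IntervalSeconds">
    <Value>1800</Value>
  </RuleConfigurationOverride>
  <!-- Hypothetical: allow the monitoring script 15 minutes to complete.
       The parameter name TimeoutSeconds is an assumption. -->
  <RuleConfigurationOverride ID="Example.MetroCluster.Monitoring.TimeoutSeconds.Override"
      Context="OCPM!DataONTAP.Cluster.MetroCluster.MetroCluster"
      Enforced="false"
      Rule="OCPM!DataONTAP.Cluster.MetroCluster.Monitoring.Monitoring.Rule"
      Module="RunMonitoringPowershellScript"
      Parameter="TimeoutSeconds">
    <Value>900</Value>
  </RuleConfigurationOverride>
</Overrides>
```

In practice the same overrides are usually created from the Operations Manager console, which generates equivalent XML.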

Resolutions

If you suspect a problem with this rule, check the OnCommand and System Center Operations Manager event logs on the management server running the rule.

Additional Information

Event ID  Severity  Description
20120     INFO      OK - indicates that communications between partner clusters are healthy
20122     ERR       Critical - indicates that a back-hoe event has severed all communications between partner clusters
20130     INFO      OK - indicates that the surviving partner cluster is in switchover mode
20132     ERR       Critical - indicates that the surviving partner cluster has failed to go into switchover mode
20140     INFO      OK - indicates that the FC initiator adapter port is online
20142     ERR       Critical - indicates that the FC initiator adapter port is offline
20150     INFO      OK - indicates that an intercluster LIF is able to ping the partner cluster
20152     ERR       Critical - indicates that an intercluster LIF is not able to ping the partner cluster
20170     INFO      OK - indicates that a restored partner cluster has not suffered switchback effects
20172     ERR       Critical - indicates that an aggregate or disk has not been given back to a restored partner cluster after switchback
20200     INFO      OK - indicates that the storage bridge temperature status is normal and the bridge SAS or FC ports are enabled and online
20201     WARN      Warning - indicates that the storage bridge temperature status is warming, that the bridge SAS or FC ports are disabled and offline, or that the bridge is not being monitored by Data ONTAP
20202     ERR       Critical - indicates that the storage bridge temperature status is critical or that the bridge SAS or FC ports are enabled and offline
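
To find the critical events listed above on the management server, one option is a custom view in Windows Event Viewer. The query below is a sketch under stated assumptions: it assumes the plug-in writes these events to a log named OnCommand, which should be verified on the management server before use. The Operations Manager log is queried for errors in general, because the event IDs above are not documented here as appearing in that log.

```xml
<QueryList>
  <Query Id="0" Path="OnCommand">
    <!-- Assumed log name "OnCommand"; critical MetroCluster event IDs from the table above. -->
    <Select Path="OnCommand">
      *[System[(EventID=20122 or EventID=20132 or EventID=20142 or EventID=20152 or EventID=20172 or EventID=20202)]]
    </Select>
    <!-- Level=2 selects Error events in the Operations Manager log. -->
    <Select Path="Operations Manager">*[System[(Level=2)]]</Select>
  </Query>
</QueryList>
```

The query can be pasted into the XML tab of the Create Custom View dialog in Event Viewer.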

Element properties:

Target          DataONTAP.Cluster.MetroCluster.MetroCluster
Category        AvailabilityHealth
Enabled         True
Alert Generate  False
Remotable       True

Member Modules:

ID                             Module Type  TypeId                                                             RunAs
SimpleSchedulerDataSource      DataSource   System.SimpleScheduler                                             Default
RunMonitoringPowershellScript  WriteAction  DataONTAP.Cluster.MetroCluster.Monitoring.WriteActionModuleType    Default

Source Code:

<Rule ID="DataONTAP.Cluster.MetroCluster.Monitoring.Monitoring.Rule" Target="DataONTAP.Cluster.MetroCluster.MetroCluster" Enabled="true" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
  <Category>AvailabilityHealth</Category>
  <DataSources>
    <DataSource ID="SimpleSchedulerDataSource" TypeID="System!System.SimpleScheduler">
      <!-- IntervalSeconds specifies how often we will run the rule. -->
      <IntervalSeconds>3600</IntervalSeconds>
      <!-- SyncTime specifies the minutes after the hour to synchronize execution of the rule. -->
      <SyncTime>00:16</SyncTime>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="RunMonitoringPowershellScript" TypeID="DataONTAP.Cluster.MetroCluster.Monitoring.WriteActionModuleType">
      <TimeoutSeconds>600</TimeoutSeconds>
      <MonitoringMethodName>GetMetroClusterMonitoringStatus</MonitoringMethodName>
      <VserverUUID>$Target/Property[Type="DataONTAP.Cluster.MetroCluster.MetroCluster"]/ClusterUUIDs$</VserverUUID>
    </WriteAction>
  </WriteActions>
</Rule>