Alert monitor for Lenovo BladeCenter cooling module installation or removal

IBM.BladeCenter.CoolingModuleInstalledOrRemoved (UnitMonitor)


Knowledge Base article:

Summary

This monitor watches for a BladeCenter event that indicates that a fan module has been added to or removed from a blade server chassis.

Configuration

You can disable this monitor in the Operations Manager Operations console. See the "Disable monitors" topic in the Operations Manager Operations User's Guide for more information.
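
Disabling a monitor from the console is stored as an override in an unsealed management pack. As a minimal sketch of what such an override might look like, assuming a hypothetical override ID and a hypothetical reference alias "IBMBC" that points at the management pack defining this monitor (neither name comes from this article):

```xml
<!-- Hypothetical override: the ID and the "IBMBC" alias are assumptions.
     The alias must be declared under Manifest/References in the override pack. -->
<Monitoring>
  <Overrides>
    <MonitorPropertyOverride ID="Custom.Override.DisableCoolingModuleInstalledOrRemoved"
                             Context="IBMBC!IBM.BladeCenter.CoolingModule"
                             Enforced="false"
                             Monitor="IBMBC!IBM.BladeCenter.CoolingModuleInstalledOrRemoved"
                             Property="Enabled">
      <!-- Turns the monitor off for every instance of the target class -->
      <Value>false</Value>
    </MonitorPropertyOverride>
  </Overrides>
</Monitoring>
```

Scoping the override to the target class disables the monitor for all cooling modules. To disable it for a single object or group, create the override from the console so that the instance reference is filled in for you.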

The BladeCenter event is delivered to this monitor asynchronously, so there is no monitoring interval to configure.

The AMM (Advanced Management Module) of the BladeCenter delivers the event to this monitor via SNMP (Simple Network Management Protocol). The event also passes through the BladeCenter runtime support of the Hardware Management Pack, which is installed on the management server that was designated to manage the BladeCenter during the Network Device Discovery process.

For the proper BladeCenter AMM SNMP settings that are required for the Hardware Management Pack to discover BladeCenter modules and report events, consult the Hardware Management Pack's User's Guide.

Causes

When a cooling fan module is added or removed, the BladeCenter's AMM generates a hardware event, and the health state of this monitor changes accordingly. As configured, a failure event sets the monitor to the Warning state; the monitor defines no Critical state.

Details about the cause of the hardware event are recorded in the alert data and in the state change record. The latest state change of this monitor reflects the severity level of the most recent hardware event recorded by this monitor.

Resolutions

Review the relevant Lenovo hardware knowledge articles, linked under External, for information about how to resolve the hardware problem for a particular incident.

After the hardware problem is resolved, manually reset the health state of this monitor. Once the monitor returns to a healthy state, any outstanding corresponding alerts are closed automatically. See the "Reset Health" topic in the Operations Manager Operations User's Guide for more information.

To verify that the hardware problem has been resolved, refer to the most recent health state of the corresponding "regular health checkup monitor." Be sure to refer to a health state that was reported later than the hardware event.

Additional

For the proper AMM SNMP settings needed for the Hardware Management Pack, see the "Configuring BladeCenter SNMP settings" topic in the Lenovo Hardware Management Pack for Microsoft System Center Operations Manager Installation and User's Guide.

External

Links to Lenovo resources

Element properties:

Target: IBM.BladeCenter.CoolingModule
Parent Monitor: System.Health.AvailabilityState
Category: Custom
Enabled: True
Alert Generate: True
Alert Severity: MatchMonitorHealth
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: IBM.BladeCenter.SNMPTrap.2StateMonitorTypeForEventPair
Remotable: True
Accessibility: Public
Alert Message:
Lenovo BladeCenter cooling module installation or removal

{0} -- EventID = {1}
RunAs: Default

Source Code:

<UnitMonitor ID="IBM.BladeCenter.CoolingModuleInstalledOrRemoved"
             Accessibility="Public"
             Enabled="true"
             Target="IBM.BladeCenter.CoolingModule"
             ParentMonitorID="Health!System.Health.AvailabilityState"
             Remotable="true"
             Priority="Normal"
             TypeID="IBM.BladeCenter.SNMPTrap.2StateMonitorTypeForEventPair"
             ConfirmDelivery="false">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="IBM.BladeCenter.CoolingModuleInstalledOrRemoved.AlertMessageID">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <!-- AlertParameter1 fills {0} and AlertParameter2 fills {1} in the alert message -->
    <AlertParameters>
      <AlertParameter1>$Data/Context/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.4.1.2.6.158.3.1.1.8'][1]/Value$</AlertParameter1>
      <AlertParameter2>$Data/Context/SnmpVarBinds/SnmpVarBind[OID='1.3.6.1.4.1.2.6.158.3.1.1.14'][1]/Value$</AlertParameter2>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="ComponentSuccess" MonitorTypeStateID="SuccessEventRaised" HealthState="Success"/>
    <OperationalState ID="ComponentWarning" MonitorTypeStateID="FailedEventRaised" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <!-- Read as a pattern, this covers event IDs 167780353 through 167780356 -->
    <SuccessEventIds>16778035[3-6]</SuccessEventIds>
    <!-- Read as a pattern, this covers event IDs 167784449 through 167784452 -->
    <FailedEventIds>167784449|16778445[0-2]</FailedEventIds>
  </Configuration>
</UnitMonitor>
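
The two alert parameters in the source above are path lookups into the SNMP trap data carried in the state change context. As a rough illustration only, the fragment they select from might be shaped like the sketch below. Only the SnmpVarBinds, SnmpVarBind, OID, and Value element names are taken from those paths. The values are placeholders, and any other elements the real data item carries are omitted.

```xml
<!-- Illustrative shape only: values are placeholders, not captured trap data -->
<SnmpVarBinds>
  <SnmpVarBind>
    <OID>1.3.6.1.4.1.2.6.158.3.1.1.8</OID>
    <!-- Selected by AlertParameter1, shown as {0} in the alert message -->
    <Value>(placeholder)</Value>
  </SnmpVarBind>
  <SnmpVarBind>
    <OID>1.3.6.1.4.1.2.6.158.3.1.1.14</OID>
    <!-- Selected by AlertParameter2, shown as {1} after "EventID =" -->
    <Value>(placeholder)</Value>
  </SnmpVarBind>
</SnmpVarBinds>
```

The trailing [1] in each path picks the first variable binding with the given OID, so a trap that repeats an OID still yields a single value per parameter.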