Regular health checkup monitor for Lenovo BladeCenter module

IBM.BladeCenter.Module.HealthState (UnitMonitor)


Knowledge Base article:

Summary

This monitor regularly checks the overall health state of a BladeCenter module.

This monitor reports a module incident that occurred before the system started being monitored. It also determines whether to close the pending alerts associated with the module, or reset the state of the monitors for the module.

Configuration

You can disable this monitor through the Operations Manager's Operations Console. See the "Disable monitors" topic in the Operations Manager's Operations User's Guide for more information.
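The same result can also be expressed as an override in a custom, unsealed management pack. The fragment below is a minimal sketch, not the vendor's documented procedure: the override ID and the `IBMBC` reference alias are hypothetical, and the alias is assumed to point at the sealed management pack that defines this monitor and its target class.

```xml
<!-- Sketch only: belongs inside <Monitoring> of a custom, unsealed management pack. -->
<!-- "IBMBC" is a hypothetical alias for the management pack that defines the monitor. -->
<Overrides>
  <!-- Hypothetical override ID; sets the monitor's Enabled property to false for all modules. -->
  <MonitorPropertyOverride ID="Example.Override.DisableModuleHealthState" Context="IBMBC!IBM.BladeCenter.Module" Enforced="false" Monitor="IBMBC!IBM.BladeCenter.Module.HealthState" Property="Enabled">
    <Value>false</Value>
  </MonitorPropertyOverride>
</Overrides>
```

Using an override rather than editing the monitor keeps the sealed management pack untouched, so the change can be removed by deleting the override.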

You can also change the interval between the health checkups by overriding the value of the "IntervalSeconds" parameter of the monitor. See the "Override" topic in the Operations Manager's Operations User's Guide.
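As an illustration, an interval override might look like the following fragment in a custom, unsealed management pack. This is a sketch under assumptions: the override ID and the `IBMBC` reference alias are hypothetical, and the value 3600 is only an example. The monitor's source code sets "IntervalSeconds" to 7200 by default.

```xml
<!-- Sketch only: belongs inside <Monitoring> of a custom, unsealed management pack. -->
<!-- "IBMBC" is a hypothetical alias for the management pack that defines the monitor. -->
<Overrides>
  <!-- Hypothetical override ID; changes the checkup interval from 7200 to 3600 seconds. -->
  <MonitorConfigurationOverride ID="Example.Override.ModuleHealthState.IntervalSeconds" Context="IBMBC!IBM.BladeCenter.Module" Enforced="false" Monitor="IBMBC!IBM.BladeCenter.Module.HealthState" Parameter="IntervalSeconds">
    <Value>3600</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

A shorter interval means more frequent SNMP queries to the AMM, so it is worth keeping the interval well above the 300-second "TimeoutSeconds" value.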

The BladeCenter event is delivered to this monitor from the AMM (Advanced Management Module) of the BladeCenter via SNMP (Simple Network Management Protocol). It passes through the BladeCenter runtime support of the Hardware Management Pack installed on the management server that was designated to manage the BladeCenter during the Network Device Discovery process.

For the proper BladeCenter AMM SNMP settings that are required for the Hardware Management Pack to discover BladeCenter modules and report events, consult the Hardware Management Pack's User's Guide.

Causes

For details about the module incident, review the other monitors for the module. If no other alert or warning for the module is found, review the events in the Events view, then open the Lenovo BladeCenter Web Console console task in the Actions view and review the existing events. The latest state of this monitor reflects the severity level of the most recent overall health state of the module.

Resolutions

Review the health checkup report's details about the given module. Contact Lenovo support (see links below) if the reports or relevant articles do not provide enough information to help you resolve the problem.

After the problem is resolved, the overall health state of this monitor is automatically restored to the Healthy state. However, you must manually close any corresponding alerts that might have occurred.

Additional

For the proper AMM SNMP settings needed for the Hardware Management Pack, see the "Configuring BladeCenter SNMP settings" topic in the Lenovo Hardware Management Pack for Microsoft System Center Operations Manager Installation and User's Guide.

External

Links to Lenovo resources

Element properties:

Target: IBM.BladeCenter.Module
Parent Monitor: System.Health.PerformanceState
Category: Custom
Enabled: True
Alert Generate: False
Alert Auto Resolve: True
Monitor Type: IBM.BladeCenter.ModulesHealth.MonitorType
Remotable: True
Accessibility: Public
RunAs: Default

Source Code:

<UnitMonitor ID="IBM.BladeCenter.Module.HealthState" Accessibility="Public" Enabled="true" Target="IBM.BladeCenter.Module" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="IBM.BladeCenter.ModulesHealth.MonitorType" ConfirmDelivery="false">
  <Category>Custom</Category>
  <OperationalStates>
    <OperationalState ID="Healthy" MonitorTypeStateID="Healthy" HealthState="Success"/>
    <OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
    <OperationalState ID="Critical" MonitorTypeStateID="Error" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>7200</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <ModuleType>Module</ModuleType>
    <IPAddress>$Target/Property[Type="IBM.BladeCenter.Module"]/PrimaryMMIPAddress$</IPAddress>
    <CommunityString>$Target/Property[Type="IBM.BladeCenter.Module"]/CommunityString$</CommunityString>
    <DisplayName>$Target/Property[Type="System!System.Entity"]/DisplayName$</DisplayName>
  </Configuration>
</UnitMonitor>