Regular health checkup monitor for Lenovo System power supply condition

IBM.SystemX.PowerSupply.ComponentHealth (UnitMonitor)


Knowledge Base article:

Summary

This monitor periodically checks the overall health state of a specific power supply on the given system.

This monitor reports hardware problems of the power supply that occurred before the system started being monitored. It also determines whether to close the pending alerts associated with the power supply, or reset the state of the monitors for the power supply.

Configuration

You can disable this monitor through the Operations Manager's Operations Console. See the "Disable monitors" topic in the Operations Manager's Operations User's Guide for more information.
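
If you prefer to keep the change in an override management pack rather than making it through the console, the following is a minimal sketch of what such an override might look like. The override ID and the "HwMP" reference alias are hypothetical placeholders; the alias must match whatever alias your override management pack declares for the management pack that defines this monitor.

```xml
<Overrides>
  <!-- Hypothetical override ID and "HwMP" alias; adjust both to your override management pack. -->
  <MonitorPropertyOverride ID="Example.Override.PowerSupplyHealth.Disable" Context="HwMP!IBM.SystemX.PowerSupply" Enforced="false" Monitor="HwMP!IBM.SystemX.PowerSupply.ComponentHealth" Property="Enabled">
    <!-- Setting Enabled to false disables the monitor for every instance of the context class. -->
    <Value>false</Value>
  </MonitorPropertyOverride>
</Overrides>
```

Because the context here is the whole power supply class, the sketch disables the monitor for all power supplies; a narrower scope would need a group or a specific instance as the override target.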

You can also change the interval between the health checkups by overriding the value of the "IntervalSeconds" parameter of the monitor. See the "Override" topic in the Operations Manager's Operations User's Guide.
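
As an illustration, an override of the "IntervalSeconds" parameter stored in an override management pack could look roughly like the sketch below. The override ID, the "HwMP" reference alias, and the 3600-second value are assumptions chosen for the example; only the parameter name and the monitor and target IDs come from this article.

```xml
<Overrides>
  <!-- Hypothetical override ID and "HwMP" alias; the alias must point at the management pack that defines the monitor. -->
  <MonitorConfigurationOverride ID="Example.Override.PowerSupplyHealth.IntervalSeconds" Context="HwMP!IBM.SystemX.PowerSupply" Enforced="false" Monitor="HwMP!IBM.SystemX.PowerSupply.ComponentHealth" Parameter="IntervalSeconds">
    <!-- Example value only: run the health checkup every hour instead of the default 7200 seconds (two hours). -->
    <Value>3600</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

A shorter interval surfaces power supply problems sooner, but it also runs the checkup more often on every monitored system.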

The hardware event for this monitor is available only on a Lenovo system with the appropriate hardware sensors and a management controller (also called a service processor), such as an Integrated Management Module (IMM), a Baseboard Management Controller (BMC), a Remote Supervisor Adapter (RSA), or an equivalent management controller on an older Lenovo system.

This monitor depends on hardware instrumentation software, namely the IBM Director Platform Agent (also called Core Services) and the Intelligent Platform Management Interface (IPMI) driver stack. This software raises the hardware event to the WMI level so that the monitor can be notified. On certain configurations, the RSA daemon can be used in place of, or in parallel with, the IPMI driver stack. See the "Additional Information" section below for more information about the IBM Director Platform Agent, the IPMI driver stack, and the RSA daemon.

Causes

Details about the cause of the power supply's hardware problems are recorded in the alerts and in the state change record. The latest state of this monitor reflects the severity level of the most recent overall health state of the power supply.

Resolutions

Review the health checkup report's details about the given power supply. Contact Lenovo support (see links below) if the reports or relevant articles do not provide enough information to help you resolve the hardware problem.

After the hardware problem is resolved, the overall health state of this monitor is automatically restored to the Healthy state. However, you must manually close any corresponding alerts that might have occurred.

Additional

External

Links to Lenovo resources

Element properties:

Target: IBM.SystemX.PowerSupply
Parent Monitor: System.Health.PerformanceState
Category: Custom
Enabled: True
Alert Generate: False
Alert Auto Resolve: True
Monitor Type: IBM.HwComponents.Health.MonitorType
Remotable: True
Accessibility: Public
RunAs: Default

Source Code:

<UnitMonitor ID="IBM.SystemX.PowerSupply.ComponentHealth" Accessibility="Public" Enabled="true" Target="IBM.SystemX.PowerSupply" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="IBM.HwComponents.Health.MonitorType" ConfirmDelivery="false">
  <Category>Custom</Category>
  <OperationalStates>
    <OperationalState ID="Healthy" MonitorTypeStateID="Healthy" HealthState="Success"/>
    <OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
    <OperationalState ID="Critical" MonitorTypeStateID="Error" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>7200</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <ComponentID>$Target/Property[Type="IBM.SystemX.HWComponent"]/InstanceID$</ComponentID>
  </Configuration>
</UnitMonitor>