Health Service Handle Count Threshold

Microsoft.SystemCenter.Agent.HealthService.HandleCountThreshold (UnitMonitor)

This monitor checks that the "Process\Handle Count" counter for the "HealthService.exe" process does not exceed a set threshold over a series of consecutive samples. If every sample in the series exceeds the threshold, the monitor changes to a critical state, which rolls up to the "Health Service State" monitor. The "Health Service State" monitor is configured to run a recovery when its state is critical; the recovery automatically attempts to restart the System Center Management Health Service.
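The consecutive-samples check described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the monitor type's actual implementation: the function name `evaluate_states` is hypothetical, and the assumption that a single sample at or below the threshold returns the monitor to the healthy state is ours, not something the source states.

```python
from typing import Iterable, List


def evaluate_states(samples: Iterable[float], threshold: float, num_samples: int) -> List[str]:
    """Return the assumed monitor state after each sample.

    The state becomes "OverThreshold" only once `num_samples` samples in a
    row are greater than `threshold`. Assumption: any sample at or below the
    threshold resets the run and returns the state to "UnderThreshold".
    """
    states = []
    consecutive_over = 0
    for value in samples:
        if value > threshold:
            consecutive_over += 1
        else:
            consecutive_over = 0  # run is broken, start counting again
        over = consecutive_over >= num_samples
        states.append("OverThreshold" if over else "UnderThreshold")
    return states


# Illustrative handle counts only (not measured data).
example_samples = [5000, 6500, 6600, 6700, 6800, 6900, 5900]
example_states = evaluate_states(example_samples, threshold=6000, num_samples=5)
```

With these made-up values, only the sixth sample completes a run of five readings above 6,000, so a short spike of one or two samples would not change the state under this model.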

Knowledge Base article:

Summary

This unit monitor measures the Process\Handle Count counter for the Health Service process. If the handle count exceeds the configured threshold, a recovery attempts to restart the System Center Management Health Service so that the service does not continue to overwhelm the computer.

There are different thresholds depending on the role that the System Center Management Health Service is configured to perform. The following summarizes the default thresholds:

System Center Management Health Service Role    Handle Count Threshold
Agent                                           6,000
Management Server                               10,000

Below is the configuration for the recovery that attempts to restart the System Center Management Health Service:

System Center Management Health Service Role    Restart Recovery Behavior
Agent                                           Enabled
Management Server                               Disabled
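The two tables above can be read together as a small lookup: the role determines both the default threshold and whether a restart is attempted when the monitor goes critical. The sketch below encodes only the values listed in this article; the names `RoleDefaults`, `DEFAULTS`, and `restart_attempted` are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleDefaults:
    handle_count_threshold: int
    restart_recovery_enabled: bool


# Default values as listed in the tables of this article.
DEFAULTS = {
    "Agent": RoleDefaults(handle_count_threshold=6000, restart_recovery_enabled=True),
    "Management Server": RoleDefaults(handle_count_threshold=10000, restart_recovery_enabled=False),
}


def restart_attempted(role: str, monitor_state: str) -> bool:
    """Return True if, by default, a restart would be attempted for this role."""
    return monitor_state == "Critical" and DEFAULTS[role].restart_recovery_enabled
```

Under these defaults, a critical state on an agent triggers the restart recovery, while the same state on a management server leaves the restart to the operator.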

Causes

In brief, the potential causes are:

Too many rules and monitors are loaded from all the management packs this System Center Management Health Service has been configured with.

A misconfigured rule or monitor is collecting or processing too much data (for example, a performance counter collection rule that collects data every 1 second).

A high handle count can be caused by the System Center Management Health Service running many management packs. Each management pack may contain a lot of monitoring, and each rule or monitor uses a small amount of resources. When many management packs add up to many thousands of rules and monitors, the System Center Management Health Service may start consuming more resources.

This may be expected for this System Center Management Health Service depending on the type of monitoring the System Center Management Health Service is performing.

Another cause could be one or more rules or monitors that do not conform to best practices. An example is a performance counter rule that attempts to collect performance data every 1 second. Too many rules or monitors configured this way will cause the System Center Management Health Service and its related processes to consume more resources.
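To put the 1-second example in perspective, the arithmetic below compares how many samples a single rule produces per day at different collection intervals. The 120-second interval is the frequency this monitor itself uses; the function name `samples_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def samples_per_day(interval_seconds: int) -> int:
    """Number of samples one rule collects per day at the given interval."""
    return SECONDS_PER_DAY // interval_seconds


every_second = samples_per_day(1)        # 86400 samples per day
every_two_minutes = samples_per_day(120)  # 720 samples per day
ratio = every_second // every_two_minutes  # 120 times as many samples
```

A single rule sampling every second therefore generates 120 times the data of one sampling every two minutes, and the effect multiplies with each additional rule configured the same way.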

Resolutions

The default action for this monitor running on agents is to restart the System Center Management Health Service. Because this recovery is enabled by default on agents, no user action is required.

If you still see this monitor in a critical state, the System Center Management Health Service may not have restarted correctly, or the action account this agent has been configured with may not have the permissions required to restart the service.

If this is the case, start the System Center Management Health Service Windows service manually.

The hotfix provided in Knowledge Base article 968760 can correct some issues that result in this monitor changing to a critical state. Ensure that the hotfix from Knowledge Base article 968760 (http://go.microsoft.com/fwlink/?LinkId=196234) has been installed on any computers that are using too much memory.

Element properties:

Target: Microsoft.SystemCenter.HealthService
Parent Monitor: Microsoft.SystemCenter.HealthService.ServiceStateRollup
Category: PerformanceHealth
Enabled: True
Alert Generate: False
Alert Auto Resolve: True
Monitor Type: Microsoft.SystemCenter.Agent.Performance.ConsecutiveSamplesThreshold.MonitorType
Remotable: True
Accessibility: Public
RunAs: Default

Source Code:

<UnitMonitor ID="Microsoft.SystemCenter.Agent.HealthService.HandleCountThreshold" Accessibility="Public" Enabled="true" Target="SCLibrary!Microsoft.SystemCenter.HealthService" ParentMonitorID="Microsoft.SystemCenter.HealthService.ServiceStateRollup" Remotable="true" Priority="Normal" TypeID="Microsoft.SystemCenter.Agent.Performance.ConsecutiveSamplesThreshold.MonitorType" ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <OperationalStates>
    <OperationalState ID="HandleCountUnderThreshold" MonitorTypeStateID="UnderThreshold" HealthState="Success"/>
    <OperationalState ID="HandleCountOverThreshold" MonitorTypeStateID="OverThreshold" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/PrincipalName$</ComputerName>
    <ObjectName>Process</ObjectName>
    <CounterName>Handle Count</CounterName>
    <InstanceName>HealthService</InstanceName>
    <AllInstances>false</AllInstances>
    <Frequency>120</Frequency>
    <NumSamples>5</NumSamples>
    <Threshold>30000</Threshold>
    <Direction>greater</Direction>
  </Configuration>
</UnitMonitor>
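As a reading aid for the configuration above, the following Python sketch parses a copy of the `Configuration` values and derives roughly how long the handle count must stay above the threshold before the state changes. It assumes `Frequency` is expressed in seconds and approximates the window as frequency multiplied by the number of samples; the function name `summarize` and the abbreviated XML string are hypothetical and used only for this example.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the <Configuration> element from the source above.
CONFIG_XML = """
<Configuration>
  <ObjectName>Process</ObjectName>
  <CounterName>Handle Count</CounterName>
  <InstanceName>HealthService</InstanceName>
  <Frequency>120</Frequency>
  <NumSamples>5</NumSamples>
  <Threshold>30000</Threshold>
  <Direction>greater</Direction>
</Configuration>
"""


def summarize(config_xml: str) -> dict:
    """Extract the key settings and an approximate detection window."""
    root = ET.fromstring(config_xml.strip())
    frequency = int(root.findtext("Frequency"))    # assumed to be seconds
    num_samples = int(root.findtext("NumSamples"))
    return {
        "counter": root.findtext("ObjectName") + "\\" + root.findtext("CounterName"),
        "instance": root.findtext("InstanceName"),
        "threshold": int(root.findtext("Threshold")),
        # Approximation: five samples taken 120 seconds apart span about 10 minutes.
        "approx_window_seconds": frequency * num_samples,
    }


summary = summarize(CONFIG_XML)
```

Under the stated assumption, the configured values mean the counter is sampled every two minutes and a breach has to persist for about ten minutes (600 seconds) before the monitor changes state.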