Veeam VMware Collector: Health Service state change flow stalled

Veeam.Virt.Extensions.VMware.HealthService.StateChangeFlowStalled (Rule)

Knowledge Base article:

Summary

This rule belongs to a set of rules in the Veeam MP for VMware that monitor the status of the Ops Mgr Health Service running on Veeam Collector servers.

These rules watch for specific events that indicate data processing issues, workflow failures, and resource bottlenecks.

Source: HealthService

Event ID: 5300 OR 5302 OR 5304

Level: Error

Description: Local health service is not healthy. Monitor state change flow is stalled with pending acknowledgement.
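
To check whether these events are still being logged on a Collector, the criteria above can be pasted into the XML tab of an Event Viewer custom view. This is a minimal sketch, assuming the events appear in the Operations Manager log under the HealthService provider as listed above. It does not filter on level; adding `Level=2` to the System clause would restrict it to Error events.

```xml
<!-- Event Viewer custom view filter (sketch): HealthService events 5300, 5302, 5304 -->
<QueryList>
  <Query Id="0" Path="Operations Manager">
    <Select Path="Operations Manager">
      *[System[Provider[@Name='HealthService'] and (EventID=5300 or EventID=5302 or EventID=5304)]]
    </Select>
  </Query>
</QueryList>
```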

Causes

The Veeam Collector publishes data gathered from VMware vCenter for the local Ops Mgr Health Service to consume. This agentless monitoring method can place a high load on the Health Service. In the following situations the Ops Mgr Health Service may enter an unstable state:

Resolutions

Review the alert description for more detail on the cause. Check the repeat count: a very high value indicates multiple sustained failures to process data and requires investigation.

Some sporadic errors can occur during short periods of very heavy load, such as initial discovery processing (when a new vCenter target is added in the Veeam Virtualization Extensions UI, or when a large number of monitoring Jobs are moved to this Collector). If no new errors are logged after initial discovery and the Health Service has stabilized (review the Operations Manager event log for errors), this alert can be safely closed. However, review the guidance below to see whether the Veeam MP for VMware configuration can be optimized further.

Further details on the possible causes and troubleshooting steps for each are given below.

Review this Microsoft KB article to see more detail on troubleshooting unresponsive Health Service issues.

Use the Alerts View to see all current open issues for this object. Use the Events View to review any error and warning events for this object. Open a Performance View to see the performance metrics for this object and all contained objects. Open a Diagram View to analyze the relationships of this object to other components.

External

See the Help Center for more information including reference lists of all Rules and Monitors and full set of User Guides for the Veeam MP for VMware.

See the VMware Online Documentation for more information on VMware vSphere, in particular:

Element properties:

Target: Veeam.Virt.Extensions.VMware.Collector
Category: EventCollection
Enabled: True
Alert Generate: True
Alert Severity: Warning
Alert Priority: Normal
Remotable: True
Alert Message:
Veeam VMware Collector: Health Service state change flow stalled
{0}
Event Log: Operations Manager

Member Modules:

ID     Module Type  TypeId                           RunAs
DS     DataSource   Microsoft.Windows.EventProvider  Default
Alert  WriteAction  System.Health.GenerateAlert      Default

Source Code:

<Rule ID="Veeam.Virt.Extensions.VMware.HealthService.StateChangeFlowStalled" Enabled="onEssentialMonitoring" Target="Veeam.Virt.Extensions.VMware.Collector" ConfirmDelivery="true" Remotable="true" Priority="Normal" DiscardLevel="100">
  <Category>EventCollection</Category>
  <DataSources>
    <DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider">
      <ComputerName>.</ComputerName>
      <LogName>Operations Manager</LogName>
      <Expression>
        <And>
          <Expression>
            <SimpleExpression>
              <ValueExpression>
                <XPathQuery Type="String">PublisherName</XPathQuery>
              </ValueExpression>
              <Operator>Equal</Operator>
              <ValueExpression>
                <Value Type="String">HealthService</Value>
              </ValueExpression>
            </SimpleExpression>
          </Expression>
          <Expression>
            <Or>
              <Expression>
                <SimpleExpression>
                  <ValueExpression>
                    <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
                  </ValueExpression>
                  <Operator>Equal</Operator>
                  <ValueExpression>
                    <Value Type="UnsignedInteger">5300</Value>
                  </ValueExpression>
                </SimpleExpression>
              </Expression>
              <Expression>
                <SimpleExpression>
                  <ValueExpression>
                    <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
                  </ValueExpression>
                  <Operator>Equal</Operator>
                  <ValueExpression>
                    <Value Type="UnsignedInteger">5302</Value>
                  </ValueExpression>
                </SimpleExpression>
              </Expression>
              <Expression>
                <SimpleExpression>
                  <ValueExpression>
                    <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
                  </ValueExpression>
                  <Operator>Equal</Operator>
                  <ValueExpression>
                    <Value Type="UnsignedInteger">5304</Value>
                  </ValueExpression>
                </SimpleExpression>
              </Expression>
            </Or>
          </Expression>
        </And>
      </Expression>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="Alert" TypeID="SystemHealth!System.Health.GenerateAlert">
      <Priority>1</Priority>
      <Severity>1</Severity>
      <AlertOwner/>
      <AlertMessageId>$MPElement[Name="Veeam.Virt.Extensions.VMware.HealthService.StateChangeFlowStalled.AlertMessage"]$</AlertMessageId>
      <AlertParameters>
        <AlertParameter1>$Data/EventDescription$</AlertParameter1>
      </AlertParameters>
      <Suppression>
        <SuppressionValue/>
      </Suppression>
    </WriteAction>
  </WriteActions>
</Rule>
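
For readers adapting this filter in a custom rule, the three Or branches on EventDisplayNumber could be collapsed into a single regular-expression match. The fragment below is an illustrative sketch only, not the definition shipped in the Veeam MP. It assumes the data source's Expression element accepts the standard Operations Manager RegExExpression filter, and it reads EventDisplayNumber as a string so that the pattern can be applied.

```xml
<!-- Sketch: same match criteria, with the event IDs expressed as one regex -->
<Expression>
  <And>
    <Expression>
      <SimpleExpression>
        <ValueExpression>
          <XPathQuery Type="String">PublisherName</XPathQuery>
        </ValueExpression>
        <Operator>Equal</Operator>
        <ValueExpression>
          <Value Type="String">HealthService</Value>
        </ValueExpression>
      </SimpleExpression>
    </Expression>
    <Expression>
      <RegExExpression>
        <ValueExpression>
          <XPathQuery Type="String">EventDisplayNumber</XPathQuery>
        </ValueExpression>
        <Operator>MatchesRegularExpression</Operator>
        <!-- Anchored so that only 5300, 5302 or 5304 match exactly -->
        <Pattern>^(5300|5302|5304)$</Pattern>
      </RegExExpression>
    </Expression>
  </And>
</Expression>
```

The explicit Or form used in the rule above compares unsigned integers directly and avoids regex evaluation, which may be preferable on a heavily loaded Health Service.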