Job Queue Length

Microsoft.SystemCenter.ServiceManagementAutomation.Monitor.MessageQueueLength (UnitMonitor)

Checks the number of jobs in the Runbook worker queue.

Knowledge Base article:

Summary

Determines if the Job Queue length has exceeded the specified threshold. This queue is stored in the Service Management Automation database, so the monitor must be configured to have access to the database. See the Configuration section for information.

Causes

If the job queue length has reached the critical limit, the Runbook workers cannot keep up with the job load: jobs are being submitted faster than they are being processed.

Resolution

Increase the number of worker roles.

Configuration

For this monitor to work correctly, it must have permission to read the Service Management Automation database. Create a Run As Account that has read permissions to the database and add it to the Run As Profile called Microsoft Service Management Automation Database Account.

The following options can be configured on this monitor:

| Option | Definition | Default |
| --- | --- | --- |
| Alert On State | Health state for the monitor that generates an alert. | The monitor is in a critical health state |
| Alert Priority | Priority of the alert generated for this monitor. | Medium |
| Alert Severity | Severity of the alert generated for this monitor. | Critical |
| Auto-Resolve Alert | Specifies whether the alert should automatically be resolved when the monitor returns to a healthy state. | True |
| Enabled | Specifies whether the monitor should run. | True |
| ErrorThreshold | Number of messages in the queue that generates a critical health state. | 20 |
| Generates Alert | Specifies whether the monitor should generate an alert when changing to a warning or critical state. | True |
| Interval | Number of seconds between runs of the monitor. | 300 |
| WarningThreshold | Number of messages in the queue that generates a warning health state. | 10 |
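
The thresholds listed above can be changed with overrides rather than by editing the monitor itself. The fragment below is a minimal sketch of how that might look in a custom, unsealed management pack. It assumes that WarningThreshold and ErrorThreshold are exposed by the monitor type as overridable parameters under exactly those names, which the options list suggests but does not confirm. The override IDs, the `SMAMonitoring` reference alias, and the values 25 and 50 are hypothetical; the `SMA` alias is the one used in the monitor source for the class library.

```xml
<!-- Sketch only: IDs, the SMAMonitoring alias, and the values are illustrative assumptions. -->
<Monitoring>
  <Overrides>
    <!-- Raise the warning threshold from the default of 10 to 25 queued jobs. -->
    <MonitorConfigOverride ID="Example.SMA.Override.JobQueueLength.WarningThreshold" Context="SMA!Microsoft.SystemCenter.ServiceManagementAutomation.Server.RunbookWorker" Enforced="false" Monitor="SMAMonitoring!Microsoft.SystemCenter.ServiceManagementAutomation.Monitor.MessageQueueLength" Parameter="WarningThreshold">
      <Value>25</Value>
    </MonitorConfigOverride>
    <!-- Raise the critical threshold from the default of 20 to 50 queued jobs. -->
    <MonitorConfigOverride ID="Example.SMA.Override.JobQueueLength.ErrorThreshold" Context="SMA!Microsoft.SystemCenter.ServiceManagementAutomation.Server.RunbookWorker" Enforced="false" Monitor="SMAMonitoring!Microsoft.SystemCenter.ServiceManagementAutomation.Monitor.MessageQueueLength" Parameter="ErrorThreshold">
      <Value>50</Value>
    </MonitorConfigOverride>
  </Overrides>
</Monitoring>
```

Both aliases would need matching entries in the References section of the override management pack. Keeping the warning value below the error value preserves the healthy, warning, critical progression of the monitor's three states.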

Element properties:

Target: Microsoft.SystemCenter.ServiceManagementAutomation.Server.RunbookWorker
Parent Monitor: System.Health.PerformanceState
Category: AvailabilityHealth
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: Microsoft.SystemCenter.ServiceManagementAutomation.MonitorType.MessageQueueLength
Remotable: True
Accessibility: Public
Alert Message:
Job queue length has exceeded threshold.
Job Queue length has exceeded threshold. Last measured value is {0}.
RunAs: Default

Source Code:

<UnitMonitor ID="Microsoft.SystemCenter.ServiceManagementAutomation.Monitor.MessageQueueLength" Accessibility="Public" Enabled="true" Target="Microsoft.SystemCenter.ServiceManagementAutomation.Server.RunbookWorker" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SystemCenter.ServiceManagementAutomation.MonitorType.MessageQueueLength" ConfirmDelivery="false">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SystemCenter.ServiceManagementAutomation.Monitor.MessageQueueLength.AlertMessage">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Value$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="UnderThreshold" MonitorTypeStateID="UnderThreshold" HealthState="Success"/>
    <OperationalState ID="OverWarningThreshold" MonitorTypeStateID="OverWarningThreshold" HealthState="Warning"/>
    <OperationalState ID="OverErrorThreshold" MonitorTypeStateID="OverErrorThreshold" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <Interval>300</Interval>
    <DatabaseServer>$Target/Property[Type="SMA!Microsoft.SystemCenter.ServiceManagementAutomation.Server"]/DatabaseServerName$</DatabaseServer>
    <DatabaseInstance>$Target/Property[Type="SMA!Microsoft.SystemCenter.ServiceManagementAutomation.Server"]/DatabaseServerInstance$</DatabaseInstance>
    <DatabaseName>$Target/Property[Type="SMA!Microsoft.SystemCenter.ServiceManagementAutomation.Server"]/DatabaseName$</DatabaseName>
    <WarningThreshold>10</WarningThreshold>
    <ErrorThreshold>20</ErrorThreshold>
  </Configuration>
</UnitMonitor>