MECM Site server inbox schedule.box jobs backlog Monitor

MECM.SiteServer.SchedulerBacklog.NumberOfJobs.PerfThreshold.Monitor (UnitMonitor)

This monitor checks the backlog of schedule.box jobs on the site server and raises an alert when the backlog exceeds the threshold.

Knowledge Base article:

Summary

The total number of queued site server scheduler jobs has exceeded the standard operational threshold. The site server scheduler manages data transfer between sites, so a large backlog means that information is not being processed for transfer through the site hierarchy. If the condition is caused by an isolated event, such as a large software distribution package being sent to another site, it can resolve on its own given sufficient processing time. If the condition is caused by repeated events, such as large numbers of clients sending resynchronized inventory, it persists until you fix the root cause. While the queued scheduler job count remains above the threshold, the time required to complete site-to-site communication increases. This can delay the processing of software distribution to child sites and the refresh of inventory information at parent sites.

Causes

The site server scheduler job messages can exceed the threshold because:

Resolutions

To resolve this issue and lower the number of queued scheduler jobs:

Additional

If this alert generates too many false positives, you can change the threshold in the monitor's properties (for example, with an override) to a value more appropriate for your environment.
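When tuning the monitor, it can help to estimate how long a backlog must persist before an alert is raised. The sketch below is a simplified model rather than the monitor's actual implementation: it assumes samples arrive exactly every Frequency seconds and that the alert fires on the NumSamples-th consecutive breaching sample. The function name `detection_window_seconds` is hypothetical; the defaults (900 seconds, 12 samples) come from this monitor's configuration.

```python
def detection_window_seconds(frequency_s: int = 900, num_samples: int = 12) -> tuple[int, int]:
    """Estimate how long a backlog must persist before the monitor alerts.

    Simplified model (an assumption, not the real module): samples are taken
    exactly every `frequency_s` seconds and the alert fires on the
    `num_samples`-th consecutive sample above the threshold.
    """
    # Best case: the backlog crosses the threshold just before a sample,
    # so only the remaining (num_samples - 1) intervals must elapse.
    earliest = (num_samples - 1) * frequency_s
    # Worst case: the backlog crosses just after a sample, so almost a full
    # interval passes before the first breaching sample is even taken.
    latest = num_samples * frequency_s
    return earliest, latest


if __name__ == "__main__":
    earliest, latest = detection_window_seconds()
    print(f"Alert after roughly {earliest / 3600:.2f} to {latest / 3600:.2f} hours")
```

With the default values this gives a window of roughly 2.75 to 3 hours, so short-lived spikes above the threshold should not raise an alert under this model.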

This alert is best handled by Configuration Manager administrators.

Related Events:

You can also look for alerts indicating that the SMS_Executive service has stopped or that the site server has insufficient resources, such as CPU or memory. These conditions are potential causes of the site server scheduler jobs alert. Depending on the cause, other backlogs on the site server, such as scheduler send requests, are likely to be increasing as well and raising their own alerts. If the backlog is due to sender problems, you might also see sender-related alerts.

Element properties:

Target: MECM.SiteServerRoleBaseClass
Parent Monitor: System.Health.PerformanceState
Category: PerformanceHealth
Enabled: True
Object Name: SMS Scheduler
Instance Name: _Total
Counter Name: Number of Jobs
Frequency: 900
Alert Generate: True
Alert Severity: MatchMonitorHealth
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: System.Performance.ConsecutiveSamplesThreshold
Remotable: True
Accessibility: Public
Alert Message
MECM Site server inbox schedule.box jobs backlog alert

Instance {0}
Object {1}
Counter {2}
Has a value {3}
At time {4}
RunAs: Default

Source Code:

<UnitMonitor ID="MECM.SiteServer.SchedulerBacklog.NumberOfJobs.PerfThreshold.Monitor" Accessibility="Public" Enabled="true" Target="MECM.SiteServerRoleBaseClass" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Perf!System.Performance.ConsecutiveSamplesThreshold" ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="MECM.SiteServer.SchedulerBacklog.NumberOfJobs.PerfThreshold.Monitor.AlertMessage">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/InstanceName$</AlertParameter1>
      <AlertParameter2>$Data/Context/ObjectName$</AlertParameter2>
      <AlertParameter3>$Data/Context/CounterName$</AlertParameter3>
      <AlertParameter4>$Data/Context/SampleValue$</AlertParameter4>
      <AlertParameter5>$Data/Context/TimeSampled$</AlertParameter5>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="ConditionFalse" MonitorTypeStateID="ConditionFalse" HealthState="Success"/>
    <OperationalState ID="ConditionTrue" MonitorTypeStateID="ConditionTrue" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <CounterName>Number of Jobs</CounterName>
    <ObjectName>SMS Scheduler</ObjectName>
    <InstanceName>_Total</InstanceName>
    <AllInstances>false</AllInstances>
    <Frequency>900</Frequency>
    <Threshold>10000</Threshold>
    <Direction>greater</Direction>
    <NumSamples>12</NumSamples>
  </Configuration>
</UnitMonitor>
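
The configuration above combines a threshold (10000), a direction (greater), and a number of consecutive samples (12). The following sketch illustrates how such a consecutive-samples check might behave. It is a simplified model, not the actual System.Performance.ConsecutiveSamplesThreshold module: the function name `consecutive_samples_states` is hypothetical, and the assumption that the state returns to Success on the first non-breaching sample is made for illustration only.

```python
from typing import Iterable, List


def consecutive_samples_states(
    samples: Iterable[float],
    threshold: float = 10000,
    num_samples: int = 12,
) -> List[str]:
    """Return the modelled health state after each counter sample.

    Assumption: the state is "Error" once `num_samples` consecutive samples
    are greater than `threshold`, and returns to "Success" as soon as one
    sample is not greater than the threshold.
    """
    states = []
    streak = 0  # number of consecutive samples above the threshold so far
    for value in samples:
        streak = streak + 1 if value > threshold else 0
        states.append("Error" if streak >= num_samples else "Success")
    return states


if __name__ == "__main__":
    # Eleven breaching samples, one dip, then twelve breaching samples.
    samples = [10001] * 11 + [9000] + [10001] * 12
    states = consecutive_samples_states(samples)
    print(states[10], states[11], states[-1])
```

In this model the single dip after eleven breaching samples resets the count, so only the final sample of the second run reaches the Error state.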