This monitor checks the backlog of schedule.box jobs on the site server. It raises an alert if the backlog exceeds the threshold.
The total number of queued site server scheduler jobs has exceeded the standard operational threshold. The site server scheduler manages data transfer between sites, so a large backlog means that information is not being processed for sending through the site hierarchy. If this condition is caused by an isolated event, such as a large software distribution package being sent to another site, the problem can resolve on its own given sufficient processing time. If the condition is caused by repeated events, such as large numbers of clients sending resynchronized inventory, the problem is not resolved until you fix the root cause. If the number of queued scheduler jobs remains above the threshold, the time required to complete site-to-site communication increases. This could result in delays in processing software distribution to child sites or in refreshing inventory information at parent sites.
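One way to tell an isolated spike from a recurring cause is to check whether successive backlog readings are falling or rising. The sketch below is only an illustration: the function name `backlog_trend`, the 5% tolerance, and the comparison of the first half of the readings with the second half are assumptions chosen for this example and are not part of the management pack.

```python
from typing import Sequence


def backlog_trend(samples: Sequence[float], tolerance: float = 0.05) -> str:
    """Classify backlog readings as 'growing', 'draining', or 'steady'.

    Compares the mean of the later half of the readings with the mean of
    the earlier half. A relative change within `tolerance` counts as steady.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    half = len(samples) // 2
    earlier = sum(samples[:half]) / half
    later = sum(samples[-half:]) / half
    if earlier == 0:
        # Avoid dividing by zero: any backlog at all is growth from nothing.
        return "growing" if later > 0 else "steady"
    change = (later - earlier) / earlier
    if change > tolerance:
        return "growing"
    if change < -tolerance:
        return "draining"
    return "steady"
```

With readings of 12000, 11000, 9000, and 6000 jobs the function returns `draining`, which is consistent with a one-off event working itself out. Readings of 8000, 9500, 11000, and 12500 return `growing`, which points toward a root cause that still needs fixing.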
The site server scheduler job messages can exceed the threshold because:
The SMS_Executive service is stopped or not responding on the site server.
The site server scheduler jobs processing rate is slower than the incoming rate of jobs to be scheduled.
The site server has insufficient resources, such as CPU, memory, or disk space.
The administrator has sent a large package, or several packages, to one or more child sites.
The sender is not functioning properly between the two sites. This could be due to several reasons, such as network connectivity issues, account access problems, or sender configuration errors.
The administrator has deleted a large number of inventory records, which triggered a large number of inventory resynchronization requests.
A large number of clients have attached to a new site, which triggered a large number of inventory resynchronization requests.
To resolve this issue and lower the number of queued scheduler jobs:
Verify that the threshold for this alert is not set too low and sits above the normal operating level for this specific site.
Verify that the SMS_Executive service is running on the site server. If it is stopped or not responding, restart the service.
Verify that the sender can connect to the destination site.
Evaluate the processing rates for software distribution to child sites. If the baseline performance for typical package processing is unacceptable, add resources or upgrade to a more powerful computer.
Reduce the size of packages sent to child sites or send them less frequently.
Avoid actions that generate large amounts of traffic between sites, such as inventory resynchronization.
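The service and sender checks in the steps above can be scripted. The following is a minimal sketch under stated assumptions: the service state is read from the text printed by the Windows `sc query SMS_EXECUTIVE` command, and the sender is assumed to reach the destination site over SMB on TCP port 445. Confirm both for your environment. The function names `service_state` and `can_connect` are hypothetical.

```python
import socket


def service_state(sc_query_output: str) -> str:
    """Return the state name (for example 'RUNNING') from `sc query` output."""
    for line in sc_query_output.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "STATE":
            # The value looks like '4  RUNNING': a numeric code, then a name.
            parts = value.split()
            return parts[1] if len(parts) > 1 else "UNKNOWN"
    return "UNKNOWN"


def can_connect(host: str, port: int = 445, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A state other than `RUNNING` suggests restarting the service. A `False` result from `can_connect` only shows that the port is unreachable from the machine running the check; it does not rule out account or sender configuration problems.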
If this alert generates too many false positives, you can change the values on the threshold tab of the monitor properties to ones more appropriate for your environment.
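When choosing a replacement threshold, it can help to derive it from the backlog levels the site normally sees. The sketch below assumes you have a list of historical "Number of Jobs" readings for the site. The function name `suggest_threshold`, the 95th percentile, and the 1.5x headroom factor are illustrative assumptions, not recommended values.

```python
import math
from typing import Sequence


def suggest_threshold(history: Sequence[float], percent: int = 95,
                      headroom: float = 1.5) -> int:
    """Suggest a threshold from historical backlog readings.

    Takes the nearest-rank percentile of the readings as the normal
    operating level and multiplies it by a headroom factor.
    """
    if not history:
        raise ValueError("history must not be empty")
    ordered = sorted(history)
    # Nearest-rank percentile: smallest value with at least `percent`
    # percent of the readings at or below it.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    baseline = ordered[rank - 1]
    return math.ceil(baseline * headroom)
```

For a site whose readings are 100, 200, and so on up to 2000 jobs, the 95th percentile is 1900 and the suggested threshold is 2850. A site that routinely sits near the default of 10000 would get a proportionally higher value.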
This alert is best handled by Configuration Manager administrators.
Related Events:
You can also look for alerts that relate to the SMS_Executive service being stopped or to the site server having insufficient resources, such as CPU or memory. These alerts indicate potential causes of the site server scheduler jobs alert. Depending on the cause, it is likely that other types of backlogs, such as scheduler send requests on the site server, are also increasing and that corresponding alerts are occurring. If the backlog is due to sender problems, you might see sender-related alerts.
Target | Microsoft.SystemCenter2012.ConfigurationManager.SiteServerRoleBaseClass
Parent Monitor | System.Health.PerformanceState
Category | PerformanceHealth
Enabled | True
Instance Name | SMS Scheduler
Counter Name | Number of Jobs
Frequency | 900
Alert Generate | True
Alert Severity | MatchMonitorHealth
Alert Priority | Normal
Alert Auto Resolve | True
Monitor Type | System.Performance.ConsecutiveSamplesThreshold
Remotable | True
Accessibility | Public
Alert Message |
RunAs | Default
Comment | SIV:SVC0020, CreatedByMyFriend at 10/15/2011 5:25:46 PM
<UnitMonitor ID="Microsoft.SystemCenter2012.ConfigurationManager.Perf_Threshold_Site_server_inbox_schedule_box_jobs_backlog_monitor" Comment="SIV:SVC0020, CreatedByMyFriend at 10/15/2011 5:25:46 PM" Accessibility="Public" Enabled="onEssentialMonitoring" Target="SCCM!Microsoft.SystemCenter2012.ConfigurationManager.SiteServerRoleBaseClass" ParentMonitorID="SystemHealth!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="SystemPerf!System.Performance.ConsecutiveSamplesThreshold" ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SystemCenter2012.ConfigurationManager.Perf_Threshold_Site_server_inbox_schedule_box_jobs_backlog_monitor_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/InstanceName$</AlertParameter1>
      <AlertParameter2>$Data/Context/ObjectName$</AlertParameter2>
      <AlertParameter3>$Data/Context/CounterName$</AlertParameter3>
      <AlertParameter4>$Data/Context/Value$</AlertParameter4>
      <AlertParameter5>$Data/Context/TimeSampled$</AlertParameter5>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="ConditionFalse" MonitorTypeStateID="ConditionFalse" HealthState="Success"/>
    <OperationalState ID="ConditionTrue" MonitorTypeStateID="ConditionTrue" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <CounterName>Number of Jobs</CounterName>
    <ObjectName>SMS Scheduler</ObjectName>
    <InstanceName>_Total</InstanceName>
    <AllInstances>false</AllInstances>
    <Frequency>900</Frequency>
    <Threshold>10000</Threshold>
    <Direction>greater</Direction>
    <NumSamples>12</NumSamples>
  </Configuration>
</UnitMonitor>
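The configuration above combines a threshold of 10000, a sampling frequency of 900, and 12 consecutive samples. Assuming the frequency is in seconds, the backlog would have to stay above the threshold for roughly three hours (12 x 900 s) before the monitor turns unhealthy. The following sketch models that behaviour in simplified form. It assumes that a single reading at or below the threshold resets the count and returns the monitor to healthy, which may not match the actual monitor type in every detail; the function names are hypothetical.

```python
from typing import Iterable, List


def evaluate_states(samples: Iterable[float], threshold: float = 10000,
                    num_samples: int = 12) -> List[str]:
    """Return the health state after each sample.

    The state is 'Error' once `num_samples` readings in a row are strictly
    greater than `threshold`, and 'Success' otherwise.
    """
    streak = 0
    states: List[str] = []
    for value in samples:
        # Direction is 'greater': only readings above the threshold count.
        streak = streak + 1 if value > threshold else 0
        states.append("Error" if streak >= num_samples else "Success")
    return states


def breach_window_seconds(frequency: int = 900, num_samples: int = 12) -> int:
    """Approximate time span covered by the consecutive samples."""
    return frequency * num_samples
```

Feeding the function twelve readings of 12000 followed by one reading of 9000 gives eleven `Success` states, then `Error` on the twelfth reading, then `Success` again. A short burst of jobs that clears within a few sampling intervals would therefore not raise the alert.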