This monitor checks for long-running SQL Server Agent jobs.
Note that the SQL Server Agent Windows service is not supported by any edition of SQL Server Express, so no corresponding object is discovered for those instances. This monitor is disabled by default; use overrides to enable it when necessary.
This monitor checks for long-running SQL Server Agent jobs. A warning or error alert is raised if a job has been running for longer than the configured threshold.
An unhealthy state is caused by a SQL Server Agent job that has run longer than the defined threshold. This could indicate a problem with the job.
The SQL Server Agent runs SQL Server tasks that are scheduled to occur at specific times or intervals. It also detects specific conditions for which administrators have defined an action, such as alerting someone by page or e-mail, or running a task that addresses the condition. In addition, the SQL Server Agent runs replication tasks defined by administrators.
To identify the job that caused the warning or error state, examine the context data for the state change or alert.
Check SQL Server Management Studio to identify which jobs are running. If any job is running longer than expected, investigate why.
Use sp_help_jobactivity to see information about currently running jobs.
Alternatively, if it is expected for some agent jobs to run for a long time:
Override the monitor to change the thresholds for this specific SQL Server instance or for all instances.
Disable the monitor for this specific SQL Server instance or for all instances.
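As a rough illustration of the override options above, an override management pack might contain a fragment like the following. This is a sketch under stated assumptions, not a tested management pack. The override IDs and the `SQL2008Mon` reference alias are hypothetical; the alias must match whatever your override pack declares for the management pack that defines this monitor. The sketch also assumes that the overridable parameter IDs match the `Threshold1` and `Threshold2` configuration elements in the monitor definition.

```xml
<Overrides>
  <!-- Enable the monitor (it ships disabled) for all objects of the target class.
       The ID and the SQL2008Mon alias are hypothetical placeholders. -->
  <MonitorPropertyOverride ID="Example.Override.LongRunningJobs.Enabled"
                           Context="SQL2008Core!Microsoft.SQLServer.2008.Agent"
                           Enforced="false"
                           Monitor="SQL2008Mon!Microsoft.SQLServer.2008.Agent.LongRunningJobs"
                           Property="Enabled">
    <Value>true</Value>
  </MonitorPropertyOverride>
  <!-- Raise the lower (warning) threshold from 60 to 90 minutes.
       Assumes the parameter ID is Threshold1, as in the monitor configuration. -->
  <MonitorConfigurationOverride ID="Example.Override.LongRunningJobs.Threshold1"
                                Context="SQL2008Core!Microsoft.SQLServer.2008.Agent"
                                Enforced="false"
                                Monitor="SQL2008Mon!Microsoft.SQLServer.2008.Agent.LongRunningJobs"
                                Parameter="Threshold1">
    <Value>90</Value>
  </MonitorConfigurationOverride>
  <!-- Raise the upper (critical) threshold from 120 to 180 minutes.
       Assumes the parameter ID is Threshold2. -->
  <MonitorConfigurationOverride ID="Example.Override.LongRunningJobs.Threshold2"
                                Context="SQL2008Core!Microsoft.SQLServer.2008.Agent"
                                Enforced="false"
                                Monitor="SQL2008Mon!Microsoft.SQLServer.2008.Agent.LongRunningJobs"
                                Parameter="Threshold2">
    <Value>180</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

Setting the `Enabled` property value to `false` instead would disable the monitor for the chosen context. The 90 and 180 minute values are arbitrary examples; pick thresholds that reflect how long your jobs normally run.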
Name | Description | Default Value |
Alert Priority | Defines Alert Priority. | Normal |
Alert Severity | Defines Alert Severity. | MatchMonitorHealth |
Enabled | Enables or disables the workflow. | No |
Generates Alerts | Defines whether the workflow generates an Alert. | Yes |
Interval (seconds) | This monitor uses a script to perform its monitoring of long running jobs. This is the interval (in seconds) between executions of that script. | 600 |
Lower Threshold (minutes) | The lower threshold (in minutes) for this monitor. By default, exceeding this threshold will result in the monitor changing to at least a warning state. | 60 |
Synchronization Time | The synchronization time specified by using a 24-hour format. May be omitted. | |
Timeout (seconds) | The amount of time (in seconds) that the script is allowed to run. | 300 |
Upper Threshold (minutes) | The upper threshold (in minutes) for this monitor. By default, exceeding this threshold will result in the monitor changing to a critical state. Being between this threshold and the lower threshold (inclusive) will result (by default) in the monitor being in a warning state. | 120 |
Target | Microsoft.SQLServer.2008.Agent |
Parent Monitor | System.Health.PerformanceState |
Category | PerformanceHealth |
Enabled | False |
Alert Generate | True |
Alert Severity | MatchMonitorHealth |
Alert Priority | Normal |
Alert Auto Resolve | True |
Monitor Type | Microsoft.SQLServer.2008.AgentLongRunningJobsProvider |
Remotable | True |
Accessibility | Public |
Alert Message | |
RunAs | Default |
<UnitMonitor ID="Microsoft.SQLServer.2008.Agent.LongRunningJobs" Accessibility="Public" Enabled="false" Target="SQL2008Core!Microsoft.SQLServer.2008.Agent" ParentMonitorID="SystemHealth!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.2008.AgentLongRunningJobsProvider" ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SQLServer.2008.Agent.LongRunningJobs.AlertMessage">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Host/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</AlertParameter1>
      <AlertParameter2>$Target/Host/Property[Type="SQL!Microsoft.SQLServer.ServerRole"]/InstanceName$</AlertParameter2>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="UnderThreshold1" MonitorTypeStateID="UnderThreshold1" HealthState="Success"/>
    <OperationalState ID="OverThreshold1UnderThreshold2" MonitorTypeStateID="OverThreshold1UnderThreshold2" HealthState="Warning"/>
    <OperationalState ID="OverThreshold2" MonitorTypeStateID="OverThreshold2" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>600</IntervalSeconds>
    <SyncTime/>
    <ConnectionString>$Target/Host/Property[Type="SQL!Microsoft.SQLServer.DBEngine"]/ConnectionString$</ConnectionString>
    <Threshold1>60</Threshold1>
    <Threshold2>120</Threshold2>
    <TimeoutSeconds>300</TimeoutSeconds>
  </Configuration>
</UnitMonitor>