This monitor checks for long running SQL Agent jobs. Note that this monitor is disabled by default. Use overrides to enable it when necessary.
Note that SQL Server Agent Service is not supported by any edition of SQL Server Express.
This monitor checks for long-running SQL Agent jobs. A Warning or Error alert will appear if a job has been running for longer than the configured threshold. Note that this monitor is disabled by default. Use overrides to enable it when necessary.
Note that SQL Server Agent Linux Service is not supported by any edition of SQL Server Express.
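Because the monitor ships disabled, enabling it means adding an override in a custom management pack. The fragment below is a minimal sketch of such an override, not a definitive implementation. The override ID and the `SqlMonL` alias are hypothetical; the alias stands for a reference to whichever management pack contains the monitor, and `SqlDiscL` is assumed to resolve as it does in the monitor definition.

```xml
<Monitoring>
  <Overrides>
    <!-- Hypothetical override ID; "SqlMonL" is an assumed alias for the
         management pack that defines the monitor. -->
    <MonitorPropertyOverride ID="Custom.Override.Agent.LongRunningJobs.Enable"
                             Context="SqlDiscL!Microsoft.SQLServer.Linux.Agent"
                             Enforced="false"
                             Monitor="SqlMonL!Microsoft.SQLServer.Linux.Monitor.Agent.LongRunningJobs"
                             Property="Enabled">
      <Value>true</Value>
    </MonitorPropertyOverride>
  </Overrides>
</Monitoring>
```

The same result can usually be achieved from the Operations console by overriding the Enabled parameter of the monitor.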
By default, this monitor does not monitor jobs that have the schedule type 'Start automatically when SQL Server Agent starts', because these jobs often run until SQL Agent stops (i.e. continuously). Usually, SQL Server Replication uses such jobs, but in some cases, jobs with the 'Start automatically when SQL Server Agent starts' schedule type may run for a relatively short interval. To monitor these jobs, override the 'Included continuously executed jobs' parameter with a comma-delimited list of job names. Each job name in the list must meet the requirements of one of the following identifier classes:
1) Regular:
can contain any character except the comma (,) and the double quote (");
must not start or end with a white-space character.
2) Delimited:
can contain any characters and must be enclosed in double quotes;
double quotes inside the name must be escaped by doubling them.
Any name belonging to either of the classes above must be from 1 to 128 characters long, excluding the delimiter characters.
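As an illustration of the two identifier classes, a populated list might look like the fragment below. The job names are hypothetical. Showing the value inside the `IncludedJobs` configuration element is an assumption based on the empty `<IncludedJobs/>` element in the monitor definition; in practice the value is normally supplied through an override of the 'Included continuously executed jobs' parameter.

```xml
<!-- Hypothetical job names, for illustration only.
     1st entry: regular name (no comma, no double quote).
     2nd entry: delimited, because the name contains a comma.
     3rd entry: delimited, with embedded double quotes escaped by doubling. -->
<IncludedJobs>Nightly ETL,"Load, then transform","Archive ""cold"" data"</IncludedJobs>
```

Under the rules above, this list would name three jobs: Nightly ETL; Load, then transform; and Archive "cold" data.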
An unhealthy state is caused by a SQL Server Agent job that has been running longer than the defined threshold. This could indicate a problem with the job.
The SQL Server Agent is responsible for running SQL Server tasks that are scheduled at specific time periods or intervals, as well as detecting specific conditions for which administrators have defined an action, such as alerting someone through pages or email, or a task that will address the conditions. The SQL Server Agent is also used for running replication tasks defined by administrators.
To identify the job that caused the warning or error state, examine the context data for the state change or alert.
Check SQL Server Management Studio to identify which jobs are running. If any job is running longer than expected, investigate the cause.
Use sp_help_jobactivity to see information about currently running jobs.
Alternatively, if some agent jobs are expected to run for a long time:
Override the monitor to change the thresholds for this specific instance of SQL Server or for all instances.
Disable the monitor for this specific instance of SQL Server or for all instances.
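A threshold change of this kind could be sketched as a configuration override in a custom management pack, as shown below. This is an illustration under stated assumptions: the override ID and the `SqlMonL` alias are hypothetical, the value 90 is an arbitrary example, and the parameter name `Threshold1` is assumed to match the configuration element of the same name in the monitor definition. The actual overridable parameter ID is defined by the monitor type and should be verified before use.

```xml
<Monitoring>
  <Overrides>
    <!-- Hypothetical override ID and "SqlMonL" alias.
         Parameter="Threshold1" is an assumption; check the monitor type
         for the real overridable parameter ID. -->
    <MonitorConfigurationOverride ID="Custom.Override.Agent.LongRunningJobs.WarningThreshold"
                                  Context="SqlDiscL!Microsoft.SQLServer.Linux.Agent"
                                  Enforced="false"
                                  Monitor="SqlMonL!Microsoft.SQLServer.Linux.Monitor.Agent.LongRunningJobs"
                                  Parameter="Threshold1">
      <!-- Example value only: raise the warning threshold to 90 minutes -->
      <Value>90</Value>
    </MonitorConfigurationOverride>
  </Overrides>
</Monitoring>
```

With only a class in `Context`, the override applies to all instances of that class. Limiting it to one instance is typically done by selecting a specific object when creating the override in the Operations console.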
Name | Description | Default Value |
Alert Priority | Defines Alert Priority. | Normal |
Alert Severity | Defines Alert Severity. | Match monitor’s health |
Critical Threshold (minutes) | The monitor changes its state to Critical if a job's running time exceeds this threshold. A running time between the warning threshold and this threshold (inclusive) results in a Warning state. | 120 |
Enabled | Enables or disables the workflow. | No |
Generates Alerts | Defines whether the workflow generates an Alert. | Yes |
Included continuously executed jobs | Some SQL Agent jobs may run indefinitely (until SQL Agent stops). They usually have the schedule type 'Start automatically when SQL Server Agent starts'; for example, SQL Server Replication often uses such jobs. Because these jobs would cause false alerts, the monitor does not take them into account by default. However, there may be exceptions where such jobs run for only a short time. To monitor such jobs, specify their names as a comma-delimited list. | |
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 600 |
Synchronization Time | The synchronization time specified by using a 24-hour format. May be omitted. | |
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 300 |
Timeout for query execution (seconds) | The workflow will fail and register an event, if the query execution takes longer than the specified period. | 60 |
Timeout for database connection (seconds) | The workflow will fail and register an event, if it cannot access the database during the specified period. | 15 |
Warning Threshold (minutes) | Exceeding this threshold will result in the monitor changing to at least a Warning state. | 60 |
Target | Microsoft.SQLServer.Linux.Agent |
Parent Monitor | System.Health.PerformanceState |
Category | PerformanceHealth |
Enabled | False |
Alert Generate | True |
Alert Severity | MatchMonitorHealth |
Alert Priority | Normal |
Alert Auto Resolve | True |
Monitor Type | Microsoft.SQLServer.Linux.MonitorType.Agent.LongRunningJobs |
Remotable | True |
Accessibility | Public |
Alert Message | |
RunAs | Default |
<UnitMonitor ID="Microsoft.SQLServer.Linux.Monitor.Agent.LongRunningJobs" Accessibility="Public" Enabled="false" Target="SqlDiscL!Microsoft.SQLServer.Linux.Agent" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.Linux.MonitorType.Agent.LongRunningJobs" ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SQLServer.Linux.Monitor.Agent.LongRunningJobs.AlertMessage">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</AlertParameter1>
      <AlertParameter2>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</AlertParameter2>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="UnderThreshold1" MonitorTypeStateID="UnderThreshold1" HealthState="Success"/>
    <OperationalState ID="OverThreshold1UnderThreshold2" MonitorTypeStateID="OverThreshold1UnderThreshold2" HealthState="Warning"/>
    <OperationalState ID="OverThreshold2" MonitorTypeStateID="OverThreshold2" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <MachineName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</MachineName>
    <NetbiosComputerName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/NetbiosComputerName$</NetbiosComputerName>
    <InstanceName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</InstanceName>
    <ConnectionString>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/ConnectionString$</ConnectionString>
    <InstanceVersion>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Version$</InstanceVersion>
    <InstanceEdition>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Edition$</InstanceEdition>
    <Threshold1>60</Threshold1>
    <Threshold2>120</Threshold2>
    <IncludedJobs/>
    <SqlExecTimeoutSeconds>60</SqlExecTimeoutSeconds>
    <SqlTimeoutSeconds>15</SqlTimeoutSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <IntervalSeconds>600</IntervalSeconds>
    <SyncTime/>
  </Configuration>
</UnitMonitor>