MSSQL on Linux: A SQL job failed to complete successfully

Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule (Rule)

A SQL Server Agent job failed. SQL Server Agent runs SQL Server tasks scheduled for specific times or intervals, and detects conditions for which administrators have defined an action, such as alerting someone by pager or e-mail, or running a task that addresses the condition. SQL Server Agent also runs replication tasks defined by administrators. Note that this rule is disabled by default. Use overrides to enable it when necessary.
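
Because the rule ships disabled, enabling it takes an override stored in a separate, unsealed management pack. The fragment below is a minimal sketch of such an override, not content taken from this management pack. The override ID and the SqlMonL alias are hypothetical. The sketch assumes that the References section of the custom pack defines SqlDiscL for the pack that contains the Microsoft.SQLServer.Linux.Agent class and SqlMonL for the pack that contains this rule.

```xml
<!-- Sketch: enable the rule for all Microsoft.SQLServer.Linux.Agent instances. -->
<!-- Hypothetical names: the override ID and the SqlMonL alias. Both aliases must match your own References section. -->
<Monitoring>
  <Overrides>
    <RulePropertyOverride ID="Custom.SQLServer.Linux.Overrides.EnableJobFailedRule"
                          Context="SqlDiscL!Microsoft.SQLServer.Linux.Agent"
                          Enforced="false"
                          Rule="SqlMonL!Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule"
                          Property="Enabled">
      <Value>true</Value>
    </RulePropertyOverride>
  </Overrides>
</Monitoring>
```

Scoping Context to a narrower class or group, rather than to every agent, is usually preferable when only some instances run jobs that need alerting.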

Knowledge Base article:

Summary

A SQL Server Agent job failed. SQL Server Agent runs SQL Server tasks scheduled for specific times or intervals, and detects conditions for which administrators have defined an action, such as alerting someone by pager or e-mail, or running a task that addresses the condition. SQL Server Agent also runs replication tasks defined by administrators.

Configuration

This rule detects failures only for jobs that are configured to write an event log notification when they fail. To configure a job with the event log notification, you can do the following:

Resolutions

To troubleshoot this failure, review the event associated with the alert to determine the specific jobs and job steps that failed. Also check the historical outcomes of the job to determine the last date on which it succeeded. To view the job execution history, you can do the following:

Also, check that the service account used by SQL Server Agent is a member of the Domain Users group. The LocalSystem account does not have network access rights, so if your jobs require resources across the network, or if you want to notify operators by e-mail or pager, the SQL Server Agent service must run under an account that is a member of the Domain Users group.

Overridable Parameters

Name | Description | Default Value
Enabled | Enables or disables the workflow. | No
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 300
Priority | Defines Alert Priority. | 1
Severity | Defines Alert Severity. | 2
Synchronization Time | The synchronization time specified by using a 24-hour format. May be omitted. |
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 200
Timeout for query execution (seconds) | The workflow will fail and register an event if the query execution takes longer than the specified period. | 60
Timeout for database connection (seconds) | The workflow will fail and register an event if it cannot access the database during the specified period. | 15
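
As an illustration of how the parameters above could be changed, the following sketch overrides the collection interval and the alert severity. The module IDs (_F6DA1507_12AF_11D3_AB21_00A0C98620CE_ and GenerateAlert) come from the rule definition on this page. The override IDs and the SqlMonL alias are hypothetical, and the parameter name IntervalSeconds is an assumption based on the element name in the rule configuration; check the overridable parameter IDs declared by the module types before relying on it.

```xml
<!-- Sketch: override two parameters of the rule from a custom, unsealed management pack. -->
<!-- Hypothetical names: the override IDs and the SqlMonL alias. -->
<Monitoring>
  <Overrides>
    <!-- Assumption: the data source exposes its interval under the parameter ID "IntervalSeconds". -->
    <RuleConfigurationOverride ID="Custom.SQLServer.Linux.Overrides.JobFailedRule.Interval"
                               Context="SqlDiscL!Microsoft.SQLServer.Linux.Agent"
                               Enforced="false"
                               Rule="SqlMonL!Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule"
                               Parameter="IntervalSeconds"
                               Module="_F6DA1507_12AF_11D3_AB21_00A0C98620CE_">
      <!-- Run every 600 seconds instead of the default 300. -->
      <Value>600</Value>
    </RuleConfigurationOverride>
    <RuleConfigurationOverride ID="Custom.SQLServer.Linux.Overrides.JobFailedRule.Severity"
                               Context="SqlDiscL!Microsoft.SQLServer.Linux.Agent"
                               Enforced="false"
                               Rule="SqlMonL!Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule"
                               Parameter="Severity"
                               Module="GenerateAlert">
      <!-- Lower the alert severity from the default 2 to 1. -->
      <Value>1</Value>
    </RuleConfigurationOverride>
  </Overrides>
</Monitoring>
```

If the interval is raised, keep the workflow timeout (200 seconds by default) below it so that one run finishes or is closed before the next one starts.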

Element properties:

Target: Microsoft.SQLServer.Linux.Agent
Category: EventCollection
Enabled: False
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Remotable: True
Alert Message:
MSSQL on Linux: A SQL job failed to complete successfully
{2}
Comment: Mom2017ID='{8CCE3391-B79E-4182-922E-BB540ED8396E}';MOM2017GroupID={467ECC75-C5DA-42BD-955C-A73BBB51AF74}

Member Modules:

ID | Module Type | TypeId | RunAs
_F6DA1507_12AF_11D3_AB21_00A0C98620CE_ | DataSource | Microsoft.SQLServer.Linux.DataSource.EventCollectionFilteredAgent | Default
GenerateAlert | WriteAction | System.Health.GenerateAlert | Default

Source Code:

<Rule ID="Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule" Target="SqlDiscL!Microsoft.SQLServer.Linux.Agent" Enabled="false" ConfirmDelivery="true" Remotable="true" Comment="Mom2017ID='{8CCE3391-B79E-4182-922E-BB540ED8396E}';MOM2017GroupID={467ECC75-C5DA-42BD-955C-A73BBB51AF74}">
  <Category>EventCollection</Category>
  <DataSources>
    <DataSource ID="_F6DA1507_12AF_11D3_AB21_00A0C98620CE_" Comment="{F6DA1507-12AF-11D3-AB21-00A0C98620CE}" TypeID="Microsoft.SQLServer.Linux.DataSource.EventCollectionFilteredAgent">
      <MachineName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</MachineName>
      <NetbiosComputerName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/NetbiosComputerName$</NetbiosComputerName>
      <InstanceName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</InstanceName>
      <ConnectionString>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/ConnectionString$</ConnectionString>
      <InstanceVersion>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Version$</InstanceVersion>
      <InstanceEdition>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Edition$</InstanceEdition>
      <SqlExecTimeoutSeconds>60</SqlExecTimeoutSeconds>
      <SqlTimeoutSeconds>15</SqlTimeoutSeconds>
      <TimeoutSeconds>200</TimeoutSeconds>
      <IntervalSeconds>300</IntervalSeconds>
      <SyncTime/>
      <EventDisplayNumber>208</EventDisplayNumber>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="GenerateAlert" TypeID="Health!System.Health.GenerateAlert">
      <Priority>1</Priority>
      <Severity>2</Severity>
      <AlertMessageId>$MPElement[Name="Microsoft.SQLServer.Linux.CollectionRule.Agent.A_SQL_job_failed_to_complete_successfully_1_5_Rule.AlertMessage"]$</AlertMessageId>
      <AlertParameters>
        <AlertParameter1>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</AlertParameter1>
        <AlertParameter2>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</AlertParameter2>
        <AlertParameter3>Event ID: $Data/Property[@Name='EventID']$. $Data/Property[@Name='Message']$</AlertParameter3>
      </AlertParameters>
      <Suppression>
        <SuppressionValue>$Data/Params/Param[1]$</SuppressionValue>
      </Suppression>
    </WriteAction>
  </WriteActions>
</Rule>