Subscriber Agent is Retrying

Microsoft.SQLServer.Replication.Windows.Monitor.SubscriberAgentIsRetryingMonitor (UnitMonitor)

Monitors whether the Subscriber replication agents (Distribution, Log Reader, Merge, Queue Reader, and Snapshot) are retrying. The SQL Server Agent Windows service is not supported by any edition of SQL Server Express, so this monitor does not apply to SQL Server Express instances.

Knowledge Base article:

Summary

This monitor checks the Subscriber replication agents (Distribution, Log Reader, Merge, Queue Reader, and Snapshot) and counts the consecutive failures of each agent job. If the count exceeds the threshold, the monitor raises an alert that lists the failing jobs. The SQL Server Agent Windows service is not supported by any edition of SQL Server Express, so this monitor does not apply to SQL Server Express instances.
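The two-level threshold check described in the summary can be sketched in Python. This is only an illustration, not the management pack's implementation: the function names, the run-history format (a list of booleans, oldest first), and the use of `>=` for both comparisons are assumptions. The defaults of 3 and 1 and the state names `Health` and `Warning` come from the monitor configuration.

```python
from typing import Dict, List


def consecutive_failures(outcomes: List[bool]) -> int:
    """Count failures at the end of a run history (True = success, oldest first)."""
    count = 0
    for succeeded in reversed(outcomes):
        if succeeded:
            break
        count += 1
    return count


def failing_jobs(history: Dict[str, List[bool]], per_job_threshold: int = 3) -> List[str]:
    """Return the jobs whose trailing failure streak reaches the per-job threshold.

    Assumption: the comparison is >=; the real workflow may use a strict >.
    """
    return sorted(
        job for job, runs in history.items()
        if consecutive_failures(runs) >= per_job_threshold
    )


def monitor_state(history: Dict[str, List[bool]],
                  per_job_threshold: int = 3,
                  failed_jobs_threshold: int = 1) -> str:
    """Map the number of failing jobs to the monitor's two operational states."""
    failed = failing_jobs(history, per_job_threshold)
    return "Warning" if len(failed) >= failed_jobs_threshold else "Health"


# Hypothetical job names and run histories, for illustration only.
sample_history = {
    "distribution-job": [True, False, False, False],
    "merge-job": [False, True],
    "snapshot-job": [False, False, True, False],
}
state = monitor_state(sample_history)
```

With this sample data only `distribution-job` has three failures in a row, so raising the Per-Job Threshold to 4 would return the sketch to the `Health` state.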

Causes

The failure can have many different causes.

Resolutions

Open Replication Monitor, or review the Agent history tables or the Agent job history, for error messages. Then investigate and address the errors accordingly.

External

Enable verbose Agent logging and run the Agent again to obtain detailed information:

http://support.microsoft.com/kb/312292/

Overrideable Parameters

Name | Description | Default Value
Alert Priority | Defines Alert Priority. | Normal
Alert Severity | Defines Alert Severity. | Warning
Enabled | Enables or disables the workflow. | Yes
Failed jobs count threshold | Failed jobs count threshold. | 1
Generates Alerts | Defines whether the workflow generates an Alert. | Yes
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 300
Per-Job Threshold | Count of fails per-job threshold. | 3
Synchronization Time | The synchronization time specified by using a 24-hour format. May be omitted. | (empty)
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 200
Timeout for database connection (seconds) | The workflow will fail and register an event if it cannot access the database during the specified period. | 15

Element properties:

Target: Microsoft.SQLServer.Replication.Windows.Subscriber
Parent Monitor: System.Health.PerformanceState
Category: PerformanceHealth
Enabled: True
Alert Generate: True
Alert Severity: Warning
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: Microsoft.SQLServer.Replication.Windows.MonitorType.DistributorFailJobs
Remotable: True
Accessibility: Public
Alert Message:
MSSQL on Windows Replication: Subscriber Agent is retrying.
The Subscriber (Name: '{0}', Server: '{1}') has detected {2} failed job(s). This can be caused by the Subscriber Agent retrying.
{3}
RunAs: Microsoft.SQLServer.Core.RunAs.Monitoring
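
The placeholders `{0}` through `{3}` in the alert message are filled, in order, from the four alert parameters in the monitor's source: the DB Engine `InstanceName`, its `ConnectionString`, the `DistributorFailJobs` property, and the `Message` property. The sketch below imitates that substitution with Python's `str.format`, which uses the same positional syntax. The function name and all sample values are hypothetical.

```python
# Alert description text, copied from the monitor's alert message.
ALERT_TEMPLATE = (
    "The Subscriber (Name: '{0}', Server: '{1}') has detected {2} failed job(s). "
    "This can be caused by the Subscriber Agent retrying.\n"
    "{3}"
)


def render_alert(instance_name: str, connection_string: str,
                 failed_job_count: int, message: str) -> str:
    """Fill the positional placeholders in the order of AlertParameter1..4."""
    return ALERT_TEMPLATE.format(
        instance_name, connection_string, failed_job_count, message
    )


# Hypothetical values; a real alert takes them from the monitored instance.
alert_text = render_alert(
    "MSSQLSERVER",
    "SQL01",
    2,
    "distribution-job\nmerge-job",
)
```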

Source Code:

<UnitMonitor ID="Microsoft.SQLServer.Replication.Windows.Monitor.SubscriberAgentIsRetryingMonitor"
             Accessibility="Public"
             Enabled="true"
             Target="SQLReplWD!Microsoft.SQLServer.Replication.Windows.Subscriber"
             ParentMonitorID="Health!System.Health.PerformanceState"
             Remotable="true"
             Priority="Normal"
             TypeID="Microsoft.SQLServer.Replication.Windows.MonitorType.DistributorFailJobs"
             ConfirmDelivery="false"
             RunAs="SqlCoreLib!Microsoft.SQLServer.Core.RunAs.Monitoring">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SQLServer.Replication.Windows.Monitor.SubscriberAgentIsRetrying.AlertMessage">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Warning</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/InstanceName$</AlertParameter1>
      <AlertParameter2>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/ConnectionString$</AlertParameter2>
      <AlertParameter3>$Data/Context/Property[@Name='DistributorFailJobs']$</AlertParameter3>
      <AlertParameter4>$Data/Context/Property[@Name='Message']$</AlertParameter4>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="Health" MonitorTypeStateID="Health" HealthState="Success"/>
    <OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <MachineName>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/MachineName$</MachineName>
    <NetbiosComputerName>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/NetbiosComputerName$</NetbiosComputerName>
    <InstanceName>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/InstanceName$</InstanceName>
    <SqlTimeoutSeconds>15</SqlTimeoutSeconds>
    <ConnectionString>$Target/Host/Property[Type='SqlCoreLib!Microsoft.SQLServer.Core.DBEngine']/ConnectionString$</ConnectionString>
    <MonitoringType>$Target/Host/Property[Type="SqlDiscW!Microsoft.SQLServer.Windows.DBEngine"]/MonitoringType$</MonitoringType>
    <ThresholdCountOfFailsForJob>3</ThresholdCountOfFailsForJob>
    <ThresholdCountOfFailedJobs>1</ThresholdCountOfFailedJobs>
    <CategoryList>Distribution, LogReader, Merge, QueueReader, Snapshot</CategoryList>
    <ExcludeCategoryList/>
    <IntervalSeconds>300</IntervalSeconds>
    <SyncTime/>
    <TimeoutSeconds>200</TimeoutSeconds>
  </Configuration>
</UnitMonitor>
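
For auditing the effective defaults, the static values in the Configuration element above can be read with Python's standard `xml.etree.ElementTree` module. This is a rough sketch: `CONFIG_XML` is a trimmed copy of that element holding only the literal settings, and the helper names and result keys are hypothetical. Treating `CategoryList` as a comma-separated list is an assumption based on how the value is written.

```python
import xml.etree.ElementTree as ET
from typing import Any, Dict, List, Optional

# Trimmed copy of the monitor's <Configuration> element (static values only).
CONFIG_XML = """
<Configuration>
  <SqlTimeoutSeconds>15</SqlTimeoutSeconds>
  <ThresholdCountOfFailsForJob>3</ThresholdCountOfFailsForJob>
  <ThresholdCountOfFailedJobs>1</ThresholdCountOfFailedJobs>
  <CategoryList>Distribution, LogReader, Merge, QueueReader, Snapshot</CategoryList>
  <ExcludeCategoryList/>
  <IntervalSeconds>300</IntervalSeconds>
  <SyncTime/>
  <TimeoutSeconds>200</TimeoutSeconds>
</Configuration>
"""


def split_list(text: Optional[str]) -> List[str]:
    """Split a comma-separated value into trimmed, non-empty items."""
    return [item.strip() for item in (text or "").split(",") if item.strip()]


def read_settings(xml_text: str) -> Dict[str, Any]:
    """Extract the thresholds, timing values, and category lists."""
    root = ET.fromstring(xml_text)
    return {
        "categories": split_list(root.findtext("CategoryList")),
        "excluded": split_list(root.findtext("ExcludeCategoryList")),
        "per_job_threshold": int(root.findtext("ThresholdCountOfFailsForJob")),
        "failed_jobs_threshold": int(root.findtext("ThresholdCountOfFailedJobs")),
        "interval_seconds": int(root.findtext("IntervalSeconds")),
        "timeout_seconds": int(root.findtext("TimeoutSeconds")),
        "sql_timeout_seconds": int(root.findtext("SqlTimeoutSeconds")),
        # An empty <SyncTime/> element means no synchronization time is set.
        "sync_time": root.findtext("SyncTime") or None,
    }


settings = read_settings(CONFIG_XML)
```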