The Subscriber Agent (Distribution, Log Reader, Merge, Queue Reader and Snapshot) is Retrying Monitor. Note that SQL Server Agent Windows Service is not supported by any edition of SQL Server Express. Therefore, this monitor is not applicable for SQL Server Express cases.
This monitor checks the Subscriber Agent (Distribution, Merge, Queue Reader and Snapshot) and counts consecutive failures of the agents. If the count exceeds the threshold, it creates an alert with a list of failing jobs. Note that SQL Server Agent Windows Service is not supported by any edition of SQL Server Express. Therefore, this monitor is not applicable for SQL Server Express cases.
The failure could be due to many reasons:
SQL agent failure.
Agent configuration issues, such as incorrect parameter values.
Network issue to prevent/slow down agent to access the Subscriber/Distributor.
Agent query timeout.
Open Replication Monitor or look at the agent history table/agent job history for any error message and investigate/address the errors accordingly.
Enable verbose agent logging and run the agent again to obtain detailed information
http://support.microsoft.com/kb/312292/
Name | Description | Default Value |
Alert Priority | Defines Alert Priority. | Normal |
Alert Severity | Defines Alert Severity. | Warning |
Enabled | Enables or disables the workflow. | Yes |
Failed jobs count threshold | Failed jobs count threshold | 1 |
Generates Alerts | Defines whether the workflow generates an Alert. | Yes |
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 300 |
Per-Job threshold | Per-Job threshold | 3 |
Synchronization Time | Synchronization Time |
|
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 300 |
Timeout for database connection (seconds) | The workflow will fail and register an event, if it cannot access the database during the specified period. | 15 |
Target | Microsoft.SQLServer.2008.Replication.Subscriber | ||
Parent Monitor | System.Health.PerformanceState | ||
Category | PerformanceHealth | ||
Enabled | True | ||
Alert Generate | True | ||
Alert Severity | Warning | ||
Alert Priority | Normal | ||
Alert Auto Resolve | True | ||
Monitor Type | Microsoft.SQLServer.2008.Replication.MonitorType.DistributorFailJobs | ||
Remotable | True | ||
Accessibility | Public | ||
Alert Message |
| ||
RunAs | Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor |
<UnitMonitor ID="Microsoft.SQLServer.2008.Replication.Monitor.SubscriberAgentIsRetryingMonitor" Accessibility="Public" Enabled="true" Target="MS2RD!Microsoft.SQLServer.2008.Replication.Subscriber" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.2008.Replication.MonitorType.DistributorFailJobs" ConfirmDelivery="false" RunAs="MSRL!Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor">
<Category>PerformanceHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServer.2008.Replication.Monitor.SubscriberAgentIsRetrying.AlertMessage">
<AlertOnState>Warning</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>Warning</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/InstanceName$</AlertParameter1>
<AlertParameter2>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/ConnectionString$</AlertParameter2>
<AlertParameter3>$Data/Context/Property[@Name='DistributorFailJobs']$</AlertParameter3>
<AlertParameter4>$Data/Context/Property[@Name='Message']$</AlertParameter4>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Health" MonitorTypeStateID="Health" HealthState="Success"/>
<OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
</OperationalStates>
<Configuration>
<SqlTimeout>15</SqlTimeout>
<ConnectionString>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/ConnectionString$</ConnectionString>
<ThresholdCountOfFailsForJob>3</ThresholdCountOfFailsForJob>
<ThresholdCountOfFailedJobs>1</ThresholdCountOfFailedJobs>
<CategoryList>Distribution, LogReader, Merge, QueueReader, Snapshot</CategoryList>
<ExcludeCategoryList/>
<IntervalSeconds>300</IntervalSeconds>
<SyncTime/>
<TimeoutSeconds>300</TimeoutSeconds>
</Configuration>
</UnitMonitor>