Replication Agents failed on the Distributor.

Microsoft.SQLServer.2008.Replication.Monitor.ReplicationAgentFailJobs (UnitMonitor)

This monitor checks if the following Replication agent jobs are in a healthy state: Distribution agent, Merge agent, QueueReader agent, Log reader agent or Snapshot agent. If any of the agents are in a failed state, the monitor will be triggered.

Knowledge Base article:

Summary

This monitor checks the healthy state of the jobs for the follow replication agents: Distribution Agent, Merge Agent, QueueReader Agent, Log reader Agent and Snapshot Agent. If any of the agents’ jobs are in a failed state, the monitor will be triggered.

Causes

The Replication agent jobs can fail due to many reasons:

Resolutions

To resolve the issue try the following:

External

How to enable replication agents for logging to output files in SQL Server:

http://support.microsoft.com/kb/312292

Overrideable Parameters

Name

Description

Default Value

Enabled

Enables or disables the workflow

Yes

Generates Alerts

Defines whether the workflow generates an Alert

Yes

Interval (seconds)

The recurring interval of time in seconds in which to run the workflow.

300

Timeout (seconds)

Timeout (seconds)

300

Synchronization Time

Synchronization Time

 

Failed jobs count threshold

Failed jobs count threshold

1

Per-Job threshold

Per-Job threshold

1

Element properties:

TargetMicrosoft.SQLServer.2008.Replication.Distributor
Parent MonitorSystem.Health.PerformanceState
CategoryPerformanceHealth
EnabledTrue
Alert GenerateTrue
Alert SeverityWarning
Alert PriorityNormal
Alert Auto ResolveTrue
Monitor TypeMicrosoft.SQLServer.2008.Replication.MonitorType.DistributorFailJobs
RemotableTrue
AccessibilityPublic
Alert Message
MSSQL2008 Replication: Failed jobs detected on the Distributor.
The Distributor (Name: '{0}', Server: '{1}') has detected {2} failed job(s).
{3}
RunAsMicrosoft.SQLServer.Replication.Monitoring.RunAs.Monitor

Source Code:

<UnitMonitor ID="Microsoft.SQLServer.2008.Replication.Monitor.ReplicationAgentFailJobs" Accessibility="Public" Enabled="true" Target="MS2RD!Microsoft.SQLServer.2008.Replication.Distributor" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.2008.Replication.MonitorType.DistributorFailJobs" ConfirmDelivery="false" RunAs="MSRL!Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor">
<Category>PerformanceHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServer.2008.Replication.Monitor.ReplicationAgentFailJobs.AlertMessage">
<AlertOnState>Warning</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>Warning</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericDistributor']/InstanceName$</AlertParameter1>
<AlertParameter2>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericDistributor']/ConnectionString$</AlertParameter2>
<AlertParameter3>$Data/Context/Property[@Name='DistributorFailJobs']$</AlertParameter3>
<AlertParameter4>$Data/Context/Property[@Name='Message']$</AlertParameter4>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Health" MonitorTypeStateID="Health" HealthState="Success"/>
<OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
</OperationalStates>
<Configuration>
<SqlTimeout>300</SqlTimeout>
<ConnectionString>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericDistributor']/ConnectionString$</ConnectionString>
<ThresholdCountOfFailsForJob>1</ThresholdCountOfFailsForJob>
<ThresholdCountOfFailedJobs>1</ThresholdCountOfFailedJobs>
<CategoryList>Distribution</CategoryList>
<ExcludeCategoryList/>
<IntervalSeconds>300</IntervalSeconds>
<SyncTime/>
</Configuration>
</UnitMonitor>