Load of Replication agents on Subscriber

Microsoft.SQLServer.2012.Replication.Monitor.SubscriberAgentsLoad (UnitMonitor)

Replication agents (Distribution and Merge) load on Subscriber

Knowledge Base article:

Summary

This Monitor calculates total run time of each replication agent (Distribution, Merge) on a Subscriber for the time period defined by the threshold. If any agent’s total run time exceeds the defined threshold, an alert is fired in the following format:

The Subscriber (Name: '{0}', Server: '{1}') has detected {2}% utilization.

Causes

Resolutions

Open the retrying agent job history for any error message, then investigate and address the errors accordingly.

Run sp_who2 when the Job runs and check if it is blocked by some other session. If it is blocked, resolve the blocking or wait for the other session to complete.

Ensure that the network latency between Distributor and Subscriber (distribution agent) is not too high.

Identify and investigate any slow running queries.

Ensure statistics are up to date by running sp_updatestats against the subscribing databases. ( http://msdn.microsoft.com/library/ms173804.aspx)

Ensure indexes on the subscribing database are not heavily fragmented.

Overrideable Parameters

Name

Description

Default Value

Alert Priority

Defines Alert Priority.

Normal

Alert Severity

Defines Alert Severity.

Error

Enabled

Enables or disables the workflow.

Yes

Error Threshold

Error Threshold

4

Generates Alerts

Defines whether the workflow generates an Alert.

Yes

Interval (seconds)

The recurring interval of time in seconds in which to run the workflow.

300

Measurement Period (hours)

Period of time used for measurement (hours)

24

Synchronization Time

Synchronization Time

 

Timeout (seconds)

Specifies the time the workflow is allowed to run before being closed and marked as failed.

300

Timeout for database connection (seconds)

The workflow will fail and register an event, if it cannot access the database during the specified period.

15

Warning Threshold

Warning Threshold

3

Element properties:

TargetMicrosoft.SQLServer.2012.Replication.Subscriber
Parent MonitorSystem.Health.PerformanceState
CategoryPerformanceHealth
EnabledTrue
Alert GenerateTrue
Alert SeverityError
Alert PriorityNormal
Alert Auto ResolveTrue
Monitor TypeMicrosoft.SQLServer.2012.Replication.MonitorType.DistributorAgentsLoad
RemotableTrue
AccessibilityPublic
Alert Message
MSSQL2012 Replication: The replication agents on the Subscriber are running under heavy load.
The Subscriber (Name: '{0}', Server: '{1}') has experienced a heavy load for {2} hours during the past 24 hours.
{3}
RunAsMicrosoft.SQLServer.Replication.Monitoring.RunAs.Monitor

Source Code:

<UnitMonitor ID="Microsoft.SQLServer.2012.Replication.Monitor.SubscriberAgentsLoad" Accessibility="Public" Enabled="true" Target="MS2RD!Microsoft.SQLServer.2012.Replication.Subscriber" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.2012.Replication.MonitorType.DistributorAgentsLoad" ConfirmDelivery="false" RunAs="MSRL!Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor">
<Category>PerformanceHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServer.2012.Replication.Monitor.SubscriberAgentsLoad.AlertMessage">
<AlertOnState>Error</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>Error</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/InstanceName$</AlertParameter1>
<AlertParameter2>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/ConnectionString$</AlertParameter2>
<AlertParameter3>$Data/Context/Property[@Name='LoadHours']$</AlertParameter3>
<AlertParameter4>$Data/Context/Property[@Name='Message']$</AlertParameter4>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Health" MonitorTypeStateID="Health" HealthState="Success"/>
<OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
<OperationalState ID="Error" MonitorTypeStateID="Error" HealthState="Error"/>
</OperationalStates>
<Configuration>
<SqlTimeout>15</SqlTimeout>
<ConnectionString>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericSubscriber']/ConnectionString$</ConnectionString>
<WarningThreshold>3</WarningThreshold>
<ErrorThreshold>4</ErrorThreshold>
<LastHours>24</LastHours>
<CategoryList>Distribution, Merge</CategoryList>
<ExcludeCategoryList/>
<IntervalSeconds>300</IntervalSeconds>
<SyncTime/>
<TimeoutSeconds>300</TimeoutSeconds>
</Configuration>
</UnitMonitor>