This monitor checks the state of the Snapshot Agent services for all Publications on the Distributor.
The Replication Snapshot Agent is an executable that prepares snapshot files containing schema and data of published tables and database objects, stores the files in the snapshot folder, and records synchronization jobs in the distribution database.
A snapshot agent can fail due to of any of the following reasons:
No free space available in the snapshot folder.
Missing write permissions on the snapshot folder.
Inability to connect to the Publisher database.
This can be caused by the Publisher Agent’s account not being a member of db_owner fixed database role for the Distributor database.
The agent is unable to connect to the distribution server.
Query timeouts.
To resolve the agent issues, ensure the following:
Ensure the snapshot agent account has read/write permissions on the snapshot folder
Ensure there is enough free space available on the media for the snapshot folder
Ensure the Publisher Agent’s account is a member of db_owner fixed database role
Ensure the snapshot agent is not being blocked by any other process (SPID). Use Activity Monitor or sp_who2 stored procedure, sys.dm_exec_requests DMV and sys.dm_os_waiting_tasks.
The mssnapshot_agents and mssnapshot_history tables can be queried on the distribution database to get more information on the agent failure. Msrepl_errors also logs information about any agent failure messages.
http://technet.microsoft.com/library/ms146939.aspx
Name | Description | Default Value |
Alert Priority | Defines Alert Priority. | Normal |
Alert Severity | Defines Alert Severity. | Error |
Enabled | Enables or disables the workflow. | Yes |
Estimated Job Duration | Threshold which is being used to check the job schedule compliance | 15 |
Generates Alerts | Defines whether the workflow generates an Alert. | Yes |
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 300 |
Show Jobs with Unknown State | Include jobs with unknown state to the monitor output and alert context. Will affect health. | No |
Synchronization Time | Synchronization Time |
|
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 300 |
Timeout for database connection (seconds) | The workflow will fail and register an event, if it cannot access the database during the specified period. | 15 |
Target | Microsoft.SQLServer.2016.Replication.Distributor | ||
Parent Monitor | System.Health.PerformanceState | ||
Category | PerformanceHealth | ||
Enabled | True | ||
Alert Generate | True | ||
Alert Severity | Error | ||
Alert Priority | Normal | ||
Alert Auto Resolve | True | ||
Monitor Type | Microsoft.SQLServer.2016.Replication.MonitorType.ReplicationAgentState | ||
Remotable | True | ||
Accessibility | Public | ||
Alert Message |
| ||
RunAs | Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor |
<UnitMonitor ID="Microsoft.SQLServer.2016.Replication.Monitor.SnapshotAgentState" Accessibility="Public" Enabled="true" Target="MS2RD!Microsoft.SQLServer.2016.Replication.Distributor" ParentMonitorID="Health!System.Health.PerformanceState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.2016.Replication.MonitorType.ReplicationAgentState" ConfirmDelivery="false" RunAs="MSRL!Microsoft.SQLServer.Replication.Monitoring.RunAs.Monitor">
<Category>PerformanceHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServer.2016.Replication.Monitor.SnapshotAgentState.AlertMessage">
<AlertOnState>Error</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>Error</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Data/Context/Property[@Name='Message']$</AlertParameter1>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Health" MonitorTypeStateID="Health" HealthState="Success"/>
<OperationalState ID="Error" MonitorTypeStateID="Error" HealthState="Error"/>
<OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
</OperationalStates>
<Configuration>
<SqlTimeout>15</SqlTimeout>
<ConnectionString>$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericDistributor']/ConnectionString$</ConnectionString>
<XPathName>Snapshot$Target/Property[Type='MSRL!Microsoft.SQLServer.Replication.Library.GenericDistributor']/ConnectionString$</XPathName>
<ShowJobsWithUnknownState>false</ShowJobsWithUnknownState>
<EstimatedJobDuration>15</EstimatedJobDuration>
<IntervalSeconds>300</IntervalSeconds>
<SyncTime/>
<TimeoutSeconds>300</TimeoutSeconds>
</Configuration>
</UnitMonitor>