MSSQL on Windows: Availability Replica Role Changed

Microsoft.SQLServer.Windows.EventRule.AvailabilityReplica.RoleChanged (Rule)

This event occurs when an availability replica changes its role.

Knowledge Base article:

Summary

This event occurs when an availability replica changes its role.

Logging of this event is disabled in SQL Server by default. To enable it, run the following TSQL: EXEC sp_altermessage 19406, 'WITH_LOG', 'true';

Causes

The replica state changed because of a startup, failover, communication issue, or a cluster error. See the event for additional information.

Resolutions

If the “changed to” state is PRIMARY_PENDING, check sys.dm_hadr_database_replica_states. If database_state_desc is RECOVERY_PENDING (synchronization_health_desc will be NOT_HEALTHY), try “ALTER DATABASE db SET HADR RESUME;”. Otherwise, if this is the only replica (there is no secondary replica), try “ALTER DATABASE db SET HADR OFF;” to remove the database from Always On, so that it can then be recovered manually through the SQL Server service or brought back online (ALTER DATABASE db SET ONLINE). Consider taking a database snapshot as a backup first, if required.

TSQL: ALTER DATABASE DbName SET HADR RESUME;

TSQL: ALTER DATABASE DbName SET ONLINE;

TSQL: RESTORE DATABASE DbName WITH RECOVERY;

If the “changed to” state is RESOLVING_NORMAL, check for additional messages.

If the “changed to” state is PRIMARY_NORMAL or SECONDARY_NORMAL, this may indicate a successful failover. Check for additional messages if the failover was not expected.
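
The resolution steps above amount to a small decision table keyed on the “changed to” state. The following Python sketch restates them for illustration only. The function name suggest_action and its parameters are hypothetical and are not part of the management pack; the returned strings are the statements and hints quoted in this article, with DbName as a placeholder.

```python
from typing import Optional


def suggest_action(changed_to: str,
                   database_state_desc: Optional[str] = None,
                   has_secondary: bool = True) -> str:
    """Map the 'changed to' state from the event to the hint given in this article.

    database_state_desc is assumed to come from
    sys.dm_hadr_database_replica_states; has_secondary is an assumed flag
    telling whether any secondary replica exists.
    """
    state = changed_to.strip().upper()
    if state == "PRIMARY_PENDING":
        if database_state_desc == "RECOVERY_PENDING":
            # Data movement is suspended; try resuming it.
            return "ALTER DATABASE DbName SET HADR RESUME;"
        if not has_secondary:
            # Only replica: remove the database from Always On, then recover it.
            return "ALTER DATABASE DbName SET HADR OFF;"
        return "Check sys.dm_hadr_database_replica_states"
    if state == "RESOLVING_NORMAL":
        return "Check for additional messages"
    if state in ("PRIMARY_NORMAL", "SECONDARY_NORMAL"):
        return "Possible successful failover; check for additional messages if unexpected"
    # Any other state: fall back to the general guidance under Causes.
    return "See the event for additional information"
```

The sketch only returns text. It does not connect to SQL Server, and any statement it suggests should be reviewed before being run against a production database.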

Overridable Parameters

Name | Description | Default Value
Enabled | Enables or disables the workflow. | Yes
Interval (seconds) | The recurring interval of time in seconds in which to run the workflow. | 300
Priority | Defines Alert Priority. | 1
Severity | Defines Alert Severity. | 1
Synchronization Time | The synchronization time specified by using a 24-hour format. May be omitted. | (none)
Timeout (seconds) | Specifies the time the workflow is allowed to run before being closed and marked as failed. | 200
Timeout for query execution (seconds) | The workflow will fail and register an event if the query execution takes longer than the specified period. | 60
Timeout for database connection (seconds) | The workflow will fail and register an event if it cannot access the database during the specified period. | 15

Element properties:

Target: Microsoft.SQLServer.Windows.AvailabilityReplica
Category: EventCollection
Enabled: True
Alert Generate: True
Alert Severity: Warning
Alert Priority: Normal
Remotable: True
Alert Message:
MSSQL on Windows: Availability Replica role changed
{2}

Member Modules:

ID | Module Type | TypeId | RunAs
_F6DA1507_12AF_11D3_AB21_00A0C98620CE_ | DataSource | Microsoft.SQLServer.Windows.DataSource.EventReaderSingleParam | Default
GenerateAlert | WriteAction | System.Health.GenerateAlert | Default

Source Code:

<Rule ID="Microsoft.SQLServer.Windows.EventRule.AvailabilityReplica.RoleChanged" Target="SqlDiscW!Microsoft.SQLServer.Windows.AvailabilityReplica" Enabled="true" ConfirmDelivery="true" Remotable="true">
  <Category>EventCollection</Category>
  <DataSources>
    <DataSource ID="_F6DA1507_12AF_11D3_AB21_00A0C98620CE_" Comment="{F6DA1507-12AF-11D3-AB21-00A0C98620CE}" TypeID="Microsoft.SQLServer.Windows.DataSource.EventReaderSingleParam">
      <MachineName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</MachineName>
      <NetbiosComputerName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/NetbiosComputerName$</NetbiosComputerName>
      <InstanceName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</InstanceName>
      <ConnectionString>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/ConnectionString$</ConnectionString>
      <InstanceVersion>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Version$</InstanceVersion>
      <InstanceEdition>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Edition$</InstanceEdition>
      <MonitoringType>$Target/Host/Property[Type="SqlDiscW!Microsoft.SQLServer.Windows.DBEngine"]/MonitoringType$</MonitoringType>
      <FilterMsg>The state of the local availability replica</FilterMsg>
      <ParamRegex>^The state of the local availability replica in (availability group '.+') has changed from '[^\s']+' to '[^\s']+'\.(?:[^']+)$</ParamRegex>
      <TargetKey>availability group '$Target/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.AvailabilityReplica"]/AvailabilityGroupName$'</TargetKey>
      <SqlExecTimeoutSeconds>60</SqlExecTimeoutSeconds>
      <SqlTimeoutSeconds>15</SqlTimeoutSeconds>
      <TimeoutSeconds>200</TimeoutSeconds>
      <IntervalSeconds>300</IntervalSeconds>
      <SyncTime/>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="GenerateAlert" TypeID="Health!System.Health.GenerateAlert">
      <Priority>1</Priority>
      <Severity>1</Severity>
      <AlertMessageId>$MPElement[Name="Microsoft.SQLServer.Windows.EventRule.AvailabilityReplica.RoleChanged.AlertMessage"]$</AlertMessageId>
      <AlertParameters>
        <AlertParameter1>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</AlertParameter1>
        <AlertParameter2>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</AlertParameter2>
        <AlertParameter3>Event ID: 19406. $Data/Property[@Name='Message']$</AlertParameter3>
      </AlertParameters>
      <Suppression>
        <SuppressionValue/>
      </Suppression>
    </WriteAction>
  </WriteActions>
</Rule>
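
The ParamRegex and TargetKey values in the rule above can be tried outside Operations Manager. The Python sketch below applies the same pattern to a sample message and compares the captured group with a key built the way TargetKey is. It assumes that the data source matches the single captured parameter against TargetKey; the internals of EventReaderSingleParam are not shown in this article, so treat that as an assumption. The function name matches_target, the group name AG1, and the sample message text are illustrative only.

```python
import re

# Same pattern as <ParamRegex> in the rule, split across two string literals.
PARAM_REGEX = re.compile(
    r"^The state of the local availability replica in (availability group '.+') "
    r"has changed from '[^\s']+' to '[^\s']+'\.(?:[^']+)$"
)


def matches_target(message: str, availability_group_name: str) -> bool:
    """Return True if the message matches and names the given availability group.

    Assumption: the captured group is compared with the <TargetKey> value,
    which is "availability group '<AvailabilityGroupName>'".
    """
    match = PARAM_REGEX.match(message)
    if match is None:
        return False
    target_key = f"availability group '{availability_group_name}'"
    return match.group(1) == target_key


# Illustrative message shaped like event 19406; the exact wording may differ.
sample = (
    "The state of the local availability replica in availability group 'AG1' "
    "has changed from 'RESOLVING_NORMAL' to 'PRIMARY_PENDING'. "
    "The state changed because the availability group is coming online."
)
```

With this sample, matches_target(sample, "AG1") is true and matches_target(sample, "AG2") is false. Note that the pattern's trailing part, (?:[^']+)$, does not match if the text after the period contains an apostrophe.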