Database Status

Microsoft.SQLServer.Windows.Monitor.Database.DBStatus (UnitMonitor)

This monitor checks the status of the database as reported by Microsoft SQL Server.

Knowledge Base article:

Summary

This monitor checks the database status as reported by Microsoft SQL Server. Status check is done by running a query against the master database of the SQL instance that returns the database state. If you receive an alert from this monitor, an action is required in order to bring the database back to an operational state.

The monitor has the `Disable if Availability Group is offline` override. When this override is set to true and the Availability Group that hosts the database is unavailable, the monitor stops tracking the state of such a database.

Causes

An unhealthy monitor state indicates that the SQL Server database is unavailable. All database states except the ONLINE one will result in an unhealthy monitor state. The possible database states are as follows:

SQL Database State

Description

Health State

ONLINE

Database is available for access. The primary Filegroup is online, although the undo phase of recovery may not have been completed.

Healthy

OFFLINE

Database is unavailable. A database becomes offline by explicit user action and remains offline until additional user action is taken. For example, the database may be taken offline in order to move a file to a new disk. The database is then brought back online after the move has been completed.

Critical

RECOVERY PENDING

SQL Server has encountered a resource-related error during recovery. The database is not damaged, but files may be missing or system resource limitations may be preventing it from starting. The database is unavailable. Additional action by the user is required to resolve the error and let the recovery process be completed.

Critical

SUSPECT

At least the primary Filegroup is suspect and may be damaged. The database cannot be recovered during startup of SQL Server. The database is unavailable. Additional action by the user is required to resolve the problem.

Critical

EMERGENCY

User has changed the database and set the status to EMERGENCY. The database is in single-user mode and may be repaired or restored. The database is marked READ_ONLY, logging is disabled, and access is limited to members of the sysadmin fixed server role. EMERGENCY is primarily used for troubleshooting purposes. For example, a database marked as suspect can be set to the EMERGENCY state. This could permit the system administrator read-only access to the database. Only members of the sysadmin fixed server role can set a database to the EMERGENCY state.

Critical

RESTORING

One or more files of the primary Filegroup are being restored, or one or more secondary files are being restored offline. The database is unavailable.

Warning

RECOVERING

Database is being recovered. The recovering process is a transient state; the database will automatically become online if the recovery succeeds. If the recovery fails, the database will become suspect. The database is unavailable.

Warning

To find the exact state that was reported for the database, examine the context of the state change or alert. Note: By default, this monitor raises an alert only when its state becomes critical.

Resolutions

You can attempt to bring the database back online to resolve the issue.

External

Database State Definitions

Overridable Parameters

Name

Description

Default Value

Alert Priority

Defines Alert Priority.

Normal

Alert Severity

Defines Alert Severity.

Match monitor’s health

Enabled

Enables or disables the workflow.

Yes

Disable if Availability Group is offline

Instructs the monitor to stop tracking the database state if the Availability Group with the database is offline. The override does not affect databases that are not part of Availability Groups. This override is useful as it helps you prevent alert storming that may happen when working with SQL Server 2012 due to the specifics of its architecture. For higher versions of SQL Server, this override is not required.

No

Generates Alerts

Defines whether the workflow generates an Alert.

Yes

Interval (seconds)

The recurring interval of time in seconds in which to run the workflow.

300

Number of samples

Indicates how many times a measured value should breach a threshold before the state is changed.

2

Synchronization Time

The synchronization time specified by using a 24-hour format. May be omitted.

 

Timeout (seconds)

Specifies the time the workflow is allowed to run before being closed and marked as failed.

200

Timeout for query execution (seconds)

The workflow will fail and register an event, if the query execution takes longer than the specified period.

60

Timeout for database connection (seconds)

The workflow will fail and register an event, if it cannot access the database during the specified period.

15

Element properties:

TargetMicrosoft.SQLServer.Windows.Database
Parent MonitorSystem.Health.AvailabilityState
CategoryAvailabilityHealth
EnabledTrue
Alert GenerateTrue
Alert SeverityMatchMonitorHealth
Alert PriorityNormal
Alert Auto ResolveTrue
Monitor TypeMicrosoft.SQLServer.Windows.MonitorType.Database.DBStatus
RemotableTrue
AccessibilityPublic
Alert Message
MSSQL on Windows: Database is in offline/recovery pending/suspect/emergency state
Database "{2}" on SQL Server instance "{1}", computer "{0}" is offline/recovery pending/suspect/emergency.
RunAsDefault

Source Code:

<UnitMonitor ID="Microsoft.SQLServer.Windows.Monitor.Database.DBStatus" Accessibility="Public" Enabled="true" Target="SqlDiscW!Microsoft.SQLServer.Windows.Database" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" TypeID="Microsoft.SQLServer.Windows.MonitorType.Database.DBStatus" ConfirmDelivery="false">
<Category>AvailabilityHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServer.Windows.Monitor.Database.DBStatus.AlertMessage">
<AlertOnState>Error</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>MatchMonitorHealth</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</AlertParameter1>
<AlertParameter2>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</AlertParameter2>
<AlertParameter3>$Target/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.Database"]/DatabaseName$</AlertParameter3>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Good" MonitorTypeStateID="Good" HealthState="Success"/>
<OperationalState ID="Bad" MonitorTypeStateID="Bad" HealthState="Error"/>
<OperationalState ID="NeitherGoodNorBad" MonitorTypeStateID="NeitherGoodNorBad" HealthState="Warning"/>
</OperationalStates>
<Configuration>
<MachineName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/MachineName$</MachineName>
<NetbiosComputerName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/NetbiosComputerName$</NetbiosComputerName>
<InstanceName>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/InstanceName$</InstanceName>
<DatabaseName>$Target/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.Database"]/DatabaseName$</DatabaseName>
<ConnectionString>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/ConnectionString$</ConnectionString>
<InstanceVersion>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Version$</InstanceVersion>
<InstanceEdition>$Target/Host/Property[Type="SqlCoreLib!Microsoft.SQLServer.Core.DBEngine"]/Edition$</InstanceEdition>
<MonitoringType>$Target/Host/Property[Type="SqlDiscW!Microsoft.SQLServer.Windows.DBEngine"]/MonitoringType$</MonitoringType>
<DisableIfAgOffline>false</DisableIfAgOffline>
<NumSamples>2</NumSamples>
<SqlExecTimeoutSeconds>60</SqlExecTimeoutSeconds>
<SqlTimeoutSeconds>15</SqlTimeoutSeconds>
<TimeoutSeconds>200</TimeoutSeconds>
<IntervalSeconds>300</IntervalSeconds>
<SyncTime/>
</Configuration>
</UnitMonitor>