Critical State

Microsoft.SQLServerAppliance.PDW.Cooling.DeviceStateMonitor.Critical (UnitMonitor)

This monitor detects if Cooling Device is in Critical state.

Knowledge Base article:

Summary

The cooling device is reporting a critical condition. The server vendor is warning this is dangerous and could permanently damage the system.

Treat this as a critical situation and address this situation immediately. If this situation is not addressed:

Causes

For more details, view the component's device_status property in the PDW Admin Console or query the sys.dm_pdw_component_health_status DMV.

Resolutions

Contact Microsoft Support and provide them with the alert name and details.

To diagnose this issue,

1) Look for other alerts related to cooling devices. A possible cause is a data center air conditioning or other cooling device failure.

2) Microsoft Support will require Windows administrator access to the PDW Management node in order to access the vendor’s diagnostic tool to verify the server’s cooling devices and temperature sensors.

To resolve this issue, Microsoft Support will contact the vendor’s hardware support team. Depending on the problem, the vendor might require physical access to the appliance to replace a cooling device.

Element properties:

TargetMicrosoft.SQLServerAppliance.PDW.Cooling.Device
Parent MonitorSystem.Health.AvailabilityState
CategoryAvailabilityHealth
EnabledTrue
Alert GenerateTrue
Alert SeverityWarning
Alert PriorityNormal
Alert Auto ResolveTrue
Monitor TypeMicrosoft.SQLServerAppliance.PDW.ComponentTwoStateType
RemotableTrue
AccessibilityPublic
Alert Message
Cooling device status is CRITICAL
Appliance Name: {0}
Node Name: {1}
Component: {2}
Component Details: https://{3}/Topology/NodeDetails/{4}?compId={5}
RunAsDefault

Source Code:

<UnitMonitor ID="Microsoft.SQLServerAppliance.PDW.Cooling.DeviceStateMonitor.Critical" Accessibility="Public" Enabled="true" Target="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Cooling.Device" ParentMonitorID="Health!System.Health.AvailabilityState" TypeID="Microsoft.SQLServerAppliance.PDW.ComponentTwoStateType" Remotable="true" Priority="Normal" ConfirmDelivery="false">
<Category>AvailabilityHealth</Category>
<AlertSettings AlertMessage="Microsoft.SQLServerAppliance.PDW.Cooling.DeviceStateMonitor.Critical.AlertMessage">
<AlertOnState>Warning</AlertOnState>
<AutoResolve>true</AutoResolve>
<AlertPriority>Normal</AlertPriority>
<AlertSeverity>Warning</AlertSeverity>
<AlertParameters>
<AlertParameter1>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/ApplianceID$</AlertParameter1>
<AlertParameter2>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/NodeName$</AlertParameter2>
<AlertParameter3>$Target/Property[Type="System!System.Entity"]/DisplayName$</AlertParameter3>
<AlertParameter4>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/ApplianceNetworkAddress$</AlertParameter4>
<AlertParameter5>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/NodeID$</AlertParameter5>
<AlertParameter6>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/ID$</AlertParameter6>
</AlertParameters>
</AlertSettings>
<OperationalStates>
<OperationalState ID="Good" MonitorTypeStateID="Good" HealthState="Success"/>
<OperationalState ID="Bad" MonitorTypeStateID="Bad" HealthState="Warning"/>
</OperationalStates>
<Configuration>
<IntervalSeconds>900</IntervalSeconds>
<SyncTime/>
<TimeoutSeconds>600</TimeoutSeconds>
<ConnectionString>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/ApplianceID$</ConnectionString>
<NodeName>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/NodeName$</NodeName>
<GroupName>$Target/Property[Type="PDWLibrary!Microsoft.SQLServerAppliance.PDW.Component"]/GroupName$</GroupName>
<ComponentName>$Target/Property[Type="System!System.Entity"]/DisplayName$</ComponentName>
<MonitoredState>Critical</MonitoredState>
</Configuration>
</UnitMonitor>