Critical State

Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgentMonitor.Critical (UnitMonitor)

This monitor detects whether the Ambari Agent component is in a Critical state.

Knowledge Base article:

Summary

The Ambari Agent is reporting a critical failure. This could cause a cluster failover event.

Causes

This occurs when the Ambari Agent component reports a Critical state.

For more details, view the event logs from the node by querying the APS DMV sys.dm_pdw_os_event_logs.
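As a minimal sketch of the DMV lookup mentioned above, the following Python helper only builds the T-SQL text; it does not connect to the appliance. The helper name, the row limit, and the selected column names (pdw_node_id, log_name, event_type, event_message, generate_time) are assumptions for illustration and should be checked against the sys.dm_pdw_os_event_logs documentation for your APS version.

```python
def build_event_log_query(max_rows: int = 100) -> str:
    """Build a T-SQL query that lists recent OS event log entries for one node.

    The node id is left as a '?' placeholder so that the caller can bind it
    with whichever SQL client is in use. Column names are assumptions.
    """
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return (
        f"SELECT TOP ({int(max_rows)}) "
        "pdw_node_id, log_name, event_type, event_message, generate_time "
        "FROM sys.dm_pdw_os_event_logs "
        "WHERE pdw_node_id = ? "
        "ORDER BY generate_time DESC;"
    )


if __name__ == "__main__":
    # Print the query so it can be pasted into a SQL client.
    print(build_event_log_query(50))
```

The node id to bind corresponds to the node named in the alert; how that name maps to a numeric id depends on your appliance and is not covered here.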

Resolutions

To diagnose this issue:

1) Review the health information provided by the HDInsight management pack.

2) Connect to the affected HDInsight node through Remote Desktop with a Windows administrator account, and review the Hadoop log files for exceptions.

To resolve this issue, attempt to bring this Ambari Agent back online. To do this, use the Configuration Manager (dwconfig.exe) on the PDW Management Node to review the state of the Ambari Agent and to restart all services.

If restarting all services does not bring the Ambari Agent back online, contact Microsoft Support and provide the alert name, the alert details, and the information you have gathered. Microsoft Support will help you understand the failure and bring the Ambari Agent and its node back online or into a healthy state.

Element properties:

Target: Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgent
Parent Monitor: System.Health.AvailabilityState
Category: AvailabilityHealth
Enabled: True
Alert Generate: True
Alert Severity: Warning
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: Microsoft.SQLServerAppliance.APS.ComponentTwoStateType
Remotable: True
Accessibility: Public
Alert Message:
Ambari Agent has CRITICAL status
Appliance Name: {0}
Node Name: {1}
Component: {2}
Component Details: https://{3}/Hdi/Health/NodeDetails/{4}?compId={5}
RunAs: Default
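
The alert message placeholders {0} through {5} are filled, in order, from AlertParameter1 through AlertParameter6 in the monitor's source code (ApplianceID, NodeName, DisplayName, ApplianceNetworkAddress, NodeID, ID). The sketch below imitates that substitution with Python's str.format, which happens to use the same positional syntax. It is an illustration only, not how Operations Manager renders alerts internally, and the function name and sample values are hypothetical.

```python
# Template copied from the Alert Message property above.
ALERT_TEMPLATE = (
    "Ambari Agent has CRITICAL status\n"
    "Appliance Name: {0}\n"
    "Node Name: {1}\n"
    "Component: {2}\n"
    "Component Details: https://{3}/Hdi/Health/NodeDetails/{4}?compId={5}"
)


def render_alert(appliance_id, node_name, display_name,
                 network_address, node_id, component_id):
    """Fill the placeholders in AlertParameter1..AlertParameter6 order."""
    return ALERT_TEMPLATE.format(
        appliance_id,     # {0} <- AlertParameter1 (ApplianceID)
        node_name,        # {1} <- AlertParameter2 (NodeName)
        display_name,     # {2} <- AlertParameter3 (DisplayName)
        network_address,  # {3} <- AlertParameter4 (ApplianceNetworkAddress)
        node_id,          # {4} <- AlertParameter5 (NodeID)
        component_id,     # {5} <- AlertParameter6 (ID)
    )


if __name__ == "__main__":
    # All values below are made-up sample data.
    print(render_alert("APS01", "APS01-HDI001", "Ambari Agent",
                       "10.0.0.1", "7", "42"))
```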

Source Code:

<UnitMonitor ID="Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgentMonitor.Critical" Accessibility="Public" Enabled="true" Target="APSLibrary!Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgent" ParentMonitorID="Health!System.Health.AvailabilityState" TypeID="Microsoft.SQLServerAppliance.APS.ComponentTwoStateType" Remotable="true" Priority="Normal" ConfirmDelivery="false">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgentMonitor.Critical.AlertMessage">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Warning</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/ApplianceID$</AlertParameter1>
      <AlertParameter2>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/NodeName$</AlertParameter2>
      <AlertParameter3>$Target/Property[Type="System!System.Entity"]/DisplayName$</AlertParameter3>
      <AlertParameter4>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/ApplianceNetworkAddress$</AlertParameter4>
      <AlertParameter5>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/NodeID$</AlertParameter5>
      <AlertParameter6>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/ID$</AlertParameter6>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="Good" MonitorTypeStateID="Good" HealthState="Success"/>
    <OperationalState ID="Bad" MonitorTypeStateID="Bad" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>900</IntervalSeconds>
    <SyncTime/>
    <TimeoutSeconds>600</TimeoutSeconds>
    <ConnectionString>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/ApplianceTdsAddress$</ConnectionString>
    <NodeName>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/NodeName$</NodeName>
    <GroupName>$Target/Property[Type="APSLibrary!Microsoft.SQLServerAppliance.APS.Component"]/GroupName$</GroupName>
    <ComponentName>$Target/Property[Type="System!System.Entity"]/DisplayName$</ComponentName>
    <MonitoredState>Critical</MonitoredState>
  </Configuration>
</UnitMonitor>
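
To read the key settings out of a definition like the one above, a short script can parse the XML and summarize the state mapping and timing values. The following is a rough sketch using Python's standard xml.etree.ElementTree on a trimmed copy of the monitor (only the OperationalStates and part of the Configuration are reproduced). The function name and the shape of the returned dictionary are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the monitor definition shown above.
MONITOR_XML = """
<UnitMonitor ID="Microsoft.SQLServerAppliance.APS.HDI.HDInsight.AmbariAgentMonitor.Critical">
  <OperationalStates>
    <OperationalState ID="Good" MonitorTypeStateID="Good" HealthState="Success"/>
    <OperationalState ID="Bad" MonitorTypeStateID="Bad" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>900</IntervalSeconds>
    <TimeoutSeconds>600</TimeoutSeconds>
    <MonitoredState>Critical</MonitoredState>
  </Configuration>
</UnitMonitor>
"""


def summarize_monitor(xml_text):
    """Return the state mapping and timing settings of a unit monitor."""
    root = ET.fromstring(xml_text.strip())
    # Map each monitor-type state to the health state it produces.
    states = {
        state.get("MonitorTypeStateID"): state.get("HealthState")
        for state in root.iter("OperationalState")
    }
    config = root.find("Configuration")
    return {
        "states": states,
        "interval_minutes": int(config.findtext("IntervalSeconds")) / 60,
        "timeout_minutes": int(config.findtext("TimeoutSeconds")) / 60,
        "monitored_state": config.findtext("MonitoredState"),
    }


if __name__ == "__main__":
    print(summarize_monitor(MONITOR_XML))
```

Read this way, the definition checks the component every 15 minutes with a 10-minute timeout, and the Bad state of the two-state monitor type surfaces as a Warning health state even though the monitored component state is Critical.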