Memory Heap Usage

Summary

This monitor checks the memory usage of the ResourceManager service process. ResourceManager is vital to the YARN subsystem, so its parameters should stay within acceptable ranges at all times.

Default monitor thresholds:

Warning: when the percentage of memory used is between 85% and 95%.
Error: when the percentage of memory used is 95% or higher.
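
The default thresholds above can be illustrated with a minimal sketch. The function and constant names are hypothetical and are not part of the management pack; the sketch also assumes that exactly 85% already counts as Warning, which the article does not state explicitly.

```python
# Default thresholds taken from this article (percent of maximum heap used).
WARNING_THRESHOLD = 85.0
ERROR_THRESHOLD = 95.0


def heap_used_percent(used_bytes, max_bytes):
    """Return heap usage as a percentage of the maximum heap size."""
    if max_bytes <= 0:
        raise ValueError("max_bytes must be positive")
    return 100.0 * used_bytes / max_bytes


def classify_heap_usage(percent):
    """Map a usage percentage to one of three monitor states.

    Assumption: the lower bound of the Warning range (85%) is inclusive.
    """
    if percent >= ERROR_THRESHOLD:
        return "Error"
    if percent >= WARNING_THRESHOLD:
        return "Warning"
    return "Healthy"


if __name__ == "__main__":
    usage = heap_used_percent(900, 1000)
    print(usage, classify_heap_usage(usage))
```

The error check comes first so that any value of 95% or higher is never reported as a mere warning.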

HDInsight Appliance

The monitor is active and reports the actual component state.

HDInsight Azure

This monitor is not available in HDInsight clusters on Azure, so the diagnostic and resolution steps below do not apply to this type of environment.

Causes

Under normal conditions, ResourceManager does not run into memory problems. Its memory usage can increase if YARN/MapReduce parameters are changed to non-optimal values, or when it runs the maximum number of jobs allowed by the system capacity.

Resolutions

To diagnose the issue:

Remotely connect to the head node and check the Hadoop configuration. Review the configuration parameters in the files yarn-site.xml and mapred-site.xml, which are located at <OS disk>:\hadoop\hadoop-<HDP version>\etc\hadoop.

Connecting remotely to the head node is a two-step operation:

Use Remote Desktop Connection to log in to the secure node of the HDInsight cluster.
Use another Remote Desktop Connection from the secure node to connect to the head node virtual machine.
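
Once connected to the head node, reviewing the configuration files can be partly automated. The sketch below parses the standard Hadoop configuration layout (a configuration element containing property elements with name and value children) and lists the memory-related settings. The helper names and the sample values are hypothetical; on a real node you would read the text from yarn-site.xml or mapred-site.xml instead of the inline sample.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content; a real yarn-site.xml holds many more properties
# and the values here are for illustration only.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>16384</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>headnode0</value>
  </property>
</configuration>
"""


def read_hadoop_properties(xml_text):
    """Parse Hadoop configuration XML into a {name: value} dictionary."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def memory_settings(props):
    """Keep only properties whose names suggest a memory setting."""
    return {k: v for k, v in props.items() if "memory" in k or k.endswith("-mb")}


if __name__ == "__main__":
    for name, value in sorted(memory_settings(read_hadoop_properties(SAMPLE_YARN_SITE)).items()):
        print(f"{name} = {value}")
```

The name filter is a simple heuristic and may miss or over-include settings, so treat its output as a starting point for the review rather than a complete list.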

To resolve the issue:

If the high ResourceManager memory usage is caused by an inappropriate Hadoop configuration, use different configuration values or revert yarn-site.xml to factory settings. Restart ResourceManager after changing the configuration.
If you are not able to resolve the issue, contact the Microsoft Support team and provide the alert name and details. Be aware that diagnostic actions may require administrator permissions on the HDInsight cluster.

Target

Microsoft.HDInsight.HostComponent.ResourceManager

Parent Monitor

System.Health.PerformanceState

Category

PerformanceHealth

Enabled

True

Alert Generate

True

Alert Severity

MatchMonitorHealth

Alert Priority

Normal

Alert Auto Resolve

True

Monitor Type

Microsoft.HDInsight.UnitMonitorType.HostComponentThreeStateThreshold

Remotable

True

Accessibility

Public

Alert Message

ResourceManager is working under high memory pressure.

ResourceManager is using {1} % of its maximum heap memory in the cluster "{0}".
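
To show how the two placeholders in the alert message are filled, here is a minimal sketch. It assumes {0} is the cluster name and {1} is the heap usage percentage, as the wording suggests. Python's str.format happens to use the same positional syntax; the real substitution is performed by the monitoring platform, and the cluster name and value below are hypothetical.

```python
# Alert text from this article; {0} and {1} are positional placeholders.
ALERT_TEMPLATE = (
    'ResourceManager is using {1} % of its maximum heap memory '
    'in the cluster "{0}".'
)


def render_alert(cluster_name, heap_percent):
    """Fill the alert template with a cluster name and a usage percentage."""
    return ALERT_TEMPLATE.format(cluster_name, heap_percent)


if __name__ == "__main__":
    print(render_alert("cluster01", 91))
```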

RunAs

Default

Microsoft.HDInsight.UnitMonitor.ResourceManagerMemoryHeapUsed (UnitMonitor)
