Memory Utilization

Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization (UnitMonitor)

Memory utilization performance monitor for HPC 2008 R2 Workstation Node

Knowledge Base article:

Summary

This monitor tracks memory utilization on a compute node or a workstation node. The value is read from the Windows performance counter % Committed Bytes In Use of the Memory performance object.

Configuration

This monitor is disabled by default. You can enable it and configure the memory threshold (90 percent by default) and the sampling frequency (300 seconds by default).
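
As a rough illustration of the configuration described above, the following override fragment sketches how the monitor could be enabled and its threshold and sampling frequency changed from a separate, unsealed management pack. The override IDs, the `HPC` reference alias, and the values 85 and 600 are assumptions made for this example only; the alias must match whatever reference your override pack declares for the management pack that contains this monitor.

```xml
<!-- Sketch only: place inside the <Monitoring> section of an unsealed override management pack. -->
<!-- "HPC" is a hypothetical reference alias; the override IDs are hypothetical as well. -->
<Overrides>
  <!-- Enable the monitor, which ships with Enabled="false". -->
  <MonitorPropertyOverride ID="Example.Override.MemoryUtilization.Enabled" Context="HPC!Microsoft.HPC.2008R2.WorkstationNode" Enforced="false" Monitor="HPC!Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization" Property="Enabled">
    <Value>true</Value>
  </MonitorPropertyOverride>
  <!-- Lower the threshold from the default of 90 to an example value of 85. -->
  <MonitorConfigurationOverride ID="Example.Override.MemoryUtilization.Threshold" Context="HPC!Microsoft.HPC.2008R2.WorkstationNode" Enforced="false" Monitor="HPC!Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization" Parameter="Threshold">
    <Value>85</Value>
  </MonitorConfigurationOverride>
  <!-- Sample every 600 seconds instead of the default 300 (example value). -->
  <MonitorConfigurationOverride ID="Example.Override.MemoryUtilization.Frequency" Context="HPC!Microsoft.HPC.2008R2.WorkstationNode" Enforced="false" Monitor="HPC!Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization" Parameter="Frequency">
    <Value>600</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

The same overrides can normally be created from the Operations Manager console, which is usually simpler than writing the XML by hand. The fragment is mainly useful for seeing which parameter names (Threshold, Frequency) the overrides refer to.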

Causes

Sustained high memory usage is usually caused by a memory-intensive job occupying the node. It may cause the node to hang or significantly degrade the performance of the tasks running on that node.

Resolutions

To troubleshoot and fix this problem:

Element properties:

Target: Microsoft.HPC.2008R2.WorkstationNode
Parent Monitor: System.Health.PerformanceState
Category: PerformanceHealth
Enabled: False
Instance Name: Memory
Counter Name: \% Committed Bytes In Use
Frequency: 300
Alert Generate: True
Alert Severity: MatchMonitorHealth
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: System.Performance.ThresholdMonitorType
Remotable: True
Accessibility: Public
Alert Message:
Memory Utilization has exceeded the upper threshold
Please see the alert context for details.
RunAs: Microsoft.HPC.RunAsProfile.AdminActionAccount

Source Code:

<UnitMonitor ID="Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization"
             Accessibility="Public"
             Enabled="false"
             Target="Microsoft.HPC.2008R2.WorkstationNode"
             ParentMonitorID="Health!System.Health.PerformanceState"
             Remotable="true"
             Priority="Normal"
             RunAs="HPCLibrary!Microsoft.HPC.RunAsProfile.AdminActionAccount"
             TypeID="Performance!System.Performance.ThresholdMonitorType"
             ConfirmDelivery="false">
  <Category>PerformanceHealth</Category>
  <AlertSettings AlertMessage="Microsoft.HPC.2008R2.Monitor.WorkstationNode.Performance.MemoryUtilization_AlertMessageResourceID">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="UnderThreshold" MonitorTypeStateID="UnderThreshold" HealthState="Success"/>
    <OperationalState ID="OverThreshold" MonitorTypeStateID="OverThreshold" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <CounterName>% Committed Bytes In Use</CounterName>
    <ObjectName>Memory</ObjectName>
    <InstanceName/>
    <AllInstances>false</AllInstances>
    <Frequency>300</Frequency>
    <Threshold>90</Threshold>
  </Configuration>
</UnitMonitor>