Check node status request failed

Check_node_status_request_failed_1_Rule (Rule)

Knowledge Base article:

Management Pack
Summary

This alert is generated when the Microsoft Compute Cluster Management service on the head node fails while processing a request from a compute node.

 
Causes

This error can occur when this service fails while communicating with the Microsoft Compute Cluster SDM Store service, or when the change that is being performed by the management service conflicts with another change request.

Each compute node in the cluster checks with the head node every 5 minutes to determine whether jobs are available or other communication (such as configuration changes) is ready.

 
Resolutions
  1. Usually, this problem resolves itself within 30 minutes if the error was caused by a conflict with another change transaction. If status checks do not succeed after 30 minutes, the head node rolls back pending transactions and releases any resources they are holding.
  2. Check the additional exception information in the event error message, which can help identify the cause of the failure.
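
The 30-minute guidance above can be turned into a simple triage check. The following is a minimal sketch, not part of the management pack: it assumes you have already exported the timestamps of event 6106 occurrences from the Application log by some other means. The function names and the 10-minute gap used to group failures into one streak (twice the 5-minute check interval) are hypothetical choices made for illustration.

```python
from datetime import datetime, timedelta

# From the resolution text: the problem usually clears within 30 minutes.
RECOVERY_WINDOW = timedelta(minutes=30)


def failure_streak(failure_times, max_gap=timedelta(minutes=10)):
    """Return the duration of the most recent unbroken run of failures.

    failure_times: datetimes of event 6106 occurrences (hypothetical input).
    max_gap: assumed largest gap between two failures that still counts as
             the same run; this value is an illustration, not a product setting.
    """
    times = sorted(failure_times)
    if not times:
        return timedelta(0)
    start = times[-1]
    # Walk backwards from the latest failure until a gap larger than max_gap
    for earlier in reversed(times[:-1]):
        if start - earlier > max_gap:
            break
        start = earlier
    return times[-1] - start


def needs_investigation(failure_times):
    """True when the latest run of failures has outlasted the recovery window."""
    return failure_streak(failure_times) > RECOVERY_WINDOW
```

If `needs_investigation` returns True for a node, the exception details in the event message are the next place to look, as described in step 2.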

For information and resources to help resolve the underlying problem, see the Compute Cluster Server product documentation and online resources such as the Microsoft Windows HPC Community Web site (http://go.microsoft.com/fwlink/?LinkId=76040).

 
© 2006 Microsoft Corporation, all rights reserved.

Element properties:

Target: Microsoft.Windows.Server.ComputeCluster.2003.Microsoft_Windows_Compute_Cluster_Server_2003_Head_Nodes_Installation
Category: EventCollection
Enabled: True
Event_ID: 6106
Event Source: CCPManagement
Alert Generate: True
Alert Severity: Error
Alert Priority: Low
Remotable: True
Alert Message:
  Check node status request failed
  $Data/EventDescription$
Event Log: Application
Comment: Mom2005ID='{4AB51879-EB58-4264-81D7-941C3D40E6F7}';MOM2005ComputerGroupID={3326B24C-3FCA-4A20-A242-10F6836AF625}

Member Modules:

ID                                      Module Type  TypeId                                                 RunAs
_F6DA1507_12AF_11D3_AB21_00A0C98620CE_  DataSource   Microsoft.Windows.EventProvider                        Default
CollectEventData                        WriteAction  Microsoft.SystemCenter.CollectEvent                    Default
CollectEventDataWarehouse               WriteAction  Microsoft.SystemCenter.DataWarehouse.PublishEventData  Default
GenerateAlert                           WriteAction  System.Mom.BackwardCompatibility.AlertResponse         Default

Source Code:

<Rule ID="Check_node_status_request_failed_1_Rule" Target="Microsoft.Windows.Server.ComputeCluster.2003.Microsoft_Windows_Compute_Cluster_Server_2003_Head_Nodes_Installation" Enabled="true" ConfirmDelivery="true" Comment="Mom2005ID='{4AB51879-EB58-4264-81D7-941C3D40E6F7}';MOM2005ComputerGroupID={3326B24C-3FCA-4A20-A242-10F6836AF625}">
  <Category>EventCollection</Category>
  <DataSources>
    <DataSource ID="_F6DA1507_12AF_11D3_AB21_00A0C98620CE_" Comment="{F6DA1507-12AF-11D3-AB21-00A0C98620CE}" TypeID="WindowsLibrary!Microsoft.Windows.EventProvider">
      <ComputerName>$Target/Host/Property[Type="WindowsLibrary!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
      <LogName>Application</LogName>
      <Expression>
        <And>
          <Expression>
            <SimpleExpression>
              <ValueExpression>
                <XPathQuery Type="Integer">EventDisplayNumber</XPathQuery>
              </ValueExpression>
              <Operator>Equal</Operator>
              <ValueExpression>
                <Value>6106</Value>
              </ValueExpression>
            </SimpleExpression>
          </Expression>
          <Expression>
            <SimpleExpression>
              <ValueExpression>
                <XPathQuery Type="String">PublisherName</XPathQuery>
              </ValueExpression>
              <Operator>Equal</Operator>
              <ValueExpression>
                <Value>CCPManagement</Value>
              </ValueExpression>
            </SimpleExpression>
          </Expression>
        </And>
      </Expression>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="GenerateAlert" TypeID="MomBackwardCompatibility!System.Mom.BackwardCompatibility.AlertResponse">
      <AlertGeneration>
        <GenerateAlert>true</GenerateAlert>
        <Owner/>
        <Description>
          $Data/EventDescription$
        </Description>
        <AlertLevel>40</AlertLevel>
        <ResolutionState/>
        <Source>
          $Data/PublisherName$
        </Source>
        <Name>Check node status request failed</Name>
      </AlertGeneration>
      <InvokerType>0</InvokerType>
    </WriteAction>
    <WriteAction ID="CollectEventData" TypeID="SystemCenterLibrary!Microsoft.SystemCenter.CollectEvent"/>
    <WriteAction ID="CollectEventDataWarehouse" TypeID="DataWarehouseLibrary!Microsoft.SystemCenter.DataWarehouse.PublishEventData"/>
  </WriteActions>
</Rule>
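
To make the rule's event filter easier to read, the sketch below pulls the match criteria out of the `<Expression>` element and applies them to a sample event. It is an illustration only and does not reproduce how the management pack runtime evaluates expressions: the helper names `extract_criteria` and `matches` are hypothetical, events are modeled as plain dictionaries, and only the `Equal` operator combined with `And` (the only forms used in this rule) is handled.

```python
import xml.etree.ElementTree as ET

# The <Expression> element of the rule above, copied with whitespace condensed.
EXPRESSION_XML = """
<Expression>
  <And>
    <Expression>
      <SimpleExpression>
        <ValueExpression><XPathQuery Type="Integer">EventDisplayNumber</XPathQuery></ValueExpression>
        <Operator>Equal</Operator>
        <ValueExpression><Value>6106</Value></ValueExpression>
      </SimpleExpression>
    </Expression>
    <Expression>
      <SimpleExpression>
        <ValueExpression><XPathQuery Type="String">PublisherName</XPathQuery></ValueExpression>
        <Operator>Equal</Operator>
        <ValueExpression><Value>CCPManagement</Value></ValueExpression>
      </SimpleExpression>
    </Expression>
  </And>
</Expression>
"""


def extract_criteria(expression_xml):
    """Map each queried event field to its (operator, expected value) pair."""
    root = ET.fromstring(expression_xml)
    criteria = {}
    for simple in root.iter("SimpleExpression"):
        field = simple.findtext("ValueExpression/XPathQuery")
        operator = simple.findtext("Operator")
        value = simple.findtext("ValueExpression/Value")
        criteria[field] = (operator, value)
    return criteria


def matches(event, criteria):
    """Return True if the event dict satisfies every Equal criterion (And semantics)."""
    return all(
        operator == "Equal" and str(event.get(field)) == value
        for field, (operator, value) in criteria.items()
    )


criteria = extract_criteria(EXPRESSION_XML)
sample_event = {"EventDisplayNumber": 6106, "PublisherName": "CCPManagement"}
print(criteria)
print(matches(sample_event, criteria))
```

Read this way, the rule fires only for Application log events whose event number is 6106 and whose publisher is CCPManagement; an event that differs in either field is ignored.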