Dell Remote Access : Server : Processor is in critical state

Dell.iDRAC7.SNMPTrap.2241 (Rule)

Knowledge Base article:

Summary

Processor critical state alert

Causes

The processor has generated a critical alert. Probable causes and corresponding resolutions for this condition are:

Cause

Resolutions

CPU <number> has an internal error (IERR).

Review the System Event Log and the operating system logs. If the issue persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> has a thermal trip (over-temperature) event.

Review logs for fan failures and replace any failed fans. If no fan failures are detected, check the inlet temperature (if available) and reinstall the processor heatsink.

CPU <number> has failed the built-in self-test (BIST).

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> is stuck in POST.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Reduce system configuration to minimum memory and remove all PCI devices. If system completes POST, update system BIOS. Re-install memory and PCI devices one component at a time until the original configuration is restored.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> failed to initialize.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> configuration is unsupported.

Review product documentation for supported CPU configurations.

Unrecoverable CPU complex error detected on CPU <number>.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> initialization error detected.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> protocol error detected.

  • Check system and operating system logs for exceptions. If no exceptions are found, continue.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU bus parity error detected.

Check system and operating system logs for exceptions. If no exceptions are found, remove input power, reinstall the processor, then re-apply input power.

CPU bus initialization error detected.

  • Check system and operating system logs for exceptions. If no exceptions are found, continue.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> machine check error detected.

  • Check system and operating system logs for exceptions. If no exceptions are found, continue.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> voltage regulator module failed.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

The power input for CPU <number> voltage regulator module is lost.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

The power input for CPU <number> voltage regulator module is outside of range.

  • Turn system off and remove input power for one minute. Re-apply input power and turn system on.

  • Make sure the processor is seated correctly.

  • If the issue still persists, contact technical support. Refer to the product documentation to choose a convenient contact method.

CPU <number> voltage regulator module is incorrectly configured.

Review product documentation for proper configuration and installation procedures.

CPU <number> voltage regulator module is absent.

If removal was unintended, check presence and re-install.
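
The cause/resolution pairs above amount to a lookup keyed on the alert message text. The following is a minimal Python sketch of that idea, assuming the message is available as a plain string. The function name `suggest_resolution`, the regular expressions, and the choice of which table rows to include are illustrative assumptions and are not part of the management pack.

```python
import re

# Illustrative subset of the cause/resolution table above.
# The patterns are assumptions; "\S+" stands in for <number>.
RESOLUTIONS = [
    (re.compile(r"CPU \S+ has an internal error \(IERR\)"),
     "Review System Event Log and Operating System Logs."),
    (re.compile(r"CPU \S+ has a thermal trip"),
     "Review logs for fan failures and replace failed fans."),
    (re.compile(r"CPU \S+ configuration is unsupported"),
     "Review product documentation for supported CPU configurations."),
    (re.compile(r"CPU \S+ voltage regulator module is absent"),
     "If removal was unintended, check presence and re-install."),
]

# Fallback taken from the general Resolutions text of this article.
DEFAULT = "Launch the DRAC or OMSA Console to debug further."


def suggest_resolution(message: str) -> str:
    """Return the first matching resolution, or the general fallback."""
    for pattern, resolution in RESOLUTIONS:
        if pattern.search(message):
            return resolution
    return DEFAULT
```

A message that matches none of the listed causes falls through to the general guidance, which mirrors how the article itself ends with a catch-all resolution.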

Resolutions

Additional information on this issue may be available. Launch the DRAC or OMSA Console to debug further.

Element properties:

Target: Dell.RemoteAccess.iDRAC7
Category: AvailabilityHealth
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Remotable: True
Alert Message:
Dell Remote Access : Server : Processor is in critical state
{0}

Member Modules:

ID Module Type TypeId RunAs 
DS DataSource Dell.SNMPTrap.DSMT Default
Alert WriteAction System.Health.GenerateAlert Default

Source Code:

<Rule ID="Dell.iDRAC7.SNMPTrap.2241" Enabled="true" Target="DAD!Dell.RemoteAccess.iDRAC7" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
  <Category>AvailabilityHealth</Category>
  <DataSources>
    <DataSource ID="DS" TypeID="Dell.SNMPTrap.DSMT">
      <IP>$Target/Property[Type="DAD!Dell.RemoteAccess.RAC"]/IPAddress$</IP>
      <CommunityString>$Target/Property[Type="DAD!Dell.RemoteAccess.RAC"]/CommunityString$</CommunityString>
      <AllTraps>false</AllTraps>
      <OIDProps>
        <OIDProp>.1.3.6.1.4.1.674.10892.5.3.2.1.0.2241</OIDProp>
      </OIDProps>
      <EventOriginId>$Target/Id$</EventOriginId>
      <PublisherId>$Target/Id$</PublisherId>
      <PublisherName>iDRAC</PublisherName>
      <Channel>SnmpEvent</Channel>
      <LoggingComputer/>
      <EventNumber>2241</EventNumber>
      <EventCategory>5</EventCategory>
      <EventLevel>10</EventLevel>
      <UserName/>
      <Params/>
    </DataSource>
  </DataSources>
  <WriteActions>
    <WriteAction ID="Alert" TypeID="SystemHealth!System.Health.GenerateAlert">
      <Priority>1</Priority>
      <Severity>2</Severity>
      <AlertName/>
      <AlertDescription/>
      <AlertOwner/>
      <AlertMessageId>$MPElement[Name="Dell.iDRAC7.SNMPTrap.2241.Rule"]$</AlertMessageId>
      <AlertParameters>
        <AlertParameter1>$Data/EventData/DataItem/Property[@Name="drsAlertMessage"]$</AlertParameter1>
      </AlertParameters>
      <Suppression>
        <SuppressionValue>$Data/EventDisplayNumber$</SuppressionValue>
        <SuppressionValue>$Data/Channel$</SuppressionValue>
        <SuppressionValue>$Data/PublisherName$</SuppressionValue>
        <SuppressionValue>$Data/LoggingComputer$</SuppressionValue>
        <SuppressionValue>$Data/EventCategory$</SuppressionValue>
        <SuppressionValue>$Data/EventLevel$</SuppressionValue>
        <SuppressionValue>$Data/UserName$</SuppressionValue>
        <SuppressionValue>$Data/EventNumber$</SuppressionValue>
        <SuppressionValue>$Data/EventData/DataItem/Property[@Name="drsAlertMessageID"]$</SuppressionValue>
        <SuppressionValue>$Data/EventData/DataItem/Property[@Name="drsAlertFQDD"]$</SuppressionValue>
        <SuppressionValue>$Data/EventData/DataItem/Property[@Name="drsAlertCurrentStatus"]$</SuppressionValue>
        <SuppressionValue>$Data/EventData/DataItem/Property[@Name="drsSystemServiceTag"]$</SuppressionValue>
      </Suppression>
      <Custom1>Alert Message ID = $Data/EventData/DataItem/Property[@Name="drsAlertMessageID"]$ </Custom1>
      <Custom2>Alert Message = $Data/EventData/DataItem/Property[@Name="drsAlertMessage"]$ </Custom2>
      <Custom3>Alert Status = $Data/EventData/DataItem/Property[@Name="drsAlertCurrentStatus"]$ </Custom3>
      <Custom4>Alert Service Tag = $Data/EventData/DataItem/Property[@Name="drsSystemServiceTag"]$ </Custom4>
      <Custom5>Alert FQDN = $Data/EventData/DataItem/Property[@Name="drsAlertFQDN"]$ </Custom5>
      <Custom6>Alert FQDD = $Data/EventData/DataItem/Property[@Name="drsAlertFQDD"]$ </Custom6>
      <Custom7/>
      <Custom8/>
      <Custom9/>
      <Custom10/>
    </WriteAction>
  </WriteActions>
</Rule>
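
The Suppression element in the rule above lists the values that must all match for a new trap to be counted as a repeat of an existing open alert instead of raising a separate one. The following Python sketch illustrates that matching under stated assumptions: each trap is represented as a flat dictionary keyed by the field names used in the rule, and the function names `suppression_key` and `count_alerts` are hypothetical. Operations Manager performs the real matching internally; this only shows the idea.

```python
# Field names taken from the <SuppressionValue> entries of the rule.
# The flat-dictionary representation of a trap is an assumption.
SUPPRESSION_FIELDS = (
    "EventDisplayNumber", "Channel", "PublisherName", "LoggingComputer",
    "EventCategory", "EventLevel", "UserName", "EventNumber",
    "drsAlertMessageID", "drsAlertFQDD",
    "drsAlertCurrentStatus", "drsSystemServiceTag",
)


def suppression_key(event: dict) -> tuple:
    """Build the tuple of values two events must share to be merged."""
    return tuple(event.get(name, "") for name in SUPPRESSION_FIELDS)


def count_alerts(events) -> dict:
    """Map each distinct suppression key to the number of events that share it."""
    counts = {}
    for event in events:
        key = suppression_key(event)
        counts[key] = counts.get(key, 0) + 1
    return counts
```

Note that `drsAlertMessage` is not among the suppression values, so in this sketch two traps that differ only in message text but agree on message ID, FQDD, status, and service tag share one key.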