Dell OM : Physical Disk is in Critical state

Dell.ManagedServer.Alert.4345 (Rule)

Knowledge Base article:

Summary

Physical Disk critical state alert

Causes

Physical Disk has generated critical alert. Probable causes and corresponding resolutions for this condition are:

Cause

Resolutions

Patrol Read found an uncorrectable media error on <physical disk>.

Backup your data from the disk. Start disk initialization and wait for it to complete, then restore the data from the backup to the disk.

A block on <physical disk> was punctured by the controller.

Back up the data from the disk. Start disk initialization and wait for it to complete, and then restore the data from a backup copy.

Hot spare <physical disk> SMART polling has failed.<args>

Verify the health of the disk assigned as a hot spare. Replace the disk and reassign the hot spare. Make sure the cables are attached securely. See the storage hardware documentation for more information on checking the cables.

Bad block table on <physical disk> is full. Unable to log block <logical block address >.

Replace the disk generating this message and restore from a backup copy. You may have lost data.

The rebuild of <physical disk> failed due to errors on the source physical disk.

Replace the source disk and restore data from backup.

The rebuild failed due to errors on the target <physical disk>.

Replace the target disk. If a rebuild does not automatically start after replacing the disk, then initiate the Rebuild task. You may need to assign the new disk as a hot spare to initiate the rebuild.

A bad disk block on <physical disk> cannot be reassigned during a write operation.

Replace the disk.

An unrecoverable disk media error occurred on <physical disk>.

Replace the disk.

Copyback failed from <physical disk> to <physical disk>.

Replace the disk and retry the operation.

Security subsystem errors detected for <physical disk>.

Verify the disk is a Secure Encrypted Disk and is not locked. If it is not, replace the disk with a Secure Encrypted Disk. See the storage hardware documentation for more information.

Power state change failed on <PD Name>. (from <state> to <state>)

Replace the physical disk and try again. Contact technical support if the issue persists.

The RAID Controller may not be able to read/write data to the physical disk drive indicated in the message. This may be due to a failure with the physical disk drive or because the physical disk drive was removed from the system.

Remove and re-insert the physical disk drive identified in the message and make sure the physical disk drive is inserted properly. If the issue persists, replace the physical disk drive.

A Clear task was being performed on a physical disk but the task was interrupted and did not complete successfully. The controller may have lost communication with the disk, the disk was removed, or the cables may be loose or defective.

Verify that the disk is present and not in a Failed state. Make sure the cables are attached securely. See the storage hardware documentation for more information on checking the cables. Restart the Clear task.

The <PCIe solid state device name> is in read-only mode.

Back up the data on the device, and contact your service provider for further instructions.

The <PCIe solid state device name> has turned off because the critical temperature threshold is exceeded.

Contact your service provider for further instructions.

<physical disk> rebuild has failed.

Replace the failed or corrupt disk, and then start the rebuild operation.

Resolutions

Additional information on this issue may be available. Launch the iDRAC Console to debug further.

Element properties:

TargetDell.ManagedServer
CategoryAlert
EnabledTrue
Event_ID4345
Event SourceLifeCycle Controller Log
Alert GenerateTrue
Alert SeverityError
Alert PriorityNormal
RemotableTrue
Alert Message
Dell OM : Physical Disk is in Critical state
Event Description: {0}
Event LogSystem

Member Modules:

ID Module Type TypeId RunAs 
DS DataSource Microsoft.Windows.EventProvider Default
Alert WriteAction System.Health.GenerateAlert Default
WriteToDW WriteAction Microsoft.SystemCenter.DataWarehouse.PublishEventData Default

Source Code:

<Rule ID="Dell.ManagedServer.Alert.4345" Enabled="true" Target="DellManagedServer!Dell.ManagedServer" ConfirmDelivery="false" Remotable="true" Priority="Normal" DiscardLevel="100">
<Category>Alert</Category>
<DataSources>
<DataSource ID="DS" TypeID="Windows!Microsoft.Windows.EventProvider">
<ComputerName>$Target/Property[Type="DellManagedServer!Dell.ManagedServer"]/HostName$</ComputerName>
<LogName>System</LogName>
<Expression>
<And>
<Expression>
<SimpleExpression>
<ValueExpression>
<XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
</ValueExpression>
<Operator>Equal</Operator>
<ValueExpression>
<Value Type="UnsignedInteger">4345</Value>
</ValueExpression>
</SimpleExpression>
</Expression>
<Expression>
<SimpleExpression>
<ValueExpression>
<XPathQuery Type="String">PublisherName</XPathQuery>
</ValueExpression>
<Operator>Equal</Operator>
<ValueExpression>
<Value Type="String">LifeCycle Controller Log</Value>
</ValueExpression>
</SimpleExpression>
</Expression>
</And>
</Expression>
</DataSource>
</DataSources>
<WriteActions>
<WriteAction ID="Alert" TypeID="Health!System.Health.GenerateAlert">
<Priority>1</Priority>
<Severity>2</Severity>
<AlertMessageId>$MPElement[Name="Dell.ManagedServer.Alert.4345.Rule"]$</AlertMessageId>
<AlertParameters>
<AlertParameter1>$Data/EventDescription$</AlertParameter1>
</AlertParameters>
<Suppression>
<SuppressionValue>$Data/EventDisplayNumber$</SuppressionValue>
<SuppressionValue>$Data/Channel$</SuppressionValue>
<SuppressionValue>$Data/PublisherName$</SuppressionValue>
<SuppressionValue>$Data/LoggingComputer$</SuppressionValue>
<SuppressionValue>$Data/EventCategory$</SuppressionValue>
<SuppressionValue>$Data/EventLevel$</SuppressionValue>
<SuppressionValue>$Data/UserName$</SuppressionValue>
<SuppressionValue>$Data/EventNumber$</SuppressionValue>
<SuppressionValue>$Data/EventDescription$</SuppressionValue>
</Suppression>
<Custom1/>
<Custom2/>
<Custom3/>
<Custom4/>
<Custom5/>
<Custom6/>
<Custom7/>
<Custom8/>
<Custom9/>
<Custom10/>
</WriteAction>
<WriteAction ID="WriteToDW" TypeID="SCDW!Microsoft.SystemCenter.DataWarehouse.PublishEventData"/>
</WriteActions>
</Rule>