Monitor REC_CACHE_BACKUP_DEVICE_FAILED (251)

NetAppESeries.FailureID_0251_Monitor (UnitMonitor)

A cache backup device has failed.

Knowledge Base article:

Cache Backup Device Failed

What Caused the Problem?

A cache backup device has failed and you will need to replace it. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

 Caution: Electrostatic charges can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

 

Recovery Steps

Important: Replacing a cache backup device is an advanced recovery procedure. It requires you to remove and open the affected controller canister and replace the failed cache backup device. It is recommended that the procedure be performed by an on-site technician or under the guidance of a Technical Support Representative.

If your storage array has one controller, go to Procedure for Storage Arrays with One Controller.

If your storage array has two controllers, go to Procedure for Storage Arrays with Two Controllers.

Procedure for Storage Arrays with One Controller

1. Stop all I/O from all hosts to this storage array. When the Cache Active LED on the controller is no longer blinking (may take several minutes), proceed to step 2.

 Caution: Risk of Data Loss. You must wait for the Cache Active LED to stop blinking to ensure that all cache has been written to disk.

2. Determine the appropriate capacity for the replacement cache backup device by viewing the Hardware > Trays tab in the Storage Array Profile or by clicking the View Tray Components link on the Hardware tab in the Array Management Window (AMW). Ensure you have a replacement device that is the same in capacity as the failed device.

3. Click the Save As button in the Recovery Guru dialog to save the remaining steps to a file on your local workstation. The remaining recovery steps will no longer be accessible from the Recovery Guru dialog after you complete step 4.

4. Remove the controller canister that contains the affected cache backup device (identified in the Recovery Guru Details area).

5. Remove the failed cache backup device (identified in the Recovery Guru Details area) from the controller canister.

6. Insert an appropriate replacement cache backup device that is the same capacity as the one that failed.

7. Insert the controller canister securely into place. After the controller appears on the Hardware tab in the AMW, go to step 8.

Note: Write caching will be reinstated (if applicable for each volume) once the controller's battery is fully charged and has completed any required learn cycles (if applicable).

8. Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Recovery Guru Summary area. If the failure appears again, contact your Technical Support Representative.

Procedure for Storage Arrays with Two Controllers

1. If there are any hosts connected to this storage array that are NOT running a host-based, multi-path failover driver, stop I/O to the storage array from each of these hosts.

2. Place the affected controller offline.

   a. Select the controller on the Hardware tab of the Array Management Window.

   b. Select the Hardware > Controller > Advanced > Place > Offline menu option.

   c. Follow the instructions in the dialog, then click the Yes button.

3. Determine the appropriate capacity for the replacement cache backup device by viewing the Hardware > Trays tab in the Storage Array Profile or by clicking the View Tray Components link on the Hardware tab in the Array Management Window (AMW). Ensure you have a replacement device that is the same in capacity as the failed device.

4. Click the Save As button in the Recovery Guru dialog to save the remaining steps to a file on your local workstation. The remaining recovery steps will no longer be accessible from the Recovery Guru dialog after you complete step 5.

5. Click the Recheck button to rerun the Recovery Guru. There should be an "Offline Controller" problem reported in the Recovery Guru Summary area.

6. Follow the "Offline Controller" recovery steps until you have removed the controller. After you have removed the controller, do not continue with the "Offline Controller" recovery steps until you are instructed to do so in this procedure.

7. Remove the failed cache backup device (identified in the Details area) from the controller canister.

8. Insert an appropriate replacement cache backup device that is the same capacity as the one that failed.

9. Complete the remaining "Offline Controller" recovery steps.

Note: Write caching will be reinstated (if applicable for each volume) once the controller's battery is fully charged and has completed any required learn cycles (if applicable).

Element properties:

Target: NetAppESeries.StorageArray
Parent Monitor: NetAppESeries.StorageArrayAvailability
Category: Custom
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: NetAppESeries.FailureUnitMonitorType
Remotable: True
Accessibility: Internal
Alert Message: Alert: REC_CACHE_BACKUP_DEVICE_FAILED
A cache backup device has failed. Alert Value: {0}
RunAs: Default
Comment: Machine generated entity
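
The Alert Message property pairs an alert title with a description containing a {0} placeholder. In a management pack this is usually expressed as a string resource plus a display string, and the placeholder is filled from the monitor's first alert parameter (the FailureDescription property in the source code). The fragment below is a sketch of how those pieces could look, assuming the standard management pack schema; the actual resource definitions are not shown on this page, so the language pack ID and the Name/Description split are assumptions.

```xml
<!-- Fragment only: these elements live inside a ManagementPack document. -->
<Presentation>
  <StringResources>
    <!-- ID matches the AlertMessage attribute referenced by the monitor -->
    <StringResource ID="NetAppESeries.REC_CACHE_BACKUP_DEVICE_FAILED_AlertMessageResourceID"/>
  </StringResources>
</Presentation>
<LanguagePacks>
  <!-- Assumption: ENU is the default language pack -->
  <LanguagePack ID="ENU" IsDefault="true">
    <DisplayStrings>
      <DisplayString ElementID="NetAppESeries.REC_CACHE_BACKUP_DEVICE_FAILED_AlertMessageResourceID">
        <Name>Alert: REC_CACHE_BACKUP_DEVICE_FAILED</Name>
        <!-- {0} is replaced by AlertParameter1 when the alert is raised -->
        <Description>A cache backup device has failed. Alert Value: {0}</Description>
      </DisplayString>
    </DisplayStrings>
  </LanguagePack>
</LanguagePacks>
```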

Source Code:

<UnitMonitor ID="NetAppESeries.FailureID_0251_Monitor" Accessibility="Internal" Enabled="true" Target="NetAppESeries.StorageArray" ParentMonitorID="NetAppESeries.StorageArrayAvailability" Remotable="true" Priority="Normal" TypeID="NetAppESeries.FailureUnitMonitorType" ConfirmDelivery="true" Comment="Machine generated entity">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="NetAppESeries.REC_CACHE_BACKUP_DEVICE_FAILED_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='FailureDescription']$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="NetAppESeries.StateIdC541BB420958181F52D802659E424720" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="NetAppESeries.StateIdE833114FCF6BC9FE2350B2B3EA74B3D8" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>251</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
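
Because the monitor is enabled by default and generates an Error alert, an administrator may want to tune it from a separate, unsealed management pack. The fragment below is a minimal sketch of a property override that disables the monitor for all storage arrays, assuming the standard management pack schema. The override ID and the `ESeries` reference alias are hypothetical; the alias would have to be declared in the overriding pack's manifest and point at the management pack that defines this monitor.

```xml
<!-- Fragment only: belongs inside the Monitoring section of an override pack. -->
<Overrides>
  <!-- Hypothetical override ID; "ESeries" is an assumed reference alias -->
  <MonitorPropertyOverride ID="Custom.Override.FailureID_0251_Monitor.Enabled"
                           Context="ESeries!NetAppESeries.StorageArray"
                           Enforced="false"
                           Monitor="ESeries!NetAppESeries.FailureID_0251_Monitor"
                           Property="Enabled">
    <!-- Set to true to re-enable the monitor -->
    <Value>false</Value>
  </MonitorPropertyOverride>
</Overrides>
```

The same element with Property="AlertSeverity" or Property="GenerateAlert" could be used to adjust alerting without disabling the health state calculation.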