Monitor REC_DEGRADED_VOLUME (13)

NetAppESeries.FailureID_0013_Monitor (UnitMonitor)

A volume group has transitioned to the degraded state due to one or more drive failures.

Knowledge Base article:

Degraded Volume

What Caused the Problem?

One or more drives have failed in a disk pool or volume group and the associated volumes have become degraded. The data on the volumes is still accessible; however, data may be lost if another drive in the same disk pool or volume group fails. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Possible loss of data accessibility. Do not remove a component when either (1) the Service Action (removal) Allowed (SAA) field in the Details area of this recovery procedure is NO, or (2) the SAA LED on the affected component is OFF (note that some products do not have SAA LEDs). Removing a component while its SAA LED is OFF may result in temporary loss of access to your data. Refer to the following Important Notes for more detail.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

Recovery Steps

1. Check the Recovery Guru Details area to identify the failed drive(s).

2. Remove all failed drives associated with this disk pool or volume group (the fault indicator lights on the failed drives should be on).

   Note: To determine the failed drives, select one of the degraded volumes (identified in the Details area) on the Storage and Copy Services tab in the AMW. Each failed drive will have an association dot underneath it.

3. Wait 30 seconds, then insert the new drives. The fault indicator light on the new drives may become lit for a short time (one minute or less).

   Data reconstruction should begin on the new drive(s). Their fault indicator lights will go off and the activity indicator lights of the drives in the disk pool or volume group will start flashing. When the reconstruction starts, the disk pool or volume group's volume icons on the Storage and Copy Services tab in the AMW change to Operation in Progress, then to Optimal, as the volumes are reconstructed.

   Notes:

     • If you are replacing a drive in a storage array that contains hot spares, drive reconstruction will start on the hot spare before you insert the new drive. The data on the replacement drive may not be reconstructed until after it has completed the process on the hot spare.

     • If reconstruction does not start within a few minutes, select the new drive; then, select the Hardware > Drive > Advanced > Manually Reconstruct menu option to start reconstruction on the drive.

     • To monitor reconstruction progress on the affected volumes, on the Storage and Copy Services tab in the AMW, select the reconstructing volume and view the progress in the Properties pane. Note that once the operation in progress has completed, the progress bar is no longer displayed in the Properties pane.

     • Replace only one drive at a time for each disk pool or volume group. Each drive should complete reconstruction before the next drive begins reconstruction.

     • Wait until the reconstruction is completed for all volumes before continuing.

4. Click the Recheck button to rerun the Recovery Guru. When ALL failed drives are replaced, then this failure should no longer appear in the Summary area. If the failure appears again after all failed drives have been replaced, contact your Technical Support Representative.

Element properties:

Target: NetAppESeries.StorageArray
Parent Monitor: NetAppESeries.StorageArrayAvailability
Category: Custom
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: NetAppESeries.FailureUnitMonitorType
Remotable: True
Accessibility: Internal
Alert Message:
  Alert: REC_DEGRADED_VOLUME
  A volume group has transitioned to the degraded state due to one or more drive failures. Alert Value: {0}
RunAs: Default
Comment: Machine generated entity

Source Code:

<UnitMonitor ID="NetAppESeries.FailureID_0013_Monitor" Accessibility="Internal" Enabled="true" Target="NetAppESeries.StorageArray" ParentMonitorID="NetAppESeries.StorageArrayAvailability" Remotable="true" Priority="Normal" TypeID="NetAppESeries.FailureUnitMonitorType" ConfirmDelivery="true" Comment="Machine generated entity">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="NetAppESeries.REC_DEGRADED_VOLUME_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='FailureDescription']$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="NetAppESeries.StateId7702F2D7AF204700E3FD540244153ABA" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="NetAppESeries.StateId4E71F12D1F1E33A761E45FD10ABFE3F1" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>13</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
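
The monitor definition above can be read as three pieces: a polling configuration, a mapping from monitor-type states to health states, and an alert whose {0} placeholder is filled from AlertParameter1 (the FailureDescription property). The following is a minimal Python sketch of that reading, not part of the management pack. The helper names summarize_monitor and render_alert are hypothetical, the embedded XML is a trimmed copy of the definition with some attributes omitted, and Python's str.format is used only to mimic positional placeholder substitution; the actual substitution is performed by Operations Manager.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the monitor definition shown above (some attributes omitted).
MONITOR_XML = """
<UnitMonitor ID="NetAppESeries.FailureID_0013_Monitor" Target="NetAppESeries.StorageArray" Enabled="true">
  <AlertSettings AlertMessage="NetAppESeries.REC_DEGRADED_VOLUME_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='FailureDescription']$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="NetAppESeries.StateId7702F2D7AF204700E3FD540244153ABA" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="NetAppESeries.StateId4E71F12D1F1E33A761E45FD10ABFE3F1" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>13</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
"""

# Alert message text as listed under "Element properties".
ALERT_TEMPLATE = (
    "Alert: REC_DEGRADED_VOLUME\n"
    "A volume group has transitioned to the degraded state "
    "due to one or more drive failures. Alert Value: {0}"
)


def summarize_monitor(xml_text):
    """Hypothetical helper: extract configuration and state mapping."""
    root = ET.fromstring(xml_text)
    # All four configuration values in this monitor are integers.
    config = {child.tag: int(child.text) for child in root.find("Configuration")}
    # Map each monitor-type state to the health state it produces.
    states = {
        s.get("MonitorTypeStateID"): s.get("HealthState")
        for s in root.iter("OperationalState")
    }
    # An alert is raised for states whose health matches AlertOnState.
    alert_on = root.findtext("AlertSettings/AlertOnState")
    alerting = [name for name, health in states.items() if health == alert_on]
    return {
        "id": root.get("ID"),
        "config": config,
        "states": states,
        "alerting_states": alerting,
    }


def render_alert(template, failure_description):
    """Hypothetical helper: fill the {0} placeholder with AlertParameter1."""
    return template.format(failure_description)


summary = summarize_monitor(MONITOR_XML)
```

Read this way, the monitor polls for failure ID 13 every 59 seconds with a 300-second timeout, and only the IssueFound state raises an alert, because it is the one state mapped to the Error health state named in AlertOnState.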