Monitor REC_LOST_REDUNDANCY_TRAY (33)

IBMStorageSubsystem.FailureID_0033_Monitor (UnitMonitor)

Monitor Description for (33)

Knowledge Base article:

Drive Enclosure - Loss of Path Redundancy

What Caused the Problem?

An enclosure with redundant drive loops (channels) has lost communication through one of its loops. The enclosure has only one loop available for I/O. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Loss of path redundancy can result from any of the following:

Caution: Electrostatic discharge can damage sensitive components. Use a grounding wrist strap or other anti-static precautions before removing or handling components.

Important Notes

Recovery Steps

1

Fix any other problems reported by the Recovery Guru before attempting to fix this problem.

2

Look on the back of the drive enclosures and see if any of the amber by-pass LEDs are flashing on the In or Out port of the ESMs (note that this behavior is not available on all models of drive enclosures).

If: An amber by-pass LED is flashing
Then: The supported data rate of the SFP associated with the port is not compatible with the data rate switch setting on the drive enclosure (for example, a 2 Gb/s SFP is installed in an enclosure that is set to 4 Gb/s). Remove the SFP and replace it with one that is compatible with the data rate switch setting on the drive enclosure. Go to step 9.

If: An amber by-pass LED is NOT flashing
Then: Either the drive enclosure model doesn't support this LED flashing indication or all SFPs are compatible with the data rate setting on the drive enclosure. Go to step 3.

3

If: The controllers for this storage subsystem are located in an enclosure containing both controllers and drives
Then:

    If: One of the controller canisters is removed
    Then: Reinsert the controller. Go to step 9.

    If: Both controller canisters are present
    Then: To locate the non-working channel, start with the controller canister that is associated with the working channel. Looking at the controller canisters from the back of the enclosure, Controller A is the left controller canister and is associated with channel 1. Controller B is the right controller canister and is associated with channel 2. Go to step 4.

If: The controllers for this storage subsystem are located in an enclosure containing only controllers
Then: To locate the non-working channel, start with the drive port in the controller enclosure that corresponds to the working channel (refer to the labels on the back of the controller enclosure if needed). Go to step 4.

4

Trace the cable from the working channel to the ESM canister in the affected drive enclosure reported in the Recovery Guru Details area.

Caution: Do not disconnect any cables on the working channel. Doing so will cause data loss!

5

Locate the other ESM canister in the affected drive enclosure and trace the cables back to the port on a controller enclosure or the controller canister for the combination controller/drive enclosure. This is the non-working channel. When tracing the cables on the non-working channel, perform the following:

a

Check for loose or damaged cables. An amber loop bypass LED (In Bypass or Out Bypass) on the ESM will be lit if there is a connection problem between two enclosures.

b

Check for a loop data rate mismatch.

If: The controllers for this storage subsystem are located in an enclosure containing both controllers and drives
Then: Look at the ESM canisters on the non-working channel. If any of the ESM canisters have a switch to set the loop data rate, use the Storage Subsystem >> View >> Profile option and select the Enclosure tab to verify they are all set to the maximum data rate (for example, 1 Gb/s or 2 Gb/s).

If: The controllers for this storage subsystem are located in an enclosure containing only controllers
Then: If the ESM canisters or the drive channel port on the non-working channel have a switch to set the loop data rate, verify that they are all set to the same data rate (for example, 1 Gb/s or 2 Gb/s). Note that if the drive channel port or one of the ESM canisters on the loop does not have a loop data rate switch, all of the other data rate switches on the drive channel loop must be set to 1 Gb/s.

If: There is a connection problem or a loop data rate mismatch
Then: Correct it and go to step 6.

If: There is not a connection problem or loop data rate mismatch
Then: Go to step 6.

6

Click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.

If: The problem does not appear in the Summary area
Then: You are finished with this procedure.

If: The problem continues to appear in the Summary area
Then: Go to step 7.

7

Check the green power light on each ESM canister along the non-working channel. If it is off, then reseating the ESM canister on the non-working channel may clear the failure being reported.

To reseat the canister, remove it from the drive enclosure and wait 10 seconds. Reinsert the canister firmly, wait another 40 seconds, then go to step 8.

8

Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.
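
The loop data rate rule from step 5b (the case where the controllers are in an enclosure containing only controllers) can be written down as a small check. This is a minimal sketch for illustration only and is not part of the management pack or the Recovery Guru. The function name `find_loop_rate_problem` and the convention of using `None` for a device that has no data rate switch are assumptions.

```python
from typing import Optional, Sequence


def find_loop_rate_problem(switch_settings: Sequence[Optional[int]]) -> Optional[str]:
    """Return a description of a loop data rate mismatch, or None if there is none.

    switch_settings holds one entry per ESM canister or drive channel port on
    the loop: the switch setting in Gb/s, or None if the device has no switch
    (hypothetical convention chosen for this sketch).
    """
    # Collect the distinct settings of the devices that do have a switch.
    rates = {rate for rate in switch_settings if rate is not None}

    # Rule 1: every switch on the loop must be set to the same data rate.
    if len(rates) > 1:
        return "switches on the loop are set to different data rates"

    # Rule 2: if any device lacks a switch, all other switches must be at 1 Gb/s.
    has_switchless_device = any(rate is None for rate in switch_settings)
    if has_switchless_device and rates and rates != {1}:
        return "a device without a switch requires every switch to be set to 1 Gb/s"

    return None


if __name__ == "__main__":
    # Example loop: two ESM canisters set to 2 Gb/s and one port without a switch.
    print(find_loop_rate_problem([2, None, 2]))
```

The settings themselves still have to be read from the hardware or from the storage subsystem profile; the sketch only encodes the comparison described in the step.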

Element properties:

Target: IBMStorageSubsystem.StorageSubsystem
Parent Monitor: IBMStorageSubsystem.StorageSubsystemAvailability
Category: Custom
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: IBMStorageSubsystem.FailureUnitMonitorType
Remotable: True
Accessibility: Internal
Alert Message:
    Alert: REC_LOST_REDUNDANCY_TRAY
    Alert Value: {0}
RunAs: Default
Comment: Machine generated entity

Source Code:

<UnitMonitor ID="IBMStorageSubsystem.FailureID_0033_Monitor" Accessibility="Internal" Enabled="true" Target="IBMStorageSubsystem.StorageSubsystem" ParentMonitorID="IBMStorageSubsystem.StorageSubsystemAvailability" Remotable="true" Priority="Normal" TypeID="IBMStorageSubsystem.FailureUnitMonitorType" ConfirmDelivery="true" Comment="Machine generated entity">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="IBMStorageSubsystem.REC_LOST_REDUNDANCY_TRAY_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='FailureDescription']$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="IBMStorageSubsystem.StateId585B1BA0E3FD7BF79C2C144302B8A660" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="IBMStorageSubsystem.StateId28DFBD5653DC21B2740F63966D327FFB" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>33</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
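
To check the monitor's settings without reading the XML by eye, the definition can be parsed with a standard XML library. The following is a minimal sketch using Python's `xml.etree.ElementTree`. The `MONITOR_XML` string is a trimmed copy of the listing above, and the function name `summarize_monitor` and the shape of the returned dictionary are assumptions made for illustration. It also assumes the fragment is parsed on its own, without the XML namespace declarations a full management pack file may carry.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the monitor definition shown above (alert settings omitted).
MONITOR_XML = """
<UnitMonitor ID="IBMStorageSubsystem.FailureID_0033_Monitor"
             Target="IBMStorageSubsystem.StorageSubsystem"
             TypeID="IBMStorageSubsystem.FailureUnitMonitorType">
  <OperationalStates>
    <OperationalState ID="IBMStorageSubsystem.StateId585B1BA0E3FD7BF79C2C144302B8A660"
                      MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="IBMStorageSubsystem.StateId28DFBD5653DC21B2740F63966D327FFB"
                      MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>33</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
"""


def summarize_monitor(xml_text: str) -> dict:
    """Extract the failure ID, polling settings, and state mapping from a UnitMonitor fragment."""
    root = ET.fromstring(xml_text)
    config = root.find("Configuration")
    return {
        "id": root.get("ID"),
        "failure_id": int(config.findtext("FailureID")),
        "interval_seconds": int(config.findtext("IntervalSeconds")),
        "timeout_seconds": int(config.findtext("TimeoutSeconds")),
        # Map each monitor type state to the health state it produces.
        "states": {
            state.get("MonitorTypeStateID"): state.get("HealthState")
            for state in root.iter("OperationalState")
        },
    }


if __name__ == "__main__":
    for key, value in summarize_monitor(MONITOR_XML).items():
        print(f"{key}: {value}")
```

The state mapping makes the monitor's behavior explicit: the `IssueFound` state maps to the `Error` health state, which is the state the alert settings in the listing fire on.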