Monitor Description for (33)
What Caused the Problem?
An enclosure with redundant drive loops (channels) has lost communication through one of its loops. The enclosure has only one loop available for I/O. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.
Loss of path redundancy can result from any of the following:
An SFP inserted into one of the ports on an ESM canister has a data rate that is not compatible with the data rate switch setting on the drive enclosure.
Faulty ESM canister (a separate problem should be reported)
Faulty SFP (a separate problem should be reported)
Removed controller canister on an enclosure containing both controllers and drives
Disconnected or faulty drive cable
Improperly seated ESM canister
An ESM canister that supports selectable loop data rates is set to a rate that is not compatible with other devices on the loop.
Caution: Electrostatic discharge can damage sensitive components. Use a grounding wrist strap or other anti-static precautions before removing or handling components.
Important Notes
Correct this failure as soon as possible. Although the storage subsystem is still operational, a level of path redundancy has been lost. If the remaining drive loop fails, all I/O to that enclosure will fail.
The Recovery Guru will report separate problems for any enclosures it cannot reach through the loop (channel) of the affected enclosure.
The Recovery Guru Details area reports the affected enclosure and the working channel over which it can communicate with the enclosure.
The amber bypass LED on the In or Out port of the ESM will glow if the associated port is not active in the drive loop.
The amber bypass LED on the In or Out port of the ESM will flash if the data rate of the SFP in the associated port is not compatible with the data rate switch setting on the drive enclosure (for example, a 2 Gb/s SFP is installed in an enclosure that is set to 4 Gb/s).
The green power light on the ESM canister will be off if the canister is not seated properly.
The Event Log may also report an Extended fibre channel link down (greater than one minute) event (event 1019) in reference to this problem. Be sure to note the presence of this event when speaking with a technical support representative.
Recovery Steps
1 | Fix any other problems reported by the Recovery Guru before attempting to fix this problem.
2 | Look on the back of the drive enclosures and see whether any of the amber bypass LEDs are flashing on the In or Out port of the ESMs (note that this behavior is not available on all models of drive enclosures).
3 |
4 | Trace the cable from the working channel to the ESM canister in the affected drive enclosure reported in the Recovery Guru Details area. Caution: Do not disconnect any cables on the working channel. Doing so will cause data loss!
5 | Locate the other ESM canister in the affected drive enclosure and trace the cables back to the port on a controller enclosure or the controller canister for the combination controller/drive enclosure. This is the non-working channel. When tracing the cables on the non-working channel, perform the following:
6 | Click the Recheck button to rerun the Recovery Guru to ensure that the problem has been fixed.
7 | Check the green power light on each ESM canister along the non-working channel. If it is off, reseating the ESM canister on the non-working channel may clear the reported failure. To reseat the canister, remove it from the drive enclosure and wait 10 seconds. Reinsert the canister firmly, wait another 40 seconds, then go to step 8.
8 | Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your technical support representative.
Target | IBMStorageSubsystem.StorageSubsystem
Parent Monitor | IBMStorageSubsystem.StorageSubsystemAvailability
Category | Custom
Enabled | True
Alert Generate | True
Alert Severity | Error
Alert Priority | Normal
Alert Auto Resolve | True
Monitor Type | IBMStorageSubsystem.FailureUnitMonitorType
Remotable | True
Accessibility | Internal
Alert Message |
RunAs | Default
Comment | Machine generated entity
<UnitMonitor ID="IBMStorageSubsystem.FailureID_0033_Monitor" Accessibility="Internal" Enabled="true" Target="IBMStorageSubsystem.StorageSubsystem" ParentMonitorID="IBMStorageSubsystem.StorageSubsystemAvailability" Remotable="true" Priority="Normal" TypeID="IBMStorageSubsystem.FailureUnitMonitorType" ConfirmDelivery="true" Comment="Machine generated entity">
  <Category>Custom</Category>
  <AlertSettings AlertMessage="IBMStorageSubsystem.REC_LOST_REDUNDANCY_TRAY_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Data/Context/Property[@Name='FailureDescription']$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="IBMStorageSubsystem.StateId585B1BA0E3FD7BF79C2C144302B8A660" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="IBMStorageSubsystem.StateId28DFBD5653DC21B2740F63966D327FFB" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>33</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
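The monitor definition above can be read programmatically, for example to audit intervals or state mappings across many machine-generated monitors. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name summarize_monitor and the shape of the returned dictionary are hypothetical, chosen for illustration. The embedded XML is a trimmed copy of the definition on this page, not a complete management pack.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the UnitMonitor definition shown on this page.
# Only the attributes and elements read below are kept.
MONITOR_XML = """
<UnitMonitor ID="IBMStorageSubsystem.FailureID_0033_Monitor" Enabled="true" Target="IBMStorageSubsystem.StorageSubsystem">
  <AlertSettings AlertMessage="IBMStorageSubsystem.REC_LOST_REDUNDANCY_TRAY_AlertMessageResourceID">
    <AutoResolve>true</AutoResolve>
    <AlertSeverity>Error</AlertSeverity>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="IBMStorageSubsystem.StateId585B1BA0E3FD7BF79C2C144302B8A660" MonitorTypeStateID="NoIssue" HealthState="Success"/>
    <OperationalState ID="IBMStorageSubsystem.StateId28DFBD5653DC21B2740F63966D327FFB" MonitorTypeStateID="IssueFound" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <FailureID>33</FailureID>
    <IntervalSeconds>59</IntervalSeconds>
    <TimeoutSeconds>300</TimeoutSeconds>
    <Trace>0</Trace>
  </Configuration>
</UnitMonitor>
"""


def summarize_monitor(xml_text):
    """Return key settings of a UnitMonitor element as a plain dict.

    Hypothetical helper: assumes the fragment has no XML namespace and
    contains the Configuration children shown on this page.
    """
    root = ET.fromstring(xml_text.strip())
    config = root.find("Configuration")
    # Map each monitor-type state (NoIssue, IssueFound) to its health state.
    states = {
        state.get("MonitorTypeStateID"): state.get("HealthState")
        for state in root.iter("OperationalState")
    }
    return {
        "id": root.get("ID"),
        "target": root.get("Target"),
        "failure_id": int(config.findtext("FailureID")),
        "interval_seconds": int(config.findtext("IntervalSeconds")),
        "timeout_seconds": int(config.findtext("TimeoutSeconds")),
        "alert_severity": root.findtext("AlertSettings/AlertSeverity"),
        "states": states,
    }


summary = summarize_monitor(MONITOR_XML)
print(summary)
```

In the resulting dictionary, the IssueFound state maps to the Error health state, which matches the Alert Severity row in the table above. The interval and timeout values are returned as integers, in seconds.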