Monitor REC_FAILED_HOST_IO_CARD (150)

A host I/O card has failed and the alternate controller is locked down. This enumeration value is numerically equivalent to REC_HOST_BOARD_FAULT. It was added so that the failure recovery terminology lines up with the major event log terminology. The two enumeration values may be used interchangeably.
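
The aliasing described above can be sketched in Python, where two enum names that share a value resolve to the same member. This is only an illustration: the class name `RecFailureType` and the helper `is_failed_host_io_card` are hypothetical, and the value 150 is taken from this monitor's title rather than from the actual enumeration definition.

```python
from enum import IntEnum


class RecFailureType(IntEnum):
    """Hypothetical stand-in for the failure-type enumeration."""

    REC_HOST_BOARD_FAULT = 150
    # Same numeric value, so Python treats this name as an alias
    # of REC_HOST_BOARD_FAULT rather than as a distinct member.
    REC_FAILED_HOST_IO_CARD = 150


def is_failed_host_io_card(code: int) -> bool:
    """Return True if the failure code matches either of the two names."""
    return code == RecFailureType.REC_FAILED_HOST_IO_CARD
```

Because both names map to one value, code that tests for either name accepts the same failure code.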

Failed I/O Host Card

What Caused the Problem?

An I/O host card in one of the controllers is not functioning properly. The Recovery Guru Details area provides specific information you will need as you follow the recovery steps.

Caution: Electrostatic discharge can damage sensitive components. Always use proper antistatic protection when handling components. Touching components without using a proper ground may damage the equipment.

Important Notes

You will need to replace the controller that has the failed host card. The failed controller is listed in the Component requiring service field in the Details area.
To ensure a complete configuration restore (both disk pool and traditional volume group), back up the storage array configuration data before you start this procedure. This is especially important for simplex storage arrays and for controllers that operate without batteries. To save your configuration, open either the Command Line Interface (CLI) or the Script Editor from the Enterprise Management Window (EMW), and run the following command:
save storageArray dbmDatabase sourceLocation=onboard controller[a] contentType=all file="hostfile.zip";
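
If the backup is scripted, the command text can be assembled rather than typed by hand. The following is a minimal sketch that only builds the string shown above. The function name `build_save_command` is hypothetical, and restricting the controller to `a` or `b` is an assumption; how the command is then submitted (CLI or Script Editor) is left out.

```python
def build_save_command(controller: str = "a", file_name: str = "hostfile.zip") -> str:
    """Build the configuration backup command as a single script line."""
    # Assumption: controllers are identified by the letters a and b.
    if controller not in ("a", "b"):
        raise ValueError("controller must be 'a' or 'b'")
    # A double quote in the file name would break the quoted file="..." value.
    if '"' in file_name:
        raise ValueError("file name must not contain double quotes")
    return (
        "save storageArray dbmDatabase sourceLocation=onboard "
        f'controller[{controller}] contentType=all file="{file_name}";'
    )
```

With the default arguments the function returns exactly the command shown above.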

Recovery Steps

If...	Then...
Your storage array has one controller	Go to Procedure for Storage Arrays with One Controller.
Your storage array has two controllers	If there are any hosts connected to this storage array that are NOT running a host-based, multi-path failover driver, stop I/O to the storage array from each of these hosts. Go to Procedure for Storage Arrays with Two Controllers.
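
The decision table above can be expressed as a small function, for example in a runbook script. This is a sketch only: the function name `initial_actions` and its inputs are hypothetical, and the caller is assumed to already know which hosts lack a multi-path failover driver.

```python
from typing import List


def initial_actions(controller_count: int, hosts_without_failover: List[str]) -> List[str]:
    """Return the first actions for a one- or two-controller storage array."""
    if controller_count == 1:
        return ["Go to Procedure for Storage Arrays with One Controller"]
    if controller_count == 2:
        # I/O must be stopped on every host that is not running a
        # host-based, multi-path failover driver before continuing.
        actions = [f"Stop I/O from host {host}" for host in hosts_without_failover]
        actions.append("Go to Procedure for Storage Arrays with Two Controllers")
        return actions
    raise ValueError("expected a storage array with one or two controllers")
```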

Procedure for Storage Arrays with One Controller

1. Check the replacement part number of the affected controller to ensure that the new controller has the same replacement part number.

   a. On the Hardware tab in the Array Management Window (AMW), select the affected controller.

   b. Identify the "Replacement part number" in the Properties pane.

If...	Then...
The replacement controller has the same part number	Go to step 2.
The replacement controller does NOT have the same part number	Do not continue with the remaining recovery steps and contact your Technical Support Representative.

2. Stop all I/O to this storage array.

3. Turn off power to all power-fan canisters in the tray containing the failed controller.

4. Remove the affected controller. Refer to the Enterprise Management Window to view which management method you are using to manage this storage array.

If...	Then...
You are using In-Band management for ALL hosts attached to this storage array	Go to step 5.
You are using Out-of-Band management for ANY host attached to this storage array	Before you insert a new controller canister into the storage array, you must update the DHCP/BOOTP server for each Out-of-Band managed host so that it will associate the new controller's hardware Ethernet (MAC) address with the DNS/network name and IP address previously assigned to the removed controller. To update the DHCP/BOOTP server, find the entry associated with the removed controller and replace its Ethernet (MAC) address with the new controller's Ethernet (MAC) address. The controller's Ethernet (MAC) address is located on an Ethernet ID label on the controller canister in the form xx.xx.xx.xx.xx.xx. When you are finished, go to step 5.

5. If necessary, insert the battery from the old controller canister into the new replacement controller canister. Make sure at least 1 minute has elapsed and then insert the new (compatible) controller canister firmly into place.

6. Turn on power to all power-fan canisters in the tray. Wait until all drives have completed the spin-up process, and then go to step 7.

7. On the Hardware tab in the AMW, select the affected controller and view the status of the controller in the Properties pane.

8. Click the Recheck button to rerun the Recovery Guru. The failure should no longer appear in the Summary area. If the failure appears again, contact your Technical Support Representative.
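
The DHCP/BOOTP update described for Out-of-Band management above amounts to moving an existing reservation from the old MAC address to the new one. The sketch below illustrates that idea on an in-memory mapping. It assumes reservations are keyed by a colon-separated, lower-case MAC address; the function names and the data layout are hypothetical, and a real DHCP/BOOTP server has its own configuration format.

```python
import re
from typing import Dict, Tuple

# Label format printed on the controller canister: xx.xx.xx.xx.xx.xx
_LABEL_MAC = re.compile(r"^[0-9A-Fa-f]{2}(\.[0-9A-Fa-f]{2}){5}$")


def label_mac_to_colon(label: str) -> str:
    """Convert a label-style MAC (xx.xx.xx.xx.xx.xx) to xx:xx:xx:xx:xx:xx."""
    label = label.strip()
    if not _LABEL_MAC.match(label):
        raise ValueError(f"not a label-style MAC address: {label!r}")
    return label.replace(".", ":").lower()


def swap_reservation(
    reservations: Dict[str, Tuple[str, str]], old_label: str, new_label: str
) -> Dict[str, Tuple[str, str]]:
    """Return a copy in which the old controller's (name, IP) entry
    is keyed by the new controller's MAC address."""
    old_mac = label_mac_to_colon(old_label)
    new_mac = label_mac_to_colon(new_label)
    if old_mac not in reservations:
        raise KeyError(f"no reservation found for {old_mac}")
    updated = dict(reservations)  # leave the caller's mapping untouched
    updated[new_mac] = updated.pop(old_mac)
    return updated
```

The network name and IP address stay the same; only the hardware address they are tied to changes, which is why the management station can keep using its existing entry for the storage array.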

Procedure for Storage Arrays with Two Controllers

1. Place the affected controller offline.

   a. Select the controller on the Hardware tab in the Array Management Window.

   b. Select the Hardware > Controller > Advanced > Place > Offline menu option.

   c. Follow the instructions in the dialog, then click the Yes button.

2. Read all of the following steps before taking any action. The remaining recovery steps will no longer be accessible from the Recovery Guru dialog after you complete step a.

   a. Click the Recheck button to rerun the Recovery Guru.

   b. Select the "Offline Controller" problem that is being reported in the Summary area.

   c. Complete the recovery steps in the "Offline Controller" recovery procedure to replace the affected controller.

NetAppESeries.FailureID_0150_Monitor (UnitMonitor)

Knowledge Base article:

Failed I/O Host Card

Element properties:

Source Code: