Dell Server Storage Controller Health
Controller Unit Monitor
If the controller is in the Warning state, the possible causes and resolutions are:
Cause | Resolutions |
The <Controller name> cache has been discarded. | Verify that the battery and memory are functioning properly. |
Single-bit ECC error limit exceeded on the <Controller name> DIMM. | Replace the controller. Contact Dell technical support at support.dell.com. |
Single-bit ECC error on <Controller name>. | No response action is required. |
Single-bit ECC error on <Controller name>. | Replace the Dual In-line Memory Module (DIMM) to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. Refer to hardware documentation for information on replacing the DIMM. |
The NVRAM has corrupted data on <Controller name>. | No response action is required because the controller is taking the required corrective action. However, if this message is generated often (such as during each reboot), replace the controller. |
The <Controller name> NVRAM has corrupt data. | Replace the controller. |
<Controller name> SAS port report: <message> | Make sure the SAS cables attached to the enclosure/backplane are attached securely. Refer to the storage hardware documentation for more information on checking the cables. Contact technical support if the issue persists. |
Physical disks found missing from configuration during boot time on <Controller name>. | Shut down the system. Re-insert the removed physical disks and re-start the system. |
Previous configuration was found completely missing during boot time on <Controller name>. | Shut down the system. Re-insert the removed physical disks and re-start the system. |
The foreign configuration overflow has occurred on <Controller name>. | Import these foreign configurations in multiple attempts. |
Preserved cache detected on <Controller name>. | Check for a foreign configuration and import it if one exists. Check that the enclosures are cabled correctly. |
Component is under a software RAID controller for which health computation is not supported. | Ignore the WARNING state. |
If the controller is in the Critical state, the possible causes and resolutions are:
Cause | Resolutions |
An invalid SAS configuration has been detected on <Controller name>. Details: <error message> | Refer to the storage hardware documentation for information on correct cabling configurations. |
Multi-bit ECC error on <Controller name> DIMM. | Replace the Dual In-line Memory Module (DIMM). The DIMM is a part of the controller battery pack. Refer to the storage hardware documentation for information on replacing the DIMM. You may need to restore data from backup. |
Diagnostic message <message> from <Controller name> | Refer to the storage hardware documentation for more information on the diagnostic test. |
Single-bit ECC error. The <Controller name> DIMM is critically degraded. | Replace the DIMM immediately to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. Refer to the storage hardware documentation for information on replacing the DIMM. |
Single-bit ECC error on <Controller name>. | Replace the DIMM immediately. The DIMM is a part of the controller battery pack. Refer to your hardware documentation for information on replacing the DIMM. |
<Controller name> SAS SMP communications error <args> | There may be a SAS topology error. See the hardware documentation for information on correct SAS topology configurations. There may be problems with the cables such as a loose connection or an invalid cabling configuration. See the Cables Attached Correctly section for more information on checking the cables. See the hardware documentation for information on correct cabling configurations. Verify that the firmware version is supported. |
<Controller name> SAS expander error: <args> | There may be a problem with the enclosure. Verify the health of the enclosure and its components. To verify the health of the enclosure, select the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation mark for enclosure components that are failed or degraded. See the enclosure documentation for more information. |
A configuration command could not be committed to disk on <Controller name> | Re-issue the failed configuration command or try with a different set of physical disks. Contact technical support if the problem persists. |
Component is under a software RAID controller for which health computation is not supported. | Ignore the CRITICAL state. |
Additional information on this issue may be available. Launch the iDRAC Console to debug further.
Target | Dell.ManagedServer.Storage.Controller |
Parent Monitor | System.Health.AvailabilityState |
Category | Custom |
Enabled | False |
Alert Generate | False |
Alert Auto Resolve | True |
Monitor Type | Dell.ManagedServer.ServerHealthCookDownUMT |
Remotable | True |
Accessibility | Public |
RunAs | Default |
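Because the monitor ships with Enabled set to False, it does not evaluate controller health until an override turns it on. The fragment below is a minimal sketch of such an override, assuming it is placed inside the `<Monitoring><Overrides>` section of a custom, unsealed management pack. The override ID and the `DellMonitors` alias (the reference to the pack that defines this monitor) are hypothetical names chosen for illustration; only the monitor ID and target class come from this page.

```xml
<!-- Sketch only: enables the controller health monitor for every instance of the target class. -->
<!-- Assumptions: "DellManagedServer" and "DellMonitors" are reference aliases declared in the
     custom pack's <References> section; the override ID is a hypothetical name. -->
<MonitorPropertyOverride ID="Custom.Dell.Storage.ControllerHealth.Enable"
                         Context="DellManagedServer!Dell.ManagedServer.Storage.Controller"
                         Enforced="false"
                         Monitor="DellMonitors!Dell.ManagedServer.Storage.ControllerHealth"
                         Property="Enabled">
  <Value>true</Value>
</MonitorPropertyOverride>
```

A second `MonitorPropertyOverride` with `Property="GenerateAlert"` would follow the same pattern if alerts are also wanted, since Alert Generate is False by default.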
<UnitMonitor ID="Dell.ManagedServer.Storage.ControllerHealth" Accessibility="Public" Enabled="false" Target="DellManagedServer!Dell.ManagedServer.Storage.Controller" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" TypeID="Dell.ManagedServer.ServerHealthCookDownUMT" Priority="Normal" ConfirmDelivery="false">
  <Category>Custom</Category>
  <OperationalStates>
    <OperationalState ID="Success" MonitorTypeStateID="Success" HealthState="Success"/>
    <OperationalState ID="Critical" MonitorTypeStateID="Error" HealthState="Error"/>
    <OperationalState ID="Warning" MonitorTypeStateID="Warning" HealthState="Warning"/>
  </OperationalStates>
  <Configuration>
    <IntervalSeconds>21600</IntervalSeconds>
    <SyncTime/>
    <TimeoutSeconds>1200</TimeoutSeconds>
    <InstanceIndex>$Target/Property[Type="DellManagedServer!Dell.ManagedServer.Storage.Controller"]/DeviceID$</InstanceIndex>
    <ComponentType>Dell.ManagedServer.Storage.Controller</ComponentType>
    <LogLevel>0</LogLevel>
  </Configuration>
</UnitMonitor>
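The configuration above polls every 21600 seconds (6 hours) and allows each run 1200 seconds (20 minutes) before it times out. The fragment below is a hedged sketch of how the polling interval might be shortened with a configuration override. It assumes that the monitor type exposes `IntervalSeconds` as an overridable parameter, which this page does not confirm, and it reuses the hypothetical `DellMonitors` alias and an illustrative override ID.

```xml
<!-- Sketch only: shortens the polling interval from 6 hours to 1 hour (3600 seconds). -->
<!-- Assumptions: IntervalSeconds is declared overridable by the monitor type;
     "DellMonitors" is a hypothetical alias for the pack that defines the monitor;
     the override ID is a hypothetical name. -->
<MonitorConfigurationOverride ID="Custom.Dell.Storage.ControllerHealth.Interval"
                              Context="DellManagedServer!Dell.ManagedServer.Storage.Controller"
                              Enforced="false"
                              Monitor="DellMonitors!Dell.ManagedServer.Storage.ControllerHealth"
                              Parameter="IntervalSeconds">
  <Value>3600</Value>
</MonitorConfigurationOverride>
```

The monitor type name suggests that the workflow relies on cookdown, so applying the override to the whole class rather than to individual controller instances is the safer choice: cookdown generally requires every instance to share the same data source configuration.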