Logical Disk Health

Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.Monitor (UnitMonitor)

SUSE Linux Enterprise Server 12 Logical Disk Health Monitor

Knowledge Base article:

Summary

A logical disk (file system) that was previously online is no longer available.

File system health is determined by inspecting the mount table to identify permanent, mounted file systems. If a mounted file system identified in a previous iteration is not included in the current enumeration, it is considered unhealthy.

Configuration

Default Configuration

Parameter             Default Value
Interval (seconds)    300

Overrides can be used to change the parameter values defined above for all instances or for specific instances or groups.
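
As an illustration, an override that changes the polling interval might look like the fragment below. This is a minimal sketch, not part of the shipped management pack: the override ID, the `SLES12` reference alias, and the value of 600 seconds are hypothetical, and the fragment assumes it lives in a custom management pack that references the SLES 12 management pack under that alias.

```xml
<Overrides>
  <!-- Hypothetical override: ID, alias "SLES12", and the 600-second value are examples only -->
  <MonitorConfigurationOverride ID="Example.SLES12.DiskHealth.Interval.Override"
                                Context="SLES12!Microsoft.Linux.SLES.12.LogicalDisk"
                                Enforced="false"
                                Monitor="SLES12!Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.Monitor"
                                Parameter="Interval">
    <Value>600</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

Here Context targets the whole logical disk class, so the override applies to all instances. Pointing Context at a group class instead would limit it to the members of that group.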

Causes

An unhealthy state indicates that a file system has gone offline. One possible cause is that the file system was unmounted.

Resolutions

Inspect the logical disk in Health Explorer. Health Explorer shows the details of the failure and lets you remount the file system with the 'Disk Health Mount' recovery task.

Alternatively, use the logical disk description shown in Health Explorer to mount the file system manually on the affected host with the 'mount' command.

To view file system health you can use the following view:

Disk Health

Element properties:

Target: Microsoft.Linux.SLES.12.LogicalDisk
Parent Monitor: System.Health.AvailabilityState
Category: AvailabilityHealth
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: Microsoft.Unix.WSMan.Status.Filtered.MonitorType
Remotable: True
Accessibility: Public
Alert Message:
  Logical Disk is not online
  The status for logical disk {0} is not healthy.
RunAs: Default

Source Code:

<UnitMonitor ID="Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.Monitor" Accessibility="Public" Target="Microsoft.Linux.SLES.12.LogicalDisk" TypeID="Unix!Microsoft.Unix.WSMan.Status.Filtered.MonitorType" Enabled="true" ParentMonitorID="SystemHealth!System.Health.AvailabilityState">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.AlertMessage">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Property[Type="Unix!Microsoft.Unix.LogicalDevice"]/DeviceID$</AlertParameter1>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState HealthState="Success" MonitorTypeStateID="StatusOK" ID="StatusOK"/>
    <OperationalState HealthState="Error" MonitorTypeStateID="StatusFailed" ID="StatusFailed"/>
  </OperationalStates>
  <Configuration>
    <TargetSystem>$Target/Host/Property[Type="Unix!Microsoft.Unix.Computer"]/NetworkName$</TargetSystem>
    <Uri>http://schemas.microsoft.com/wbem/wscim/1/cim-schema/2/SCX_FileSystemStatisticalInformation?__cimnamespace=root/scx</Uri>
    <Filter/>
    <SplitItems>true</SplitItems>
    <Interval>300</Interval>
    <InstanceName>$Target/Property[Type="Unix!Microsoft.Unix.LogicalDevice"]/DeviceID$</InstanceName>
    <InstanceProperty>/DataItem/WsManData/*[local-name(.)='SCX_FileSystemStatisticalInformation']/*[local-name(.)='Name']</InstanceProperty>
    <Status>/DataItem/WsManData/*[local-name(.)='SCX_FileSystemStatisticalInformation']/*[local-name(.)='IsOnline']</Status>
    <ExpectedStatus>true</ExpectedStatus>
  </Configuration>
</UnitMonitor>
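
The {0} placeholder in the alert message is positional: it is filled from AlertParameter1, which the monitor sets to the DeviceID of the logical disk. The fragment below sketches how an alert message with this ID and text is typically declared in a management pack. It is reconstructed from the ID and strings listed on this page, not copied from the management pack, and the ENU language pack ID is an assumption.

```xml
<!-- Sketch only: declaration reconstructed from the alert message ID and text shown above -->
<Presentation>
  <StringResources>
    <StringResource ID="Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.AlertMessage"/>
  </StringResources>
</Presentation>
<LanguagePacks>
  <!-- Language pack ID "ENU" is assumed -->
  <LanguagePack ID="ENU" IsDefault="true">
    <DisplayStrings>
      <DisplayString ElementID="Microsoft.Linux.SLES.12.LogicalDisk.DiskHealth.AlertMessage">
        <Name>Logical Disk is not online</Name>
        <!-- {0} is replaced with the value of AlertParameter1 (DeviceID) -->
        <Description>The status for logical disk {0} is not healthy.</Description>
      </DisplayString>
    </DisplayStrings>
  </LanguagePack>
</LanguagePacks>
```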