This monitor checks whether multiple communication problems occurred within a defined timespan while retrieving the 'Temperature Performance Monitor' information of the Fujitsu Out-Of-Band Server.
This monitor checks whether multiple communication problems have been logged within a certain time period when accessing the 'Temperature Performance Monitor' information of the Fujitsu Out-Of-Band Server. This typically indicates a networking problem or an internal resource problem with the iRMC itself.
Note: The health state resets itself to Success (OK) if no further events are logged within a second, separately configurable time period.
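The raise-and-reset behaviour described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the implementation of the monitor type: the class `RepeatedEventMonitor` and its method names are hypothetical, and the default values are taken from the monitor configuration shown on this page (3 events, 7200 seconds, 3000 seconds).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class RepeatedEventMonitor:
    """Toy model of a self-resolving repeated-events monitor (hypothetical)."""

    repeated_event_count: int = 3          # events needed to raise Warning
    interval_seconds: int = 7200           # window the events must fall into
    no_event_interval_seconds: int = 3000  # quiet period that resets the state
    state: str = "Success"
    _events: deque = field(default_factory=deque, repr=False)

    def on_event(self, timestamp: float) -> str:
        """Record one communication-problem event (timestamp in seconds)."""
        self._reset_if_quiet(timestamp)
        self._events.append(timestamp)
        # Forget events that are older than the counting window.
        while timestamp - self._events[0] > self.interval_seconds:
            self._events.popleft()
        if len(self._events) >= self.repeated_event_count:
            self.state = "Warning"
        return self.state

    def on_tick(self, now: float) -> str:
        """Periodic check without a new event; may reset the state."""
        self._reset_if_quiet(now)
        return self.state

    def _reset_if_quiet(self, now: float) -> None:
        # No event for the whole quiet period: back to Success.
        if self._events and now - self._events[-1] >= self.no_event_interval_seconds:
            self._events.clear()
            self.state = "Success"


# Three failed polls, 900 seconds apart, then silence.
monitor = RepeatedEventMonitor()
monitor.on_event(0)
monitor.on_event(900)
print(monitor.on_event(1800))  # Warning
print(monitor.on_tick(4800))   # Success (3000 seconds without an event)
```

The model keeps the two timers separate on purpose: the counting window decides when the Warning is raised, while the quiet period alone decides when it clears.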
The iRMC is no longer answering any requests over the network.
The iRMC Web Server is no longer reliably answering HTTPS requests over the network.
Check if the iRMC can be reached over the network with ping. If not, please contact your Network Administrator.
Check if the iRMC Web Interface can be reached over HTTP or HTTPS. If the iRMC responds to pings but not to HTTP or HTTPS requests, this typically indicates an internal resource problem of the iRMC.
Check if the iRMC can be reached over the network with an IPMI based tool.
If the problem persists, reboot the iRMC (not the Out-Of-Band Server!) with an IPMI tool such as ipmiview32/ipmiview64 from Fujitsu or an open-source IPMI tool such as ipmiutil (see http://ipmiutil.sourceforge.net/), FreeIPMI (see http://www.gnu.org/software/freeipmi/) or ipmitool (see http://sourceforge.net/projects/ipmitool/).
If you do not have access to an IPMI tool, or the iRMC does not answer IPMI requests, you need to A/C fail the server: unplug all power cables and wait at least 60 seconds before reconnecting the server to its power source.
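The first two checks above (ping, then the web interface) can be scripted. The following Python sketch is one possible way to do that and makes several assumptions: the function names are hypothetical, the web check only tests whether a TCP connection to the HTTPS port can be opened (it does not prove the web server answers requests), and the IPMI check is left to one of the IPMI tools named above.

```python
import platform
import socket
import subprocess


def ping_ok(host: str) -> bool:
    """Send one echo request with the system `ping` command."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(
        ["ping", count_flag, "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # Exit code 0 normally means a reply was received; some platforms
    # can report 0 in edge cases, so treat this as a hint only.
    return result.returncode == 0


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def diagnose(pings: bool, web: bool) -> str:
    """Map the two check results to the causes described above."""
    if not pings:
        return "network: iRMC unreachable, contact your network administrator"
    if not web:
        return "iRMC resource problem: answers ping but not HTTP/HTTPS"
    return "reachable: iRMC answers ping and accepts web connections"


def check_irmc(host: str) -> str:
    """Run both checks against one iRMC address (443 = HTTPS port)."""
    return diagnose(ping_ok(host), tcp_port_open(host, 443))


# Example call with a placeholder address (replace with the real iRMC address):
#   print(check_irmc("192.0.2.10"))
```

If the iRMC is configured to serve HTTP only, or on a non-default port, the port number in `check_irmc` needs to be adjusted accordingly.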
Target | Fujitsu.Servers.PRIMERGY.OutOfBand.CommunicationMonitor
Parent Monitor | System.Health.AvailabilityState
Category | AvailabilityHealth
Enabled | True
Alert Generate | True
Alert Severity | MatchMonitorHealth
Alert Priority | Normal
Alert Auto Resolve | True
Monitor Type | Fujitsu.Servers.PRIMERGY.OutOfBand.SelfResolvingRepeatedEventsMonitorType
Remotable | True
Accessibility | Public
Alert Message |
RunAs | Default
<UnitMonitor ID="Fujitsu.Servers.PRIMERGY.OutOfBand.PerfMon.Temperature.RepeatedCommunicationProblem" Accessibility="Public" Enabled="true" Remotable="true" Priority="Normal" ConfirmDelivery="true" Target="FujitsuOutOfBand!Fujitsu.Servers.PRIMERGY.OutOfBand.CommunicationMonitor" ParentMonitorID="Health!System.Health.AvailabilityState" TypeID="FujitsuOutOfBand!Fujitsu.Servers.PRIMERGY.OutOfBand.SelfResolvingRepeatedEventsMonitorType">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Fujitsu.Servers.PRIMERGY.OutOfBand.PerfMon.Temperature.RepeatedCommunicationProblem_AlertMessageResourceID">
    <AlertOnState>Warning</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>MatchMonitorHealth</AlertSeverity>
    <AlertParameters>
      <AlertParameter1>$Target/Host/Property[Type="FTSLIB!Fujitsu.ServerView.Server"]/NetworkName$</AlertParameter1>
      <AlertParameter2>$Data/Context/Count$</AlertParameter2>
      <AlertParameter3>$Data/Context/TimeWindowStart$</AlertParameter3>
      <AlertParameter4>$Data/Context/TimeWindowEnd$</AlertParameter4>
      <AlertParameter5>$Data/Context/TimeFirst$</AlertParameter5>
      <AlertParameter6>$Data/Context/TimeLast$</AlertParameter6>
    </AlertParameters>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="RepeatedEventRaised" MonitorTypeStateID="RepeatedEventRaised" HealthState="Warning"/>
    <OperationalState ID="RepeatedEventReset" MonitorTypeStateID="RepeatedEventReset" HealthState="Success"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>.</ComputerName>
    <LogName>Operations Manager</LogName>
    <FilterExpression>
      <And>
        <Expression>
          <SimpleExpression>
            <ValueExpression>
              <XPathQuery Type="UnsignedInteger">EventDisplayNumber</XPathQuery>
            </ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression>
              <!-- $ERROR_NO_TEMP_INFORMATION -->
              <Value Type="UnsignedInteger">8233</Value>
            </ValueExpression>
          </SimpleExpression>
        </Expression>
        <Expression>
          <SimpleExpression>
            <ValueExpression>
              <!-- IP Address ... -->
              <XPathQuery Type="String">Params/Param[1]</XPathQuery>
            </ValueExpression>
            <Operator>Equal</Operator>
            <ValueExpression>
              <Value Type="String">$Target/Host/Property[Type="FTSLIB!Fujitsu.ServerView.Server"]/NetworkName$</Value>
            </ValueExpression>
          </SimpleExpression>
        </Expression>
      </And>
    </FilterExpression>
    <ConsolidationEventDisplayNumber>EventDisplayNumber</ConsolidationEventDisplayNumber>
    <ConsolidationPublisherName>PublisherName</ConsolidationPublisherName>
    <!-- Default monitoring is every 900 Seconds, report 3 consecutive failed attempts within 2 hours -->
    <RepeatedEventCount>3</RepeatedEventCount>
    <IntervalSeconds>7200</IntervalSeconds>
    <!-- slightly larger than 3 times default monitoring interval -->
    <NoEventIntervalSeconds>3000</NoEventIntervalSeconds>
  </Configuration>
</UnitMonitor>
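To make the relationship between the filter and the timing values in the configuration easier to follow, here is a short Python sketch. It is an illustration only: the helper names are hypothetical, the event is modelled as a plain dictionary, the address `192.0.2.10` is a placeholder, and the 900-second polling interval is taken from the comment in the XML above.

```python
import xml.etree.ElementTree as ET

# Timing values copied from the <Configuration> element of the monitor.
CONFIG = """
<Configuration>
  <RepeatedEventCount>3</RepeatedEventCount>
  <IntervalSeconds>7200</IntervalSeconds>
  <NoEventIntervalSeconds>3000</NoEventIntervalSeconds>
</Configuration>
"""

POLL_SECONDS = 900  # default monitoring interval, per the XML comment


def read_thresholds(xml_text: str) -> dict:
    """Parse the numeric settings into a {tag: int} dictionary."""
    root = ET.fromstring(xml_text)
    return {child.tag: int(child.text) for child in root}


def event_matches(event: dict, network_name: str) -> bool:
    """Mirror of the <And> filter: event number 8233 AND Param[1] == NetworkName.

    XPath positions are 1-based, so Params/Param[1] is list index 0 here.
    """
    return event["EventDisplayNumber"] == 8233 and event["Params"][0] == network_name


def timing_summary(t: dict, poll_seconds: int = POLL_SECONDS) -> dict:
    """Relate the thresholds to the polling interval."""
    three_polls = t["RepeatedEventCount"] * poll_seconds
    return {
        # How many polls fit into the counting window (2 hours).
        "polls_per_window": t["IntervalSeconds"] // poll_seconds,
        # Time covered by three polling intervals.
        "repeated_polls_seconds": three_polls,
        # How much longer the quiet period is than three polling intervals.
        "reset_margin_seconds": t["NoEventIntervalSeconds"] - three_polls,
    }


thresholds = read_thresholds(CONFIG)
print(timing_summary(thresholds))
# {'polls_per_window': 8, 'repeated_polls_seconds': 2700, 'reset_margin_seconds': 300}

sample = {"EventDisplayNumber": 8233, "Params": ["192.0.2.10"]}
print(event_matches(sample, "192.0.2.10"))  # True
```

The numbers confirm the XML comments: three polling intervals cover 2700 seconds, so the 3000-second quiet period is slightly larger, and eight polls fit into the two-hour counting window.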