Job Scheduler Service

Microsoft.HPC.2008R2.Monitor.JobScheduler.Availability.JobSchedulerService (UnitMonitor)

Job scheduler service availability monitor for HPC 2008 R2

Knowledge Base article:

Summary

This monitor tracks the status of the HPC Job Scheduler Service. When this service is stopped, no new jobs can be submitted, and no queued jobs or tasks will be started. Tasks that are already running will complete.

Configuration

In a cluster configured for high availability of the head node, the HPC Job Scheduler Service is not configured to start automatically on the head nodes. Because the “Alert only if service startup type is automatic” option is set to True by default in the management pack, the monitor does not raise alerts for the service in this configuration. To monitor the HPC Job Scheduler Service on a failover cluster by using the management pack, you must manually change the “Alert only if service startup type is automatic” option to False on the current active head node. You can also use the monitoring tools in Failover Cluster Manager to monitor the service.
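
The option change described above is normally made as an override in the Operations Manager console. As a rough sketch only, the same override could be expressed in a custom management pack along the following lines. It assumes that the “Alert only if service startup type is automatic” option corresponds to the CheckStartupType parameter shown in the monitor source, and that the custom management pack references the HPC management pack under the alias HPC. The alias and the override ID are hypothetical names chosen for illustration.

```xml
<Overrides>
  <!-- Hypothetical override ID and "HPC" reference alias; adjust both to match your own management pack. -->
  <!-- Assumption: CheckStartupType is the overridable parameter behind the option named in the text. -->
  <MonitorConfigurationOverride ID="Custom.HPC.2008R2.Override.JobSchedulerService.CheckStartupType"
                                Context="HPC!Microsoft.HPC.2008R2.JobScheduler"
                                Enforced="false"
                                Monitor="HPC!Microsoft.HPC.2008R2.Monitor.JobScheduler.Availability.JobSchedulerService"
                                Parameter="CheckStartupType">
    <!-- false: alert even when the service startup type is not automatic -->
    <Value>false</Value>
  </MonitorConfigurationOverride>
</Overrides>
```

As written, this sketch applies to every object of the target class. Limiting it to the current active head node would need a ContextInstance attribute carrying that object's ID, which is environment specific and is therefore left out here.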

Causes

This error can be caused by any of the following:

Resolutions

To troubleshoot and fix this problem:

If the preceding steps do not resolve the problem, uninstall and reinstall the HPC Pack on the head node.

Additional

A recovery task will be run automatically to restart the service, so you may find the service keeps restarting while you are trying to stop it. There are a couple of options to avoid this:

For more information about high availability head nodes, see “Configuring Windows HPC Server 2008 R2 for High Availability of the Head Node” (http://go.microsoft.com/fwlink/?LinkId=198285).

Element properties:

Target: Microsoft.HPC.2008R2.JobScheduler
Parent Monitor: System.Health.AvailabilityState
Category: AvailabilityHealth
Enabled: True
Alert Generate: True
Alert Severity: Error
Alert Priority: Normal
Alert Auto Resolve: True
Monitor Type: Microsoft.Windows.CheckNTServiceStateMonitorType
Remotable: True
Accessibility: Public
Alert Message: HPC Job Scheduler Service is not running
    Please see the alert context for details.
RunAs: Microsoft.HPC.RunAsProfile.AdminActionAccount

Source Code:

<UnitMonitor ID="Microsoft.HPC.2008R2.Monitor.JobScheduler.Availability.JobSchedulerService" Accessibility="Public" Enabled="true" Target="Microsoft.HPC.2008R2.JobScheduler" ParentMonitorID="Health!System.Health.AvailabilityState" Remotable="true" Priority="Normal" RunAs="HPCLibrary!Microsoft.HPC.RunAsProfile.AdminActionAccount" TypeID="Windows!Microsoft.Windows.CheckNTServiceStateMonitorType" ConfirmDelivery="false">
  <Category>AvailabilityHealth</Category>
  <AlertSettings AlertMessage="Microsoft.HPC.2008R2.Monitor.JobScheduler.Availability.JobSchedulerService_AlertMessageResourceID">
    <AlertOnState>Error</AlertOnState>
    <AutoResolve>true</AutoResolve>
    <AlertPriority>Normal</AlertPriority>
    <AlertSeverity>Error</AlertSeverity>
  </AlertSettings>
  <OperationalStates>
    <OperationalState ID="Running" MonitorTypeStateID="Running" HealthState="Success"/>
    <OperationalState ID="NotRunning" MonitorTypeStateID="NotRunning" HealthState="Error"/>
  </OperationalStates>
  <Configuration>
    <ComputerName>$Target/Host/Host/Property[Type="Windows!Microsoft.Windows.Computer"]/NetworkName$</ComputerName>
    <ServiceName>HpcScheduler</ServiceName>
    <CheckStartupType/>
  </Configuration>
</UnitMonitor>