13.1 Resource Manager Overview

For most installations, the Moab Workload Manager uses the services of a resource manager to obtain information about the state of compute resources (nodes) and workload (jobs). Moab also uses the resource manager to manage jobs, passing instructions regarding when, where, and how to start or otherwise manipulate jobs.

Moab can be configured to manage more than one resource manager simultaneously, even resource managers of different types. Using a local queue, jobs may even be migrated from one resource manager to another. However, there are currently limitations on jobs submitted directly to a resource manager rather than to the local queue: such a job can run only within the bounds of the resource manager to which it was submitted.
13.1.1 Scheduler/Resource Manager Interactions

Moab interacts with all resource managers using a common set of commands and objects. Each resource manager interface obtains and translates Moab concepts regarding workload and resources into native resource manager objects, attributes, and commands.

Information on creating a new scheduler resource manager interface can be found in the Adding New Resource Manager Interfaces section.

13.1.1.1 Resource Manager Commands

For many environments, Moab interaction with the resource manager is limited to the following objects and functions:
Using these functions, Moab is able to fully manage workload, resources, and cluster policies. More detailed information about resource manager specific capabilities and limitations for each of these functions can be found in the individual resource manager overviews (LL, PBS, LSF, SGE, Condor, BProc, or WIKI).

Beyond these base functions, other commands exist to support advanced features such as dynamic job support, provisioning, and cluster level resource management.

13.1.1.2 Resource Manager Flow

In general, Moab interacts with resource managers in a sequence of steps each scheduling iteration. These steps are outlined in what follows:
Typically, each step completes before the next step is started. However, the size and complexity of current systems mandate a more advanced parallel approach that provides benefits in the areas of reliability, concurrency, and responsiveness.

Reliability

A number of the resource managers Moab interfaces to were unreliable to some extent. Calls to resource management APIs could exit or crash, taking the entire scheduler with them. With a threaded approach, only the calling thread fails, allowing the master scheduling thread to recover. Additionally, a number of resource manager calls would hang indefinitely, locking up the scheduler. In a threaded environment, the master scheduling thread can likewise detect these hangs and handle them appropriately.

Concurrency

As resource managers grew in size, the duration of each global API query call grew proportionally. In particular, queries that required contacting each node individually became excessively slow as systems grew into the thousands of nodes. A threaded interface allows the scheduler to issue multiple node queries concurrently, resulting in much quicker aggregate RM query times.

Responsiveness

Finally, in the non-threaded serial approach, the user interface was blocked while the scheduler updated various aspects of its workload, resource, and queue state. In a threaded model, the scheduler can continue to respond to queries and other commands even while fresh resource manager state information is being loaded, resulting in much shorter average response times for user commands.

Under the threaded interface, all resource manager information is loaded and processed while the user interface is still active. Average aggregate resource manager API query times are tracked, and new RM updates are launched early enough that the RM query completes before the next scheduling iteration is due to start.
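The reliability and concurrency points above can be illustrated with a minimal Python sketch. It is not Moab's implementation: `query_node`, `load_node_state`, the node names, and the timeout value are all hypothetical, chosen only to show how a master thread might issue per-node queries through a worker pool while isolating failed and hung calls.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def query_node(node):
    """Hypothetical stand-in for a per-node resource manager query."""
    if node == "node-bad":
        raise RuntimeError("RM API call failed")  # simulated crash in the call
    if node == "node-slow":
        time.sleep(1.0)  # simulated hang (kept short so the sketch terminates)
    return {"node": node, "state": "idle"}


def load_node_state(nodes, query=query_node, workers=8, timeout=5.0):
    """Query nodes concurrently; report failures and timeouts separately."""
    results, failures = {}, {}
    pool = ThreadPoolExecutor(max_workers=workers)
    future_to_node = {pool.submit(query, node): node for node in nodes}

    # Wait for the whole batch up to one shared deadline.
    done, not_done = wait(future_to_node, timeout=timeout)

    for future in done:
        node = future_to_node[future]
        error = future.exception()
        if error is None:
            results[node] = future.result()
        else:
            # Only this query failed; the calling thread carries on.
            failures[node] = f"error: {error}"

    for future in not_done:
        # The call is still running or queued: treat it as hung.
        future.cancel()  # only succeeds if the call has not started yet
        failures[future_to_node[future]] = "timeout"

    # Do not block the caller on stuck workers. Note that Python cannot
    # forcibly stop a running thread, so a truly hung call lingers.
    pool.shutdown(wait=False)
    return results, failures
```

Calling `load_node_state(["node01", "node-bad", "node-slow"], timeout=0.2)` returns the state of `node01` in the first dictionary and records the other two nodes in the second, so the caller can decide how to handle each failure.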
Where needed, the loading process is accelerated by a pool of worker threads that issue large numbers of node-specific information queries concurrently. The master thread continues to respond to user commands until all needed resource manager information is loaded and either a scheduling-relevant event has occurred or the scheduling iteration time has arrived. At this point, the updated information is integrated into Moab's state information and scheduling is performed.

13.1.2 Resource Manager Specific Details (Limitations/Special Features)
13.1.3 Synchronizing Conflicting Information

Moab does not trust resource manager information. Node, job, and policy information is reloaded on each iteration and discrepancies are detected. Synchronization issues and allocation conflicts are logged and handled where possible. To assist sites in minimizing stale information and conflicts, a number of policies and parameters are available.
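As a rough illustration of the reload-and-compare idea described above, the following Python sketch compares a previously cached view of objects against a freshly loaded one and lists the differences. The function name `find_discrepancies`, the record layout, and the message wording are assumptions made for this example; they do not reflect Moab's internal data structures or log format.

```python
def find_discrepancies(cached, fresh):
    """Compare cached records with freshly reported ones.

    Both arguments map an object id (for example a job or node name) to a
    dictionary of attributes. Returns a list of (object_id, description).
    """
    issues = []
    for key in sorted(cached.keys() | fresh.keys()):
        if key not in fresh:
            issues.append((key, "missing from resource manager report"))
        elif key not in cached:
            issues.append((key, "new object not previously known"))
        else:
            old_attrs, new_attrs = cached[key], fresh[key]
            for attr in sorted(old_attrs.keys() | new_attrs.keys()):
                old, new = old_attrs.get(attr), new_attrs.get(attr)
                if old != new:
                    issues.append((key, f"{attr}: {old!r} -> {new!r}"))
    return issues
```

A scheduler built along these lines would run the comparison once per iteration, log each reported issue, and then decide per issue whether the fresh report or the cached view should win.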
13.1.4 Evaluating Resource Manager Availability and Performance

Each resource manager is individually tracked and evaluated by Moab. Using the mdiag -R command, a site can determine how a resource manager is configured, how heavily it is loaded, what failures, if any, have occurred in the recent past, and how responsive it is to requests.

See Also
© 2001-2008 Cluster Resources, Incorporated