A Dynamic Space Sharing Method For Resource Management Gabriel Mateescu

A Dynamic Space Sharing Method For Resource Management
Gabriel Mateescu
Research Computing Support Group, National Research Council Canada
Gabriel.Mateescu@nrc.ca
HPCS 2001 presentation, Windsor, Ontario, June 19-20, 2001

Agenda
• Motivation
• Outline of the Approach
• Job Taxonomy
• Pseudocode
• Evaluation

Motivation
• Continuously increasing demand for computation resources is met by clusters and by distributed- or shared-memory supercomputers
• Dual objective:
  • optimize resource utilization, high throughput
  • quality of service to users: low turn-around time
• Batch scheduling based on space sharing and static partitioning has limited scalability
• Main contribution: provide a method for achieving both high resource utilization and low turn-around time

The Problem
• Parallel supercomputer/cluster shared among a number of divisions
• Dual objective:
  • optimize resource utilization, high throughput
  • quality of service to users: low turn-around time
• Batch scheduling based on space sharing and static partitioning has limited scalability
• Main contribution: provide a method for achieving both high resource utilization and low turn-around time

Example Parallel Computer
[Figure: a parallel computer shared by the Biotech Department and the Physics Department. Labels: node (CPU + memory), partition boundary, job requests.]

Outline of the Approach
• Dynamic space sharing method for batch job scheduling
• Partition the resources into a set of dedicated queues
• Dedicated queues own resources
• Free resources can be borrowed by pending jobs for which there are not enough per-queue resources
• Borrowed resources are grouped in a shared queue
• Borrowed resources can be reclaimed by the lending queue
• Reclaiming is done by checkpointing the jobs which hold borrowed resources

Outline
• The sum of the resources assigned to jobs in a dedicated queue does not exceed the resource limits of the queue
• The difference between the total amount of resources and the resources currently assigned to the dedicated queues represents an opportunity for scheduling jobs for which there are not enough per-queue resources
• Each user belongs to a group, and each group is authorized to submit jobs to some dedicated queues as well as to the shared queue
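The two resource invariants above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the names `DedicatedQueue`, `lendable`, and the resource keys `cpu` and `mem_gb` are hypothetical, and resources are modeled as simple per-type counters.

```python
from dataclasses import dataclass, field

# A resource vector maps a resource type to a count, e.g. {"cpu": 4, "mem_gb": 8}.
# The keys used in this sketch are illustrative assumptions.


def add(a, b):
    """Component-wise sum of two resource vectors."""
    return {k: a.get(k, 0) + b.get(k, 0) for k in a.keys() | b.keys()}


def sub(a, b):
    """Component-wise difference a - b of two resource vectors."""
    return {k: a.get(k, 0) - b.get(k, 0) for k in a.keys() | b.keys()}


def fits(need, avail):
    """True if every requested amount is available."""
    return all(avail.get(k, 0) >= v for k, v in need.items())


@dataclass
class DedicatedQueue:
    name: str
    owned: dict                                     # resource limits of the queue
    allocated: dict = field(default_factory=dict)   # held by the queue's own jobs

    def free(self):
        return sub(self.owned, self.allocated)

    def allocate(self, need):
        # Invariant 1: resources assigned to the queue's jobs never exceed its limits.
        if not fits(need, self.free()):
            raise ValueError(f"{self.name}: request exceeds the queue's limits")
        self.allocated = add(self.allocated, need)


def lendable(queues, borrowed):
    """Invariant 2: total resources minus those assigned to dedicated queues,
    minus what the shared queue has already borrowed, is what can still be lent."""
    total_free = {}
    for q in queues:
        total_free = add(total_free, q.free())
    return sub(total_free, borrowed)
```

For example, with two queues owning 4 CPUs and 8 GB each, one of which has allocated 3 CPUs and 2 GB, and a shared queue already borrowing 2 CPUs, `lendable` reports 3 CPUs and 14 GB still available for jobs that do not fit in their own queue.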

Dedicated Resources
[Figure: Resource 1 and Resource 2 partitioned between Queue 1 and Queue 2.]
• Job 1 in queue 1: 1 x resource 1 + 2 x resource 2
• Job 2 in queue 2: 2 x resource 1 + 1 x resource 2

Borrowed Resources
[Figure: Resource 1 and Resource 2 partitioned between Queue 1 and Queue 2.]
• Job 3 in queue 1: 1 x resource 1 + 2 x resource 2

Resource Reclaiming
[Figure: Resource 1 and Resource 2 partitioned between Queue 1 and Queue 2.]
• Job 4 in queue 2: 1 x resource 1 + 2 x resource 2

Paths of a Job
[Figure: paths of a job through the system. Labels: new job, submit queue, dedicated queues, shared queue, finished job.]

Job Taxonomy
• master job: has resource requirements which can be satisfied from the free resources available to its queue
• fittable job: has resource requirements which can be satisfied by reclaiming some resources borrowed by the shared queue
• movable job: has resource requirements which exceed the amount of resources owned by its queue and not already allocated to jobs; however, the requirements may be satisfied from the system-wide free resources
• blocked job: there are not enough resources, either owned by its queue or available in other queues, to satisfy the job's requirements, or the job is not checkpointable
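The taxonomy above amounts to a cascade of resource checks, which can be sketched as follows. The function name `classify`, its parameters, and the order in which the checks are applied are assumptions made for illustration. In particular, the sketch reads "or the job is not checkpointable" as applying to jobs that would otherwise run on borrowed resources, since those are the jobs that may need to be preempted.

```python
from enum import Enum


class JobType(Enum):
    MASTER = "master"
    FITTABLE = "fittable"
    MOVABLE = "movable"
    BLOCKED = "blocked"


def fits(need, avail):
    """True if every requested amount is available."""
    return all(avail.get(k, 0) >= v for k, v in need.items())


def classify(need, queue_free, reclaimable, system_free, checkpointable=True):
    """Classify a pending job (hypothetical helper, not from the slides).

    need         resources requested by the job
    queue_free   resources owned by the job's queue and currently free
    reclaimable  resources lent to the shared queue that the queue could reclaim
    system_free  free resources system-wide
    """
    if fits(need, queue_free):
        return JobType.MASTER
    after_reclaim = {
        k: queue_free.get(k, 0) + reclaimable.get(k, 0)
        for k in set(queue_free) | set(reclaimable)
    }
    if fits(need, after_reclaim):
        return JobType.FITTABLE
    # Assumption: a job running on borrowed resources must be checkpointable,
    # because its resources can be reclaimed at any time.
    if checkpointable and fits(need, system_free):
        return JobType.MOVABLE
    return JobType.BLOCKED
```

A 4-CPU job in a queue with 2 free CPUs and 2 CPUs out on loan is fittable; the same job with nothing on loan but 6 CPUs free system-wide is movable, provided it can be checkpointed.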

Job State Transition Diagram
[Figure: job state transition diagram. States: new job, pending job, master job, fittable job, movable job, slave job, running master, running slave, finished job. Transition labels: enough per-queue resources, preempt slave, start job.]

Pseudocode

scheduler ( ) {
    queues = sort_dedicated_queues();
    while ( scheduling_is_on ) {
        /* route newly submitted jobs to their dedicated queues */
        new_jobs = get_jobs_in_submit_queue();
        dispatch_to_dedicated_queue(new_jobs);
        foreach queue in ( queues ) {
            jobs = get_pending_jobs(queue);
            order_jobs(jobs);
            foreach job in ( jobs ) {
                type = get_job_type(job);
                resources = get_job_resources(job);
                if ( type == master || type == fittable ) {
                    if ( type == fittable ) {
                        /* checkpoint slave jobs that hold borrowed resources */
                        victim_jobs = reclaim(resources);
                        re_queue(victim_jobs);
                    }
                    allocate_resources(resources);
                    start_job(job);
                } else if ( type == movable ) {
                    ok = system_resources(resources);
                    if ( ok ) {
                        move_to_shared_queue(job);
                        start_job(job);
                    }
                }
            }
        }
    }
}
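One pass of the scheduler loop can be made concrete with a small Python sketch. It rests on several simplifying assumptions that are not in the slides: resources form a single system-wide pool, checkpointing is modeled as flipping a job back to "pending", slave jobs are preempted most-recently-started first, and jobs are visited in arrival order. The names `Job`, `Queue`, and `schedule_pass` are hypothetical.

```python
from dataclasses import dataclass, field


def fits(need, avail):
    return all(avail.get(k, 0) >= v for k, v in need.items())


def add(a, b, sign=1):
    """Component-wise a + sign * b."""
    return {k: a.get(k, 0) + sign * b.get(k, 0) for k in set(a) | set(b)}


@dataclass
class Job:
    name: str
    home: str                 # name of the dedicated queue the job was dispatched to
    need: dict
    state: str = "pending"    # pending | master | slave


@dataclass
class Queue:
    name: str
    owned: dict
    pending: list = field(default_factory=list)
    used: dict = field(default_factory=dict)   # held by this queue's master jobs


def schedule_pass(queues, slaves):
    """One iteration of the while-loop body; `slaves` is the shared queue."""
    by_name = {q.name: q for q in queues}
    total = {}
    for q in queues:
        total = add(total, q.owned)

    def system_free():
        busy = {}
        for q in queues:
            busy = add(busy, q.used)
        for s in slaves:
            busy = add(busy, s.need)
        return add(total, busy, sign=-1)

    for q in queues:
        for job in list(q.pending):           # copy: the pending list changes below
            queue_free = add(q.owned, q.used, sign=-1)
            if fits(job.need, queue_free):
                # Master if the pool already has room; fittable if slaves must go.
                while slaves and not fits(job.need, system_free()):
                    victim = slaves.pop()     # assumption: newest slave first
                    victim.state = "pending"  # stands in for checkpoint + re-queue
                    by_name[victim.home].pending.append(victim)
                q.used = add(q.used, job.need)
                q.pending.remove(job)
                job.state = "master"
            elif fits(job.need, system_free()):
                # Movable: run in the shared queue on borrowed resources.
                q.pending.remove(job)
                slaves.append(job)
                job.state = "slave"
            # Otherwise the job is blocked and stays pending.
```

Under this model a job that fits within its queue's limits can always start: because each queue's masters stay within the queue's own limits, preempting slaves is always enough to free the needed resources.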

Job Statistics
• SGI Origin 2000 with 108 CPUs and 48 GB of main memory
• Resources are partitioned among six dedicated queues defined for six groups of users
• Average system load, including short interactive jobs: ~94
• Total jobs running: 33 (CPUs allocated = 103, memory = 39 GB)
• Slave jobs running: 11 (CPUs allocated = 22, memory = 11 GB)
• Jobs waiting: 3
• Checkpoints/day per slave job: ~1.5

Advantages
• Combine the advantages of space sharing and time sharing scheduling
• Space sharing gives resource allocation for the duration of the job and predictable execution time
• Time sharing improves resource utilization
• We combine space sharing with job preemption
• Selection of which jobs are preempted is made in terms of the current usage of the resources, rather than based on a static job priority

Evaluation
• Complexity: O(J N R + J log J), where J = number of pending or slave jobs, N = number of supernodes, R = number of types of resources
• Reduce the waiting time of the jobs by harnessing resources not used by the dedicated queues
• Reduce job execution time by reserving resources for all but the slave jobs
• No job fitting in a dedicated queue can be prevented from running by a slave job
