Scheduling
What: Scheduling is the process of organizing, managing, and allocating tasks to available resources in a system. It determines the sequence and execution time of tasks competing for limited resources like processors, memory, and storage.
Why: The need for scheduling arises from resource scarcity and the diversity of tasks demanding attention in systems. Without scheduling, resource conflicts can lead to inefficiency, unbalanced workloads, and unmet system objectives such as throughput, fairness, and response time.
How: Scheduling uses algorithms and techniques to decide which task gets allocated to a resource at any given time. These techniques consider factors such as task priorities, arrival times, and resource requirements. Different scheduling algorithms optimize for specific goals, such as minimizing average completion time or ensuring fairness among competing tasks.
Objectives:
- Maximizing resource utilization: Ensures that resources like CPU and memory are efficiently used without idle time.
- Improving task throughput: Increases the number of tasks completed within a given time frame.
- Optimizing response time: Reduces the waiting time for tasks, making systems more responsive.
- Ensuring fairness: Balances resource allocation among competing tasks or users, preventing resource monopolization.
Context: Scheduling is applied in various domains:
- Single-Processor Systems: Deciding the order of task execution on a single CPU.
- Distributed Systems: Allocating tasks across multiple nodes in a cluster (e.g., Hadoop).
- Multi-Resource Environments: Managing tasks with diverse requirements like CPU, memory, and network bandwidth (e.g., DRF).
1. Single-Processor Scheduling
What: Single-processor scheduling involves managing the execution of multiple tasks on a system with only one processor. The objective is to determine the optimal sequence for task execution to achieve specific performance goals.
Why: In a single-processor system, only one task can execute at a time. Without an efficient scheduling mechanism, longer tasks can delay shorter ones, leading to increased waiting times and reduced overall system performance. Scheduling ensures efficient processor utilization and fair task management.
How: Various algorithms are employed to decide the order of task execution based on their characteristics, such as arrival time, execution time, or priority. Each algorithm has its strengths and trade-offs:
1.1 FIFO/FCFS (First-In First-Out/First-Come First-Serve)
What: The FIFO/FCFS scheduling algorithm executes tasks in the order they arrive. It is straightforward and ensures no task skips the queue.
Why: Simplicity and predictability make it a foundational algorithm for scheduling, especially in systems where task arrival order aligns with the order of processing.
How: Tasks are maintained in a queue, with earlier-arriving tasks at the front. When the processor is free, the task at the head of the queue is executed. This continues until all tasks are processed.
Key Components
- Mechanism: Each task is processed sequentially based on its arrival time.
- Queue: Tasks are stored in a queue sorted by their arrival times.
- Execution: The first task is executed fully before moving to the next.
- Drawback: Longer tasks block subsequent tasks, leading to higher average completion time (convoy effect).
Illustrative Example
Consider tasks with the following properties:
- Task 1: Arrival at time 0, runtime 10.
- Task 2: Arrival at time 6, runtime 5.
- Task 3: Arrival at time 8, runtime 3.
The processing order will follow the task arrival times: Task 1 → Task 2 → Task 3. Below is the Python implementation:
# Example: FIFO/FCFS
tasks = [(0, 10), (6, 5), (8, 3)] # (arrival_time, run_time)
queue = sorted(tasks, key=lambda x: x[0]) # Sort by arrival time
completion_times = []
current_time = 0
for arrival, run_time in queue:
    if current_time < arrival:
        current_time = arrival  # Advance past idle time until the task arrives
    current_time += run_time
    completion_times.append(current_time)
average_completion_time = sum(completion_times) / len(tasks)
print(f"Average Completion Time: {average_completion_time}")
Analysis
- Completion Times: Task 1 finishes at 10, Task 2 at 15, and Task 3 at 18.
- Average Completion Time: $$\frac{10 + 15 + 18}{3} \approx 14.33$$
Advantages
- Simplicity: Easy to implement and understand.
- Fairness: No task skips the queue.
Drawbacks
- Convoy Effect: A long task at the front delays shorter tasks, increasing overall completion time.
- Not Suitable for Interactive Systems: Poor responsiveness for shorter, time-sensitive tasks.
1.2 Shortest Task First (STF)
What: The Shortest Task First (STF) scheduling algorithm prioritizes tasks with the shortest execution times. It is also referred to as Shortest Job First (SJF) in some contexts.
Why: STF is designed to minimize the average completion time, making it optimal for batch processing systems where task lengths are known in advance. By reducing the completion time for shorter tasks, it increases system throughput and responsiveness.
How: Tasks are maintained in a queue sorted by their execution times in ascending order. The task with the shortest run time is selected for execution whenever the processor is free.
Key Components
- Mechanism: Tasks are prioritized based on their execution time, ensuring shorter tasks are completed first.
- Optimality: Minimizes the average completion time for a set of tasks.
- Queue: Tasks are arranged in ascending order of their execution times.
Illustrative Example
Consider tasks with the following properties:
- Task 1: Runtime 10.
- Task 2: Runtime 5.
- Task 3: Runtime 3.
The processing order will be Task 3 → Task 2 → Task 1. Below is a Python implementation:
# Example: Shortest Task First (STF)
tasks = [(0, 10), (0, 5), (0, 3)]  # (arrival_time, run_time); all tasks arrive at time 0, as in the example
queue = sorted(tasks, key=lambda x: x[1])  # Sort by runtime, shortest first
completion_times = []
current_time = 0
for arrival, run_time in queue:
    if current_time < arrival:
        current_time = arrival  # Advance past idle time until the task arrives
    current_time += run_time
    completion_times.append(current_time)
average_completion_time = sum(completion_times) / len(tasks)
print(f"Average Completion Time: {average_completion_time}")
Analysis
- Completion Times: Task 3 finishes at 3, Task 2 at 8, and Task 1 at 18.
- Average Completion Time: $$\frac{3 + 8 + 18}{3} \approx 9.67$$
Advantages
- Efficiency: Optimal for minimizing average completion time.
- System Throughput: Ensures quicker processing of shorter tasks, improving system responsiveness.
Drawbacks
- Task Starvation: Longer tasks may be delayed indefinitely if shorter tasks continuously arrive.
- Predictive Requirement: Requires prior knowledge of task run times, which is not always feasible in dynamic systems.
1.3 Round-Robin (RR)
What: Round-Robin (RR) is a preemptive scheduling algorithm where tasks are executed in a cyclic order, each receiving a fixed time slice (quantum) for execution. If a task does not complete within its allocated time slice, it is preempted and moved to the end of the queue.
Why: The algorithm is designed for fairness, ensuring all tasks get an equal share of CPU time. This makes it highly suitable for interactive systems where responsiveness and equitable processing are priorities.
How: Tasks are maintained in a circular queue. The processor executes the task at the head of the queue for one quantum, then preempts it and moves to the next task in the queue. This cycle repeats until all tasks are completed.
Key Components
- Mechanism: Tasks are executed in fixed time slices, and their states are saved if preempted for later continuation.
- Fairness: Ensures no task monopolizes the processor, distributing CPU time equitably.
- Applications: Ideal for interactive systems requiring quick responsiveness, such as user interfaces or time-sharing operating systems.
Illustrative Example
Consider tasks with the following properties:
- Task 1: Runtime 10.
- Task 2: Runtime 5.
- Task 3: Runtime 3.
Assume a quantum of 2 time units. The execution order will involve rotating through the tasks in the queue. Below is a Python implementation:
# Example: Round-Robin (RR)
tasks = [(0, 10), (0, 5), (0, 3)]  # (arrival_time, run_time); all tasks arrive at time 0
quantum = 2
completion_times = [0] * len(tasks)
remaining_times = [task[1] for task in tasks]
current_time = 0
while any(rt > 0 for rt in remaining_times):  # While any task is incomplete
    for i, (arrival, _) in enumerate(tasks):
        if remaining_times[i] > 0 and current_time >= arrival:
            execution_time = min(quantum, remaining_times[i])  # One slice, or finish
            current_time += execution_time
            remaining_times[i] -= execution_time
            if remaining_times[i] == 0:
                completion_times[i] = current_time
average_completion_time = sum(completion_times) / len(tasks)
print(f"Average Completion Time: {average_completion_time}")
Analysis
- Execution Order: The processor cycles Task 1 → Task 2 → Task 3 in 2-unit slices until each task completes.
- Completion Times: Task 3 finishes at 11, Task 2 at 14, and Task 1 at 18.
- Average Completion Time: $$\frac{18 + 14 + 11}{3} \approx 14.33$$
Advantages
- Fairness: Prevents starvation by ensuring all tasks receive CPU time.
- Responsiveness: Suitable for systems requiring regular updates, such as user interfaces.
Drawbacks
- Overhead: Frequent context switching can increase computational overhead.
- Quantum Selection: Choosing an appropriate quantum size is critical: too small a quantum increases context-switching overhead, while too large a quantum degrades responsiveness toward FIFO behavior. The sketch below shows how the quantum changes completion times.
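Below is a minimal sketch that reruns the simulation above for several quanta; the helper name rr_average_completion is invented for illustration, and all tasks are assumed to arrive at time 0. Context-switch overhead is deliberately not modeled, so the output only shows how the quantum reshuffles completion times, not the overhead cost of very small quanta.
# Sketch: effect of the quantum on average completion time (overhead ignored)
def rr_average_completion(run_times, quantum):
    remaining = list(run_times)
    completion = [0] * len(run_times)
    current_time = 0
    while any(r > 0 for r in remaining):
        for i, r in enumerate(remaining):
            if r > 0:
                step = min(quantum, r)  # run one slice, or finish the task
                current_time += step
                remaining[i] -= step
                if remaining[i] == 0:
                    completion[i] = current_time
    return sum(completion) / len(run_times)

for q in (1, 2, 5):
    print(q, rr_average_completion([10, 5, 3], q))  # -> approx. 13.33, 14.33, 13.67
In a real system each preemption would also pay a context-switch cost, which is what penalizes very small quanta in practice.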
2. Hadoop Scheduling
What: Hadoop scheduling involves allocating computational and storage resources in a distributed Hadoop cluster. The aim is to manage tasks effectively while ensuring fair and efficient resource usage among multiple tenants (users or jobs).
Why: In a distributed environment, resources like CPU, memory, and storage are shared by multiple jobs. Without proper scheduling, resource conflicts can lead to inefficiencies, unfair allocation, and job starvation. Hadoop scheduling ensures that all jobs get adequate resources and that the cluster operates efficiently.
How: Hadoop uses scheduling algorithms to allocate cluster resources dynamically. These algorithms consider factors such as job priorities, user quotas, and task requirements to optimize resource utilization and fairness. Hadoop's scheduling mechanisms are built on its YARN (Yet Another Resource Negotiator) framework.
Key Objectives of Hadoop Scheduling
- Fair Resource Allocation: Distribute resources equitably among jobs or tenants to avoid resource monopolization.
- High Utilization: Maximize the use of cluster resources, minimizing idle times.
- Scalability: Handle a growing number of jobs and users without degrading performance.
- Elasticity: Dynamically adjust resource allocation based on workload changes.
Challenges
- Multi-Tenancy: Multiple users (tenants) submit jobs, requiring the scheduler to balance competing demands.
- Dynamic Resource Needs: Jobs may have varying and unpredictable resource requirements.
- Fairness vs. Efficiency: Balancing equitable resource allocation with optimal cluster utilization.
Hadoop Scheduling Framework
Hadoop uses two primary schedulers for resource allocation:
- Capacity Scheduler: Focuses on dividing cluster resources into queues, each with a guaranteed portion of the cluster's capacity.
- Fair Scheduler: Ensures equitable resource distribution by allocating equal shares of the cluster to each user or job.
Applications
- Large-scale data processing with multiple jobs (e.g., MapReduce tasks).
- Multi-tenant environments where multiple organizations or users share a single cluster.
- Dynamic workloads requiring real-time adjustment of resource allocation.
2.1 Hadoop Capacity Scheduler
What: The Hadoop Capacity Scheduler is a resource management mechanism in Hadoop that divides cluster resources into queues. Each queue is assigned a specific percentage of the cluster's resources, ensuring predictable resource allocation for different tenants or jobs.
Why: This scheduler is designed for environments with multi-tenant workloads where guaranteed resource allocation and predictability are critical. It ensures fairness while allowing resource elasticity to maximize utilization.
How: Resources are divided into queues, and jobs within each queue are managed based on arrival order (typically FIFO). Administrators configure queues with soft and hard resource limits, which govern resource allocation dynamically.
Key Features
- Queues: Each queue is allocated a guaranteed percentage of cluster resources. For example:
- Queue 1: 80% of resources, for high-priority jobs.
- Queue 2: 20% of resources, for lower-priority jobs.
- Elasticity: Unused resources from one queue can be temporarily reallocated to other queues with higher demand. For instance, if Queue 2 has no active jobs, its resources may be allocated to jobs in Queue 1.
- Preemption: Not supported to avoid task disruptions. Instead, resource reallocation occurs gradually as running tasks complete, ensuring tasks are not interrupted mid-execution.
Mechanism
- Soft Limit: Specifies the guaranteed percentage of cluster resources for a queue. Resources beyond the soft limit can be allocated to other queues if idle.
- Hard Limit: Defines the maximum percentage of resources a queue can occupy, ensuring no queue monopolizes the cluster.
- Hierarchical Queues: Queues can have sub-queues, each inheriting and sharing resources from the parent queue, enabling fine-grained control of resource allocation. The soft/hard-limit mechanism is sketched in code below.
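Below is a minimal Python sketch of this mechanism under illustrative assumptions: the queue names, limits, and demand figures are invented for the example and are not a real YARN capacity-scheduler configuration.
# Sketch: two-pass capacity allocation with soft/hard limits and elasticity
TOTAL = 100  # total cluster capacity, in abstract resource units
queues = {
    "high_priority": {"soft": 0.80, "hard": 1.00, "demand": 95},
    "low_priority": {"soft": 0.20, "hard": 0.50, "demand": 5},
}

def allocate(queues, total):
    # Pass 1: every queue receives up to its guaranteed (soft-limit) share.
    alloc = {name: min(q["demand"], q["soft"] * total) for name, q in queues.items()}
    spare = total - sum(alloc.values())
    # Pass 2 (elasticity): spare capacity flows to queues with unmet demand,
    # but never past a queue's hard limit. Nothing is preempted; capacity
    # flows back only as running tasks finish.
    for name, q in queues.items():
        if spare <= 0:
            break
        extra = min(q["demand"] - alloc[name],        # unmet demand
                    q["hard"] * total - alloc[name],  # room under the hard limit
                    spare)
        if extra > 0:
            alloc[name] += extra
            spare -= extra
    return alloc

print(allocate(queues, TOTAL))  # -> {'high_priority': 95.0, 'low_priority': 5}
Note how the high-priority queue elastically borrows the low-priority queue's idle capacity without exceeding any hard limit and without preempting anything.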
Advantages
- Predictability: Guarantees a minimum resource allocation for each queue, ensuring critical jobs receive necessary resources.
- Elastic Resource Allocation: Dynamically reallocates unused resources, improving overall utilization.
- Avoids Disruption: Preemption-free reallocation prevents task termination and wasted work.
Drawbacks
- Delayed Reallocation: Resource reallocation relies on task completion, which may be slow in high-load scenarios.
- Queue Configuration Complexity: Requires careful configuration of queue limits to balance fairness and efficiency.
Use Case
Ideal for multi-tenant Hadoop environments where resource guarantees and workload isolation are crucial, such as organizations with distinct departments sharing a single cluster for batch and analytical processing.
2.2 Hadoop Fair Scheduler
What: The Hadoop Fair Scheduler allocates resources dynamically in a Hadoop cluster, ensuring each user or job gets a fair share of resources. It focuses on equitable distribution and supports preemption to free resources for underutilized pools.
Why: This scheduler is designed for environments requiring fairness among competing jobs or users. By dynamically adjusting resource allocation, it ensures that all jobs make progress, preventing resource hogging and starvation.
How: Resources are divided into pools, typically one per user or job. Jobs within each pool share resources either equally (fair-share scheduling) or based on arrival order (FIFO). When a pool falls below its fair or minimum share, the scheduler preempts tasks in over-allocated pools to rebalance resource distribution.
Key Features
- Pools: Resources are divided among pools, ensuring fair allocation. For instance:
- Pool A: Allocated 50% of cluster resources.
- Pool B: Allocated 50% of cluster resources.
- Preemption: The scheduler terminates low-priority tasks in overutilized pools to reallocate resources to underutilized pools. This ensures minimum guarantees for all pools are met.
- Configurable Scheduling: Within pools, tasks can follow:
- FIFO: Tasks are executed in the order they arrive.
- Fair-Share: Resources are distributed equally among tasks within a pool.
Mechanism
- Minimum Shares: Pools can have minimum resource guarantees. If these are not met, resources are taken from other pools through preemption.
- Task Selection for Preemption: Recently started tasks are preempted first to minimize wasted computation.
- Idempotent Tasks: Killed tasks can restart without side effects, as they process the same inputs and generate consistent outputs; this is what makes the preemption sketched below safe.
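A minimal, hypothetical sketch of this rebalancing logic follows; the Pool class, its fields, and the workload are illustrative assumptions rather than the Fair Scheduler's actual implementation.
# Sketch: preemption to restore minimum shares, killing the newest tasks first
from dataclasses import dataclass, field

@dataclass
class Pool:
    name: str
    min_share: int                               # guaranteed number of slots
    running: list = field(default_factory=list)  # start times of running tasks
    pending: int = 0                             # tasks waiting for a slot

def rebalance(pools, clock):
    # Preempt tasks in over-share pools until every pool meets its minimum.
    for starved in pools:
        while len(starved.running) < starved.min_share and starved.pending > 0:
            # Donor: the pool holding the most slots beyond its own minimum.
            donor = max(pools, key=lambda p: len(p.running) - p.min_share)
            if len(donor.running) <= donor.min_share:
                break  # no pool can give up a slot without starving itself
            # Kill the donor's most recently started task (least work lost);
            # tasks are idempotent, so it simply rejoins the donor's queue.
            donor.running.remove(max(donor.running))
            donor.pending += 1
            # The freed slot immediately runs one of the starved pool's tasks.
            starved.running.append(clock)
            starved.pending -= 1

a = Pool("A", min_share=2, running=[0, 1, 2, 3])  # A currently holds all slots
b = Pool("B", min_share=2, pending=2)             # B just submitted two tasks
rebalance([a, b], clock=4)
print(len(a.running), len(b.running))  # -> 2 2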
Advantages
- Fairness: Ensures resources are equitably distributed among all users or jobs.
- Resource Reallocation: Dynamic adjustment of resource allocation prevents stagnation in underutilized pools.
- Flexibility: Configurable scheduling policies within pools accommodate diverse workload requirements.
Drawbacks
- Wasted Computation: Preemption can lead to lost work, especially for long-running tasks.
- Complexity: Managing multiple pools and scheduling configurations can be challenging in large-scale clusters.
Use Case
The Hadoop Fair Scheduler is ideal for clusters with diverse, competing workloads where fairness and adaptability are crucial, such as shared research environments or public cloud-based data processing clusters.
3. Dominant-Resource Fair Scheduling (DRF)
What: Dominant-Resource Fair Scheduling (DRF) is a resource allocation strategy designed for environments where tasks or jobs have diverse multi-resource requirements, such as CPU, memory (RAM), disk, or network bandwidth. DRF ensures fairness by focusing on the dominant resource usage of each task.
Why: Traditional scheduling strategies may fail to ensure fairness in multi-resource environments, as they often optimize for a single resource (e.g., CPU). DRF addresses this by accounting for the varying needs of tasks, ensuring that no job disproportionately consumes its most critical resource (dominant resource).
How: DRF allocates resources by equalizing the dominant resource usage percentages across all tasks or jobs. This ensures that each job gets a fair share of the resource it depends on most, leading to balanced and equitable cluster utilization.
3.1 Concept
Key Components
- Dominant Resource: The resource a task consumes the highest percentage of compared to the total cluster capacity. For instance:
- Job 1: Requires 2 CPUs and 8 GB RAM in a cluster with 18 CPUs and 36 GB RAM. Dominant resource = RAM (8/36 = 2/9).
- Job 2: Requires 6 CPUs and 2 GB RAM. Dominant resource = CPU (6/18 = 1/3).
- Fairness Rule: DRF equalizes the usage percentage of the dominant resource across all jobs. In the example above, both Job 1 and Job 2 will get an equal share of their respective dominant resources.
- Applications: DRF is used in distributed systems like Mesos and Hadoop clusters where multi-resource fairness is critical.
Mechanism
- Determine the dominant resource for each job by calculating the highest percentage of total cluster capacity consumed by any resource.
- Allocate tasks to jobs so that the usage percentage of the dominant resource remains equal across all jobs.
- Repeat allocation until all jobs are either complete or cluster resources are exhausted.
Illustrative Example
Consider a cluster with 18 CPUs and 36 GB RAM:
- Job 1: Tasks require 2 CPUs and 8 GB RAM (dominant resource = RAM).
- Job 2: Tasks require 6 CPUs and 2 GB RAM (dominant resource = CPU).
DRF allocates:
- Job 1: 3 tasks, consuming 6 CPUs and 24 GB RAM.
- Job 2: 2 tasks, consuming 12 CPUs and 4 GB RAM.
Both jobs use 2/3 of their dominant resource, satisfying the fairness rule.
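This allocation can be reproduced with a short sketch of progressive filling, the mechanism commonly used to implement DRF: repeatedly grant one task to the job with the smallest dominant share until no further task fits. The cluster and demand figures below are taken from the example above.
# Sketch: DRF via progressive filling
capacity = {"cpu": 18, "ram": 36}
demands = {
    "job1": {"cpu": 2, "ram": 8},  # dominant resource: RAM
    "job2": {"cpu": 6, "ram": 2},  # dominant resource: CPU
}
used = {r: 0 for r in capacity}
tasks = {job: 0 for job in demands}

def dominant_share(job):
    # Fraction of the cluster this job holds of its most-consumed resource.
    return max(tasks[job] * demands[job][r] / capacity[r] for r in capacity)

def fits(job):
    return all(used[r] + demands[job][r] <= capacity[r] for r in capacity)

while True:
    # Grant one task at a time to the job with the lowest dominant share.
    candidates = [j for j in demands if fits(j)]
    if not candidates:
        break
    job = min(candidates, key=dominant_share)
    tasks[job] += 1
    for r in capacity:
        used[r] += demands[job][r]

print(tasks)  # -> {'job1': 3, 'job2': 2}
print(used)   # -> {'cpu': 18, 'ram': 28}
The loop stops once the CPUs are exhausted, leaving some RAM idle, which is exactly the underutilization trade-off noted under Drawbacks below.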
Advantages
- Fairness: Ensures equitable allocation of dominant resources, preventing resource monopolization.
- Multi-Resource Support: Handles diverse resource requirements effectively.
- Strategy-Proof: Prevents tenants from gaming the system by misreporting resource requirements.
Drawbacks
- Complexity: Requires accurate computation of dominant resources and careful allocation to maintain fairness.
- Underutilization: May leave some non-dominant resources underutilized in the interest of fairness.
Applications
- Mesos: A distributed operating system for cloud environments.
- Hadoop: Managing diverse workloads with varying resource requirements.
- Cloud Systems: Fair scheduling in multi-tenant cloud infrastructures.
3.2 Example
Scenario: A cluster with 18 CPUs and 36 GB RAM is shared by two jobs with distinct resource requirements:
- Job 1: Each task requires 2 CPUs and 8 GB RAM.
- Job 2: Each task requires 6 CPUs and 2 GB RAM.
Step 1: Identify Dominant Resources
The dominant resource for each job is the resource that consumes the highest percentage of the cluster's total capacity:
- Job 1:
- CPU usage per task = $$\frac{2}{18} = \frac{1}{9}$$
- RAM usage per task = $$\frac{8}{36} = \frac{2}{9}$$
- Dominant Resource: RAM (2/9).
- Job 2:
- CPU usage per task = $$\frac{6}{18} = \frac{1}{3}$$
- RAM usage per task = $$\frac{2}{36} = \frac{1}{18}$$
- Dominant Resource: CPU (1/3).
Step 2: Equalize Dominant Resource Usage
DRF ensures that each job gets the same percentage of its dominant resource. Let $$n_1$$ and $$n_2$$ be the number of tasks allocated to Job 1 and Job 2:
- Job 1: Dominant resource (RAM) usage = $$\frac{8 n_1}{36}$$.
- Job 2: Dominant resource (CPU) usage = $$\frac{6 n_2}{18}$$.
Set the dominant resource usages equal:
$$\frac{8 n_1}{36} = \frac{6 n_2}{18}$$
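Simplifying yields the task ratio:
$$\frac{2 n_1}{9} = \frac{n_2}{3} \;\Rightarrow\; n_1 = \frac{3}{2}\, n_2$$
so for every 2 tasks granted to Job 2, Job 1 receives 3.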
Step 3: Allocate Tasks
Solving for the task allocation: the largest whole-number counts satisfying $$n_1 = \frac{3}{2} n_2$$ that still fit within the cluster's 18 CPUs and 36 GB RAM are $$n_1 = 3$$ and $$n_2 = 2$$:
- Job 1: Allocates 3 tasks, consuming:
- CPU = $$3 \times 2 = 6$$
- RAM = $$3 \times 8 = 24$$
- Job 2: Allocates 2 tasks, consuming:
- CPU = $$2 \times 6 = 12$$
- RAM = $$2 \times 2 = 4$$
Step 4: Verify Fairness
Each job uses 2/3 of its dominant resource:
- Job 1: RAM usage = $$\frac{24}{36} = \frac{2}{3}$$.
- Job 2: CPU usage = $$\frac{12}{18} = \frac{2}{3}$$.
This ensures fairness as both jobs consume equal shares of their respective dominant resources.
Step 5: Check Cluster Utilization
Total resource consumption:
- CPU: $$6 + 12 = 18$$ (fully utilized).
- RAM: $$24 + 4 = 28$$ GB of 36 GB (underutilized, as fairness prioritizes each job's dominant resource); the quick check below confirms these totals.
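These totals can be double-checked with a few lines of Python (a quick sanity check, not scheduler code):
# Sanity check of the final DRF allocation from Steps 3-5
cpus, ram = 18, 36       # cluster capacity
n1, n2 = 3, 2            # tasks allocated to Job 1 and Job 2
print(8 * n1 / ram)      # Job 1's dominant (RAM) share -> 0.666...
print(6 * n2 / cpus)     # Job 2's dominant (CPU) share -> 0.666...
print(2 * n1 + 6 * n2)   # CPU consumed -> 18 (fully utilized)
print(8 * n1 + 2 * n2)   # RAM consumed -> 28 of 36 GB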
Conclusion
This allocation satisfies DRF's fairness rule, ensuring both jobs receive an equitable share of their critical resources while efficiently using the cluster's capabilities.