1. Introduction and Basics
Mutual exclusion is a fundamental concept in distributed systems, ensuring that multiple processes do not simultaneously access shared resources or critical sections (CS). This guarantees data consistency and correctness by avoiding race conditions.
1.1 Why Mutual Exclusion?
- Bank Example:
What: Consider a scenario where two ATMs attempt to update the same bank account balance simultaneously.
Why: Without mutual exclusion, both ATMs might read the same initial balance and then write conflicting updates, so one update is lost (e.g., a correct total of $21,000 is overwritten with $11,000).
How: By ensuring mutual exclusion, only one ATM performs the read-update-write at a time, preserving the correct final value (a minimal sketch of this race appears after this list).
- Distributed File Systems:
What: File systems shared across multiple machines must lock files to prevent simultaneous modifications.
Why: Concurrent writes to the same file can corrupt its integrity, making it unreadable or incorrect.
How: Mutual exclusion mechanisms ensure that only one process accesses a file at a time, maintaining its consistency.
- Critical Section Problem:
What: A critical section is a part of code that modifies shared resources and must not be accessed by multiple processes simultaneously.
Why: Without control, multiple processes may execute conflicting operations, leading to unpredictable results.
How: Mutual exclusion protocols manage entry and exit to/from the critical section, ensuring controlled access.
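To make the bank example above concrete, here is a minimal sketch of the lost-update race in Python. The initial balance and deposit amounts are illustrative (chosen so the correct total is $21,000), and threading.Lock stands in for whatever mutual-exclusion mechanism the system actually uses.

```python
import threading, time

balance = 10_000          # shared "account balance"
lock = threading.Lock()   # mutual-exclusion primitive guarding the critical section

def deposit(amount, use_lock):
    """Read-modify-write on the shared balance (the critical section)."""
    global balance
    if use_lock:
        with lock:                       # only one thread at a time
            current = balance
            time.sleep(0.01)             # widen the race window; still safe under the lock
            balance = current + amount
    else:
        current = balance                # both threads may read the same value...
        time.sleep(0.01)                 # ...interleave here...
        balance = current + amount       # ...and one update overwrites the other

def run(use_lock):
    global balance
    balance = 10_000
    threads = [threading.Thread(target=deposit, args=(amt, use_lock))
               for amt in (1_000, 10_000)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return balance

print("with lock   :", run(True))    # always 21000
print("without lock:", run(False))   # typically 11000 or 20000: a lost update
```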
1.2 Requirements
- Safety:
What: Ensures that at most one process is in the critical section at any time.
Why: This prevents data corruption caused by simultaneous access.
How: Mutual exclusion algorithms guarantee a single process's access by using mechanisms like locks or tokens.
- Liveness:
What: Guarantees that every request to access the critical section will eventually be granted.
Why: Prevents starvation, where some processes are indefinitely denied access.
How: Algorithms ensure fairness and maintain a queue or logical ordering for requests.
- Ordering:
What: Ensures that requests are granted access in the order they are made.
Why: This optional property prevents newer requests from bypassing older ones, which can violate logical consistency.
How: Mechanisms like timestamps or causal ordering prioritize requests sequentially.
2. Distributed Mutual Exclusion
Unlike processes running under a single operating system, processes in a distributed system cannot rely on shared memory or global variables. They implement mutual exclusion by passing messages between interconnected processes.
2.1 Centralized Algorithm
- Master Process:
What: A central coordinator, called the master process, manages access to the critical section (CS) using a token system.
Why: This simplifies coordination as only the master process needs to handle requests and responses.
How:
- Processes send a request to the master when they want to enter the CS.
- The master grants access by sending the token to the requesting process, following a First-In-First-Out (FIFO) order.
- Upon exiting the CS, the process returns the token to the master (a sketch of this scheme appears at the end of this subsection).
- Analysis:
- Bandwidth:
What: Measures the number of messages exchanged per operation.
Why: Helps evaluate the communication cost of the algorithm.
How:
- enter(): 1 message to the master (request) + 1 message from the master (token) = 2 messages.
- exit(): 1 message to the master (returning the token) = 1 message.
- Issues:
Single Point of Failure (SPoF): If the master fails, the entire system halts.
Bottleneck: The master process can become overwhelmed with requests, degrading performance.
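A minimal sketch of the centralized scheme, with direct method calls standing in for the request/token/release messages; the class and method names are illustrative, not a fixed API.

```python
from collections import deque

class Coordinator:
    """Central master: holds the token and a FIFO queue of waiting processes."""
    def __init__(self):
        self.token_free = True
        self.waiting = deque()       # FIFO queue of requesting process IDs
        self.granted = []            # record of "token sent to ..." messages

    def request(self, pid):
        """enter(): one request message in; one token message back (if free)."""
        if self.token_free:
            self.token_free = False
            self.granted.append(pid)
        else:
            self.waiting.append(pid)   # queued in FIFO order

    def release(self, pid):
        """exit(): the holder returns the token; pass it straight to the next waiter."""
        if self.waiting:
            self.granted.append(self.waiting.popleft())
        else:
            self.token_free = True

# Usage: P1 and P2 both request; P2 waits until P1 releases.
master = Coordinator()
master.request("P1"); master.request("P2")
print(master.granted)      # ['P1']         P1 holds the token, P2 is queued
master.release("P1")
print(master.granted)      # ['P1', 'P2']
```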
2.2 Ring-Based Algorithm
- Token Ring:
What: Processes are arranged in a logical ring, and a single token circulates among them.
Why: Eliminates the need for a central coordinator, distributing the load evenly.
How:
- Each process can only enter the CS if it holds the token.
- When done, it passes the token to its successor in the ring.
- If a process receives the token but does not need the CS, it immediately forwards the token to its successor (see the sketch at the end of this subsection).
- Analysis:
- Client Delay:
What: Time taken for a requesting process to acquire the token when no other process is in the CS.
How:
- Best case: The requesting process already holds the token, so delay is 0.
- Worst case: The token must traverse N hops to return to the requesting process.
- Synchronization Delay:
What: Time between one process exiting the CS and the next process entering.
How:
- Best case: Token passes to the immediate successor, requiring 1 hop.
- Worst case: Token traverses N-1 hops to reach the next requesting process.
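A minimal sketch of the ring, simulated in a single loop; the wants_cs flags and the hop counter are illustrative.

```python
class RingProcess:
    """One node in the logical ring; it may use the CS only while holding the token."""
    def __init__(self, pid, wants_cs=False):
        self.pid = pid
        self.wants_cs = wants_cs

    def on_token(self):
        """Use the CS if this node wants it; otherwise the token is simply forwarded."""
        if self.wants_cs:
            print(f"P{self.pid} enters and then exits the CS")
            self.wants_cs = False

def circulate(ring, start=0, hops=None):
    """Pass the token around the ring; each hop is one message to the successor."""
    hops = hops if hops is not None else len(ring)
    for i in range(hops):
        ring[(start + i) % len(ring)].on_token()
    return hops

ring = [RingProcess(0), RingProcess(1, wants_cs=True),
        RingProcess(2), RingProcess(3, wants_cs=True)]
print("hops:", circulate(ring))   # P1 and P3 each enter once as the token passes; 4 hops
```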
3. Ricart-Agrawala's Algorithm
The Ricart-Agrawala algorithm is a token-less approach to mutual exclusion in distributed systems. It leverages the concepts of causality and Lamport timestamps to ensure correctness, safety, and fairness without relying on a centralized master or token passing.
3.1 Steps
- Step 1: Multicasting a Request
What: When a process, Pi, wants to enter the critical section (CS), it generates a request message.
Why: To inform all other processes about its intention to enter the CS.
How:
- Attach the current Lamport timestamp (Ti) and the process ID (Pi) to the request message: <Ti, Pi>.
- Multicast this request to all other N-1 processes in the system.
- Step 2: Waiting for Replies
What: After sending the request, Pi waits to receive Reply messages from all N-1 other processes.
Why: The replies indicate that no other process has a higher-priority request for the CS.
How:
- Each process receiving a request checks its state:
- If it is in the CS or has a higher-priority request (based on timestamp and process ID), it delays sending the reply and queues Pi’s request.
- If neither condition applies, it immediately sends a reply to Pi.
- Step 3: Entering the Critical Section
What: Once Pi has received replies from all N-1 processes, it can safely enter the CS.
Why: This ensures that no other process with higher priority is waiting for the CS.
How: Pi transitions to the critical section state and executes its critical code.
- Step 4: Exiting the Critical Section
What: When Pi finishes its work in the CS, it must allow other processes to proceed.
Why: To ensure fairness and prevent starvation.
How:
- Pi sends Reply messages to all the requests it had deferred and queued earlier.
- Pi transitions to the Released state (the sketch below ties Steps 1-4 together).
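A minimal sketch tying Steps 1-4 together; direct method calls stand in for multicast and replies, the (timestamp, process-ID) tuple comparison implements the priority rule described above, and all class and attribute names are illustrative.

```python
RELEASED, WANTED, HELD = "Released", "Wanted", "Held"

class RAProcess:
    """One Ricart-Agrawala participant; `network` maps pid -> RAProcess."""
    def __init__(self, pid, network):
        self.pid, self.network = pid, network
        self.clock = 0                    # Lamport clock
        self.state = RELEASED
        self.request_ts = None            # (T_i, P_i) of our outstanding request
        self.replies = set()
        self.deferred = []                # requests we have queued (Step 2)

    def request_cs(self):                 # Step 1: multicast <T_i, P_i>
        self.clock += 1
        self.state, self.request_ts = WANTED, (self.clock, self.pid)
        self.replies = set()
        for pid, proc in self.network.items():
            if pid != self.pid:
                proc.on_request(self.request_ts, self.pid)

    def on_request(self, ts, sender):     # Step 2: reply now or defer
        self.clock = max(self.clock, ts[0]) + 1
        ours_first = self.state == HELD or (self.state == WANTED
                                            and self.request_ts < ts)
        if ours_first:
            self.deferred.append(sender)          # defer the reply
        else:
            self.network[sender].on_reply(self.pid)

    def on_reply(self, sender):           # Step 3: enter on N-1 replies
        self.replies.add(sender)
        if self.state == WANTED and len(self.replies) == len(self.network) - 1:
            self.state = HELD

    def release_cs(self):                 # Step 4: answer deferred requests
        self.state, self.request_ts = RELEASED, None
        for pid in self.deferred:
            self.network[pid].on_reply(self.pid)
        self.deferred = []

net = {}
net.update({i: RAProcess(i, net) for i in range(3)})
net[0].request_cs(); net[1].request_cs()
print(net[0].state, net[1].state)   # Held Wanted  (P0's request has the earlier timestamp)
net[0].release_cs()
print(net[1].state)                 # Held
```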
3.2 Safety Proof
What: The algorithm ensures that at most one process is in the CS at a time.
Why: It avoids conflicting access to shared resources by leveraging Lamport timestamps and reply acknowledgments.
How:
- If two processes, Pi and Pj, simultaneously attempt to access the CS, they exchange requests.
- The process with the earlier timestamp (or lower process ID in case of a tie) receives replies first and enters the CS.
- The other process waits until it receives a reply from the first, ensuring mutual exclusion.
- For Pi and Pj to be in the CS at the same time, each would have to have replied to the other's request before entering, which requires both (Ti, Pi) < (Tj, Pj) and (Tj, Pj) < (Ti, Pi); this contradicts the total order that Lamport timestamps with process-ID tie-breaking impose (formalized briefly below).
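Stated compactly (a sketch of the standard argument, with < denoting the total order on (timestamp, process-ID) pairs):

```latex
P_i, P_j \in \mathrm{CS} \text{ simultaneously}
\;\Rightarrow\; (T_i, P_i) < (T_j, P_j) \ \text{ and }\ (T_j, P_j) < (T_i, P_i)
\;\Rightarrow\; \text{contradiction, since } < \text{ is a total order.}
```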
3.3 Performance
- Bandwidth:
What: Number of messages exchanged during an enter() or exit() operation.
How:
- enter(): N-1 request messages + N-1 reply messages = 2(N-1) messages.
- exit(): Deferred replies sent to queued processes = N-1 messages in the worst case.
- Client Delay:
What: Time taken for a process to enter the CS when no other process is using it.
How:
- One round-trip time: Request multicast to all processes and their replies received.
- Synchronization Delay:
What: Time between one process exiting the CS and the next process entering.
How:
- One message transmission: Reply sent to the waiting process.
4. Maekawa's Algorithm
Maekawa's algorithm refines Ricart-Agrawala by reducing the number of processes involved in granting access to the critical section (CS). Instead of requiring responses from all processes, it uses subsets called voting sets to optimize bandwidth and delay.
4.1 Key Concepts
- Voting Sets:
What: Each process, Pi, is associated with a voting set, Vi, containing on the order of √N processes.
Why: This limits the number of responses required, reducing overhead while ensuring intersection with other voting sets.
How:
- The intersection of any two voting sets, Vi and Vj, is non-empty. This guarantees that no two processes can simultaneously access the CS.
- Voting sets are typically built by arranging the processes in a √N × √N grid; each process's voting set is the union of its row and its column, which guarantees the overlap above (see the construction sketch after this subsection).
- Voting Requirement:
What: Only votes from a process's voting set are required to enter the CS.
Why: This reduces the total number of messages compared to Ricart-Agrawala, where N-1 processes must reply.
How: A requesting process contacts only the members of its own voting set, and only those members need to respond.
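A minimal sketch of the grid construction, assuming N is a perfect square and taking each process's voting set to be the union of its row and column (size 2√N - 1, i.e. O(√N)); function and variable names are illustrative.

```python
import math
from itertools import combinations

def grid_voting_sets(n):
    """Arrange processes 0..n-1 in a sqrt(n) x sqrt(n) grid; V_i = row(i) | column(i)."""
    k = math.isqrt(n)
    assert k * k == n, "sketch assumes N is a perfect square"
    sets = {}
    for p in range(n):
        r, c = divmod(p, k)
        row = {r * k + j for j in range(k)}
        col = {i * k + c for i in range(k)}
        sets[p] = row | col          # size 2*sqrt(N) - 1, i.e. O(sqrt(N))
    return sets

V = grid_voting_sets(16)
print(sorted(V[5]))                  # [1, 4, 5, 6, 7, 9, 13]
# Key property: any two voting sets share at least one process (a common arbiter).
print(all(V[a] & V[b] for a, b in combinations(V, 2)))   # True
```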
4.2 Steps
- Request Phase:
What: A process Pi sends a Request message to all members of its voting set Vi.
How:
- Pi's state changes to Wanted.
- Each recipient in Vi evaluates the request:
- If it has not already voted, it sends a Vote to Pi and marks itself as having voted.
- If it has already voted, it queues Pi's request until its current vote is released.
- Execution Phase:
What: Pi enters the CS after receiving votes from all members of Vi.
Why: This ensures that no overlapping voting set can grant simultaneous access.
How: Pi's state changes to Held, and it executes its critical section code.
- Release Phase:
What: After finishing the CS, Pi sends a Release message to all members of Vi.
How:
- Recipients update their states to allow pending requests to proceed.
- Queued requests are processed in FIFO order, and votes are sent accordingly (a voter-side sketch follows this list).
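A minimal sketch of the voter-side state described above: at most one outstanding vote per process, with deferred requests queued FIFO. Direct method calls stand in for the Request/Vote/Release messages, and the names are illustrative.

```python
from collections import deque

class Voter:
    """The per-process voting state used in Maekawa's algorithm."""
    def __init__(self, pid):
        self.pid = pid
        self.voted_for = None          # at most one outstanding vote
        self.queue = deque()           # deferred requests, FIFO

    def on_request(self, requester):
        """Vote immediately if free, otherwise queue the request."""
        if self.voted_for is None:
            self.voted_for = requester
            return f"Vote({self.pid} -> {requester})"
        self.queue.append(requester)
        return None                    # deferred

    def on_release(self, requester):
        """Release the current vote and pass it to the next queued request, if any."""
        assert self.voted_for == requester
        if self.queue:
            self.voted_for = self.queue.popleft()
            return f"Vote({self.pid} -> {self.voted_for})"
        self.voted_for = None
        return None

v = Voter(pid=3)
print(v.on_request("P1"))   # Vote(3 -> P1)
print(v.on_request("P2"))   # None: P2 is queued until P1 releases
print(v.on_release("P1"))   # Vote(3 -> P2)
```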
4.3 Analysis
- Bandwidth:
What: Measures the number of messages exchanged during enter() and exit() operations.
How:
- enter(): Pi sends √N requests to its voting set and receives √N votes, totaling 2√N messages.
- exit(): Pi sends √N release messages to its voting set.
- Issues:
What: While reducing bandwidth, Maekawa's algorithm introduces the risk of deadlocks.
Why: Circular wait conditions can occur when processes block each other while waiting for votes.
How: Deadlock prevention or detection mechanisms, such as timeouts or priority schemes, are required to resolve this issue.
5. Wrap-Up
Mutual exclusion is a cornerstone of distributed system design, ensuring consistency and correctness in accessing shared resources. The choice of algorithm depends on the system’s requirements for safety, liveness, bandwidth, and delay.
- Centralized Algorithm:
What: Relies on a single master process to coordinate access using a token.
Why: It is simple to implement and guarantees safety and liveness under normal operation.
How:
- Master queues requests and grants the token in FIFO order.
- Drawback: The master is a single point of failure (SPoF) and a performance bottleneck.
- Ring-Based Algorithm:
What: Decentralizes control by arranging processes in a logical ring and passing a token.
Why: Eliminates the need for a central coordinator and distributes responsibility.
How:
- Each process passes the token to its successor after using or declining it.
- Drawback: High client and synchronization delays, especially in large systems.
- Ricart-Agrawala Algorithm:
What: A token-less approach leveraging causality and Lamport timestamps.
Why: Avoids the need for a token or central authority, ensuring fairness.
How:
- Processes multicast requests and await replies from all others before entering the CS.
- Drawback: High bandwidth usage, since every entry requires N-1 request messages and N-1 reply messages (2(N-1) in total).
- Maekawa's Algorithm:
What: Optimizes Ricart-Agrawala by requiring votes from a subset of processes (voting sets).
Why: Reduces bandwidth compared to Ricart-Agrawala.
How:
- Processes request votes from their voting sets and enter the CS upon receiving all votes.
- Drawback: Prone to deadlocks, which require additional mechanisms to resolve.
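As a rough sanity check on the bandwidth figures above, the per-entry message counts can be tabulated for a few system sizes. The counts are taken from the analyses in Sections 2-4; the ring figure and the √N voting-set size are approximations, so treat the numbers as orders of magnitude rather than exact costs.

```python
import math

def messages_per_cs(n):
    """Approximate worst-case messages per enter()+exit() for N processes."""
    k = math.isqrt(n)                          # voting-set size, roughly sqrt(N)
    return {
        "centralized":     2 + 1,              # request + token, then token return
        "ring (worst)":    n + 1,              # up to N hops to reach the requester, 1 to pass on
        "ricart-agrawala": 2 * (n - 1) + (n - 1),  # requests + replies, then deferred replies
        "maekawa":         2 * k + k,          # requests + votes, then releases
    }

for n in (16, 100):
    print(n, messages_per_cs(n))
# 16  -> centralized 3, ring 17,  ricart-agrawala 45,  maekawa 12
# 100 -> centralized 3, ring 101, ricart-agrawala 297, maekawa 30
```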
In real-world distributed systems, more advanced protocols are used to handle failures and ensure robust mutual exclusion:
- Google Chubby: A locking service using Paxos consensus for fault tolerance. Chubby enables consistent locking and configuration management in Google's internal systems like BigTable and Megastore.
- Apache Zookeeper: A distributed coordination service using Zab (Zookeeper Atomic Broadcast) for leader election and mutual exclusion. Widely used in open-source distributed systems.
These systems demonstrate how consensus protocols like Paxos ensure safety and eventual liveness in complex distributed environments.