Homework 3 - DMJCCLT - dmj.one

Homework 3: Novel Approach to Conflict-Free Distributed Scheduling for Urban Ride-Sharing Platforms

Ride-sharing platforms like Uber and Lyft face significant challenges in dynamically scheduling drivers and passengers in real-time across distributed servers. Your task is to design a distributed scheduling system that ensures:

The goal is to minimize ride allocation latency while maintaining strict fairness in driver-passenger assignments.

1. Background

Use the following distributed systems concepts to guide your solution:

2. Problem Statement

Design a ride allocation system using a distributed key-value store (e.g., Cassandra) for ride-sharing in a metropolitan area. Ensure the following:

3. Instructions

Follow these steps:

  1. Partitioning: Design a partitioning strategy to assign ride requests and drivers to servers. Use hashing or geographic sharding.
  2. Conflict Resolution: Implement a conflict-free replicated data type (CRDT) to resolve conflicts in driver assignments.
  3. Time Order: Use Vector Clocks to ensure causality in updates when multiple servers update driver statuses concurrently.
  4. Replication: Configure replication strategies to tolerate server failures while ensuring strong eventual consistency.
  5. Testing: Simulate network partitions and churn (e.g., drivers frequently entering/exiting the network) and evaluate system performance.

4. Deliverables

Your submission should include:

5. Evaluation Criteria

Your solution will be evaluated based on:

I strongly encourage you to attempt solving the problem independently first. Think creatively about how distributed systems concepts like the CAP theorem, vector clocks, and CRDTs can address the challenges outlined in the assignment. Real-world scenarios like this are best understood through hands-on problem-solving. Take some time to formulate your ideas and test your understanding!

Proposed Solution: Conflict-Free Distributed Scheduling System

Here's a structured approach to solving the problem:

1. Architecture Design

The system architecture involves:

2. Partitioning

Partition rides and drivers based on geographic regions:

3. Conflict-Free Assignments

To prevent multiple drivers being assigned to the same ride:

4. Causal Ordering

Use Vector Clocks to handle concurrency in updates:


def merge_vector_clocks(vc1, vc2):
    return {k: max(vc1.get(k, 0), vc2.get(k, 0)) for k in set(vc1) | set(vc2)}

5. Fault Tolerance and Recovery

Implement fault-tolerant features:

6. Load Balancing

To balance load among drivers:

7. Testing and Simulation

Simulate the system with high churn and network partitions:


import concurrent.futures
import time
import random

def simulate_ride_request(region, ride_id):
    time.sleep(random.uniform(0.1, 0.5))
    print(f"Ride {ride_id} requested in region {region}")

with concurrent.futures.ThreadPoolExecutor() as executor:
    regions = ["R1", "R2", "R3"]
    for i in range(10):
        executor.submit(simulate_ride_request, random.choice(regions), i)

Performance Evaluation

Analyze the following metrics: