1. Introduction
In computer science, the complexity of an algorithm refers to the amount of resources, such as time and space, that are required to solve a particular problem. Analyzing the complexity of algorithms helps us to understand their efficiency and to select the most suitable algorithm for a given problem. This article provides an in-depth exploration of algorithm complexity, from basic concepts to more advanced topics, using illustrative examples to demonstrate the analysis process.
2. Types of Algorithm Complexity
There are two main types of algorithm complexity: time complexity and space complexity. Time complexity describes the amount of time an algorithm takes to execute as a function of the input size, while space complexity describes the amount of memory an algorithm uses as a function of the input size.
2.1 Time Complexity
Time complexity is typically expressed using big O notation, which describes the upper bound of an algorithm's growth rate. The notation $O(f(n))$ indicates that the time complexity of an algorithm is at most a constant multiple of the function $f(n)$, where $n$ is the input size. Common time complexities include $O(1)$ (constant time), $O(\log n)$ (logarithmic time), $O(n)$ (linear time), $O(n \log n)$ (linearithmic time), $O(n^2)$ (quadratic time), and $O(n^3)$ (cubic time). Higher-order polynomial and exponential time complexities are also possible but are generally considered to be inefficient.
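To make these growth rates concrete, here is a minimal sketch of three functions operating on an array of size $n$ with constant, linear, and quadratic running time; the function names are illustrative and not taken from any library.

// O(1): constant time, independent of the input size (assumes n >= 1)
int firstElement(const int arr[], int n) {
    return arr[0];
}
// O(n): linear time, a single pass over the input
long long sum(const int arr[], int n) {
    long long s = 0;
    for (int i = 0; i < n; i++) s += arr[i];
    return s;
}
// O(n^2): quadratic time, nested passes over the input
int countEqualPairs(const int arr[], int n) {
    int count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (arr[i] == arr[j]) count++;
    return count;
}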
2.2 Space Complexity
Space complexity, like time complexity, is also expressed using big O notation. The notation $O(g(n))$ indicates that the space complexity of an algorithm is at most a constant multiple of the function $g(n)$, where $n$ is the input size. Common space complexities include $O(1)$ (constant space), $O(\log n)$ (logarithmic space), and $O(n)$ (linear space).
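As a similarly minimal sketch (with illustrative names, assuming the input is a std::vector&lt;int&gt;), reversing a vector in place uses $O(1)$ extra memory, whereas building a reversed copy uses $O(n)$ extra memory.

#include &lt;vector&gt;

// O(1) extra space: reverse the vector in place, using only a few scalar variables
void reverseInPlace(std::vector&lt;int&gt; &v) {
    int i = 0, j = (int)v.size() - 1;
    while (i < j) {
        int tmp = v[i];
        v[i++] = v[j];
        v[j--] = tmp;
    }
}

// O(n) extra space: allocate and return a reversed copy of the input
std::vector&lt;int&gt; reversedCopy(const std::vector&lt;int&gt; &v) {
    return std::vector&lt;int&gt;(v.rbegin(), v.rend());
}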
3. Analyzing Algorithm Complexity
To analyze the complexity of an algorithm, we must first understand the underlying operations and how they contribute to the overall resource usage. In this section, we will discuss various methods for analyzing time and space complexity, including the use of recurrence relations and the master theorem.
3.1 Recurrence Relations
Recurrence relations are mathematical expressions that define a sequence recursively, using previous terms in the sequence to calculate the current term. They can be used to describe the time complexity of recursive algorithms.
3.1.1 Example: Fibonacci Numbers
The Fibonacci sequence is defined as follows:
$$F_n = \begin{cases} 0 & \text{if } n = 0\\ 1 & \text{if } n = 1\\ F_{n-1} + F_{n-2} & \text{if } n > 1 \end{cases}$$
// Recursive Fibonacci
int fibonacci(int n) {
    if (n == 0) return 0;
    if (n == 1) return 1;
    return fibonacci(n - 1) + fibonacci(n - 2);
}
This algorithm has a time complexity of $O(2^n)$: each call to the function branches into two further calls, so the number of calls grows exponentially with $n$. The calls can be visualized as a recursion tree of height $n$ in which each internal node has two children, giving at most on the order of $2^n$ nodes in total (the tight bound is $\Theta(\varphi^n)$ with $\varphi \approx 1.618$, the golden ratio, but $O(2^n)$ is the standard upper bound). The space complexity of this algorithm is $O(n)$, since the recursion depth is at most $n$.
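One way to justify the bound: writing $T(n)$ for the running time, the code satisfies
$$T(n) = T(n-1) + T(n-2) + O(1) \le 2\,T(n-1) + O(1),$$
since $T$ is non-decreasing, and unrolling this inequality $n$ times yields $T(n) = O(2^n)$.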
3.2 The Master Theorem
The Master Theorem is a powerful tool for analyzing the time complexity of divide-and-conquer algorithms. It provides a formula for solving recurrence relations of the form:
$$T(n) = a T\left(\frac{n}{b}\right) + f(n)$$
where $a \geq 1$, $b > 1$, and $f(n)$ is an asymptotically positive function. The Master Theorem classifies the solution to the recurrence relation into three cases:
1. If $f(n) = O(n^{\log_b a - \epsilon})$ for some $\epsilon > 0$, then $T(n) = \Theta(n^{\log_b a})$.
2. If $f(n) = \Theta(n^{\log_b a})$, then $T(n) = \Theta(n^{\log_b a} \log n)$.
3. If $f(n) = \Omega(n^{\log_b a + \epsilon})$ for some $\epsilon > 0$, and if $af\left(\frac{n}{b}\right) \leq cf(n)$ for some constant $c < 1$ and all sufficiently large $n$, then $T(n)=\Theta(f(n))$.
3.2.1 Example: Merge Sort
Merge Sort is a divide-and-conquer algorithm that sorts an array by recursively splitting it into two halves, sorting each half, and merging the sorted halves back together. The time complexity of Merge Sort can be described by the recurrence relation:
$$T(n) = 2T\left(\frac{n}{2}\right) + O(n)$$
// Merge the two sorted subarrays arr[l..m] and arr[m+1..r]
void merge(int arr[], int l, int m, int r) {
    int len = r - l + 1;
    int *tmp = new int[len];           // temporary buffer: O(n) extra space
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r) {
        if (arr[i] <= arr[j]) tmp[k++] = arr[i++];
        else                  tmp[k++] = arr[j++];
    }
    while (i <= m) tmp[k++] = arr[i++];   // copy any remaining left elements
    while (j <= r) tmp[k++] = arr[j++];   // copy any remaining right elements
    for (k = 0; k < len; k++) arr[l + k] = tmp[k];
    delete[] tmp;
}
void mergeSort(int arr[], int l, int r) {
    if (l < r) {
        int m = l + (r - l) / 2;
        mergeSort(arr, l, m);      // sort the left half
        mergeSort(arr, m + 1, r);  // sort the right half
        merge(arr, l, m, r);       // merge the two sorted halves
    }
}
Applying the Master Theorem, we see that $a = 2$, $b = 2$, and $f(n) = O(n)$. Since $\log_2 2 = 1$, we have $f(n) = \Theta(n^{\log_b a})$, which corresponds to case 2 of the Master Theorem. Therefore, the time complexity of Merge Sort is $T(n) = \Theta(n \log n)$. The space complexity of Merge Sort is $O(n)$, as additional memory is needed to store the temporary arrays during the merge step.
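The same result can be checked by unrolling the recurrence directly, writing the $O(n)$ merge cost as $cn$ for some constant $c$:
$$T(n) = 2T\left(\frac{n}{2}\right) + cn = 4T\left(\frac{n}{4}\right) + 2cn = \cdots = 2^k T\left(\frac{n}{2^k}\right) + kcn,$$
so after $k = \log_2 n$ levels the recursion bottoms out and $T(n) = n\,T(1) + cn \log_2 n = \Theta(n \log n)$.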
3.2.2 Example: Binary Search
Binary Search is an efficient algorithm for finding a target value within a sorted array. It works by repeatedly halving the search interval until the target value is found or the interval becomes empty. The time complexity of Binary Search can be described by the recurrence relation:
$$T(n) = T\left(\frac{n}{2}\right) + O(1)$$
// Recursive binary search for x in the sorted subarray arr[l..r]; returns its index, or -1 if absent
int binarySearch(int arr[], int l, int r, int x) {
    if (r >= l) {
        int mid = l + (r - l) / 2;
        if (arr[mid] == x)
            return mid;
        if (arr[mid] > x)
            return binarySearch(arr, l, mid - 1, x);   // search the left half
        return binarySearch(arr, mid + 1, r, x);       // search the right half
    }
    return -1;   // interval is empty
}
Applying the Master Theorem, we see that $a = 1$, $b = 2$, and $f(n) = O(1)$. Since $\log_2 1 = 0$, we have $f(n) = \Theta(1) = \Theta(n^{\log_b a})$, which corresponds to case 2 of the Master Theorem. Therefore, the time complexity of Binary Search is $T(n) = \Theta(n^{\log_b a} \log n) = \Theta(\log n)$. The space complexity of the recursive version of Binary Search is $O(\log n)$ due to the depth of the recursion, but an iterative version, sketched below, can be implemented with a space complexity of $O(1)$.
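For reference, here is a sketch of the iterative variant mentioned above; the name binarySearchIterative is ours, and it takes the array length rather than explicit bounds.

// Iterative binary search: same result as the recursive version, but O(1) extra space
int binarySearchIterative(int arr[], int n, int x) {
    int l = 0, r = n - 1;
    while (l <= r) {
        int mid = l + (r - l) / 2;
        if (arr[mid] == x) return mid;
        if (arr[mid] > x)
            r = mid - 1;   // continue in the left half
        else
            l = mid + 1;   // continue in the right half
    }
    return -1;             // x is not present
}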
4. Amortized Analysis
Amortized analysis is a method for analyzing the average-case performance of an algorithm over a sequence of operations. It allows us to determine the complexity of an algorithm when the cost of a single operation can vary, but the overall cost can be bounded across multiple operations.
4.1 Aggregate Method
The aggregate method involves determining the total cost of a sequence of operations and dividing by the number of operations to find the average cost per operation. This gives an amortized complexity for each operation.
4.1.1 Example: Dynamic Array Insertion
A dynamic array is an array that can be resized during runtime. When an element is inserted into a dynamic array that is full, the array is resized by allocating a new block of memory, usually twice the size of the current array, and copying the elements from the old array to the new one. The cost of resizing the array is $O(n)$, but this operation does not occur for every insertion.
// Dynamic array insertion (arr is a reference to a pointer so the caller sees the reallocated buffer)
void insert(int x, int *&arr, int &size, int &capacity) {
    if (size == capacity) {
        // array is full: double the capacity and copy the elements over (O(size) work)
        capacity *= 2;
        int *new_arr = new int[capacity];
        for (int i = 0; i < size; i++) {
            new_arr[i] = arr[i];
        }
        delete[] arr;
        arr = new_arr;
    }
    arr[size] = x;
    size++;
}
Over a sequence of $n$ insertions, the total cost can be bounded by $O(n)$: each resize to capacity $2^k$ copies $2^{k-1}$ elements, so the copying work across all resizes forms a geometric series bounded by $2n$, as the calculation below makes explicit. Thus, the amortized complexity of each insertion is $O(1)$, even though the worst-case complexity of a single insertion is $O(n)$.
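In aggregate form, assuming for simplicity that the array starts at capacity $1$ and that $n$ is a power of two, the total cost of the $n$ insertions is
$$\sum_{i=1}^{n} c_i \le \underbrace{n}_{\text{element writes}} + \underbrace{(1 + 2 + 4 + \cdots + n)}_{\text{copies during resizes}} < n + 2n = 3n = O(n),$$
so the average, and hence amortized, cost per insertion is at most $3 = O(1)$.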
4.2 Accounting Method
The accounting method involves assigning a "charge" to each operation, representing the amortized cost. Some operations might be charged more than their actual cost, while others might be charged less. The key is to ensure that the total charge across all operations is at least the total cost, so that the average cost per operation is accurately represented.
4.2.1 Example: Stack with Multipop
Consider a stack data structure that supports three operations: push, pop, and multipop (which pops multiple elements from the stack at once). The push and pop operations have a constant cost of $O(1)$. The multipop operation, which pops $k$ elements from the stack, has a cost of $O(k)$, where $k \leq n$ (the number of elements in the stack).
// Stack with multipop (std::stack&lt;int&gt; requires the &lt;stack&gt; header)
void push(int x, std::stack&lt;int&gt; &st) {
    st.push(x);
}
void pop(std::stack&lt;int&gt; &st) {
    if (!st.empty()) {
        st.pop();
    }
}
void multipop(std::stack&lt;int&gt; &st, int k) {
    // pop up to k elements, stopping early if the stack empties
    for (int i = 0; i < k && !st.empty(); i++) {
        st.pop();
    }
}
We can assign an amortized cost of $2$ to each push operation, one unit paying for the push itself and one unit stored as credit on the pushed element, and an amortized cost of $0$ to each pop and multipop operation. Each element thus "prepays" for the cost of its eventual removal by pop or multipop. Since every element is removed at most once, the stored credit always covers the actual cost of pops and multipops; the total charge of $2$ per push therefore bounds the total actual cost, and the amortized complexity of each operation is $O(1)$.
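As a quick sanity check, here is a hypothetical driver (not part of the data structure, and written directly against std::stack&lt;int&gt; for brevity) that performs $n$ pushes followed by one multipop emptying the stack, and compares the total actual work with the total charge of $2$ per push.

#include &lt;stack&gt;
#include &lt;iostream&gt;

int main() {
    std::stack&lt;int&gt; st;
    const int n = 1000;
    long long actual_cost = 0;   // one unit per individual element pushed or popped
    long long total_charge = 0;  // accounting method: 2 per push, 0 per pop/multipop

    for (int i = 0; i < n; i++) {        // n push operations
        st.push(i);
        actual_cost += 1;
        total_charge += 2;
    }
    int k = n;                           // a single multipop(k) that empties the stack
    for (int i = 0; i < k && !st.empty(); i++) {
        st.pop();
        actual_cost += 1;
    }

    std::cout << "actual cost:  " << actual_cost << "\n";   // prints 2000 (= 2n)
    std::cout << "total charge: " << total_charge << "\n";  // prints 2000 (>= actual cost)
    return 0;
}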
4.3 Potential Method
The potential method involves defining a potential function $\Phi$ that maps the state of the data structure to a non-negative real number. The amortized cost of an operation is then defined as the actual cost of the operation plus the change in potential. The goal is to choose a potential function that accurately reflects the cost distribution across the sequence of operations.
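Formally, if $D_i$ denotes the state of the data structure after the $i$-th operation and $c_i$ its actual cost, the amortized cost is
$$\hat{c}_i = c_i + \Phi(D_i) - \Phi(D_{i-1}),$$
and summing over a sequence of $n$ operations gives
$$\sum_{i=1}^{n} \hat{c}_i = \sum_{i=1}^{n} c_i + \Phi(D_n) - \Phi(D_0) \ge \sum_{i=1}^{n} c_i,$$
provided $\Phi(D_n) \ge \Phi(D_0)$, which holds whenever $\Phi$ is non-negative and $\Phi(D_0) = 0$. The amortized costs therefore upper-bound the total actual cost.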
4.3.1 Example: Dynamic Array Insertion (Revisited)
Returning to the dynamic array insertion example, we can define a potential function $\Phi$ in terms of the number of stored elements $n$ and the current capacity $c$ of the array:
$$\Phi(n, c) = 2n - c$$
The amortized cost of each operation can then be calculated as follows:
1. For an insertion without resizing, the actual cost is $1$; the size grows by one while the capacity is unchanged, so the potential increases by $2$. The amortized cost is $1 + 2 = 3$.
2. For an insertion that triggers a resize, the array is full, so $c = n$ before the operation and the actual cost is $n + 1$ (copying $n$ elements plus writing the new one). The potential drops from $2n - n = n$ to $2(n + 1) - 2n = 2$, a change of $2 - n$, so the amortized cost is $(n + 1) + (2 - n) = 3$.
Since the amortized cost of each insertion is bounded by the constant $3$, the overall complexity of dynamic array insertion remains $O(1)$ amortized.
5. Conclusion
In this article, we have provided a comprehensive overview of algorithm complexity, including time and space complexity, recurrence relations, the Master Theorem, and amortized analysis. We have illustrated these concepts with detailed examples, analyzing the complexity of algorithms step by step. By understanding the complexities of various algorithms, we can make informed decisions about which algorithm to use for a specific problem and optimize the performance of our programs.