Bubble Sort Algorithm: CSU1051 - Shoolini U

Bubble Sort

Executive Summary

The beauty of data structures and algorithms lies not just in their utility but in the complexity that lies within their simplicity. One such algorithm is the Bubble Sort, an elementary sorting algorithm that offers an intuitive approach to sort a sequence, giving us a deeper understanding of sorting techniques. While it's criticized for its inefficiency, it serves as an excellent educational tool, demonstrating the importance of algorithmic complexity in real-world applications.

We'll start by exploring the problem statement that leads to the application of the Bubble Sort algorithm, followed by a lucid explanation of the algorithm, gradually increasing the technical depth to suit both beginners and advanced readers. We'll discuss the time and space complexity and the modifications that can improve the algorithm's efficiency. We'll also delve into a C++ implementation of the Bubble Sort, and create a bridge to the concept of Inversion Count in an array, thereby connecting the algorithm to a practical problem in computer science.

Through the article, we aim to keep readers engaged with a balance of conceptual explanation and technical detail. Remember, despite its simplicity, Bubble Sort provides a foundation for understanding more advanced algorithms. Thus, embarking on this journey will equip you with the necessary tools to comprehend and analyze more complex sorting algorithms in your future endeavours.

1. Problem Statement

Imagine you're a computer science professor preparing for an upcoming exam. You have a list of student names that you'd like to organize alphabetically. As a computer scientist, your instinct is to approach this problem algorithmically. The solution you seek is an algorithm to sort this list efficiently and effectively. The first algorithm that often comes to mind, especially for small datasets, is Bubble Sort.

2. Bubble Sort: Definition and Basics

Bubble Sort is an uncomplicated and intuitive sorting algorithm, often introduced as the first sorting algorithm in computer science curriculums. This algorithm works by repeatedly stepping through the list, comparing each pair of adjacent items and swapping them if they are in the wrong order. This pass through the list is repeated until no swaps are needed, which indicates that the list is sorted.

The algorithm gets its name because, with each iteration through the list, the largest unsorted element "bubbles up" to its correct position.

3. Visualizing the Bubble Sort Algorithm

Let's consider a simple array of five elements: [5, 3, 8, 4, 2]. Applying Bubble Sort on this array would proceed as follows:


First Pass:
[5, 3, 8, 4, 2] -> Compare 5 and 3. Swap because 5 > 3.
[3, 5, 8, 4, 2] -> Compare 5 and 8. No swap.
[3, 5, 8, 4, 2] -> Compare 8 and 4. Swap because 8 > 4.
[3, 5, 4, 8, 2] -> Compare 8 and 2. Swap because 8 > 2.
[3, 5, 4, 2, 8] -> End of first pass. The largest element 8 has bubbled up to its correct position.

Second Pass:
[3, 5, 4, 2, 8] -> Compare 3 and 5. No swap.
[3, 5, 4, 2, 8] -> Compare 5 and 4. Swap because 5 > 4.
[3, 4, 5, 2, 8] -> Compare 5 and 2. Swap because 5 > 2.
[3, 4, 2, 5, 8] -> End of second pass. The second largest element 5 has bubbled up to its correct position.

Subsequent passes continue until no more swaps are needed.
Final sorted array: [2, 3, 4, 5, 8].

The visual depiction above demonstrates how Bubble Sort manipulates the array to bring it into a sorted state.

4. Bubble Sort Algorithm Implementation in C++

Implementing Bubble Sort in C++ is straightforward, aligning directly with the algorithm's conceptual explanation.


#include<iostream>
using namespace std;

void bubbleSort(int arr[], int n) {
    for(int i = 0; i < n-1; i++) {      // For each element
        for(int j = 0; j < n-i-1; j++) {  // For each pair
            if(arr[j] > arr[j+1]) {     // Compare and swap if necessary
                swap(arr[j], arr[j+1]);
            }
        }
    }
}

int main() {
    int arr[] = {5, 3, 8, 4, 2};
    int n = sizeof(arr)/sizeof(arr[0]);
    
    bubbleSort(arr, n);
    
    cout << "Sorted array: \n";
    for(int i = 0; i < n; i++)
        cout << arr[i] << " ";
        
    return 0;
}

In this C++ code, the outer loop traverses each element, while the inner loop traverses the unsorted part of the array. If the current pair is in the wrong order, they are swapped.

5. Time and Space Complexity of Bubble Sort

Bubble Sort, while simple and intuitive, is inefficient for large data sets. In the worst-case scenario (when the input is sorted in reverse order), Bubble Sort takes $O(n^2)$ time, where $n$ is the number of elements in the array. This is because for each of the $n$ elements, we potentially make $n$ comparisons and swaps, leading to a quadratic time complexity.

The best-case scenario occurs when the input is already sorted. In this case, if we enhance the algorithm to stop if no swaps occur during an entire pass, we can achieve a linear time complexity, $O(n)$, because we only need one pass to confirm that the list is sorted.

The space complexity of Bubble Sort is $O(1)$, which means it is an in-place sorting algorithm. It doesn't require any additional space that scales with input size, making it space-efficient.

6. Optimizing Bubble Sort: The Cocktail Shaker Sort

While Bubble Sort itself is not suitable for large datasets due to its $O(n^2)$ time complexity, its basic idea can be enhanced to produce more efficient algorithms. A simple optimization is the Cocktail Shaker Sort, also known as Bidirectional Bubble Sort.

In Bubble Sort, we only bubble the largest elements from left to right; however, in Cocktail Shaker Sort, we alternate between bubbling the largest elements from left to right, and the smallest elements from right to left. This approach can be more efficient if the smallest or largest elements are "far away" from their sorted position.


void cocktailShakerSort(int arr[], int n) {
    bool swapped = true;
    int start = 0, end = n-1;
  
    while (swapped) {
        swapped = false;
  
        // bubble the largest element to the end
        for (int i = start; i < end; ++i) {
            if (arr[i] > arr[i + 1]) {
                swap(arr[i], arr[i+1]);
                swapped = true;
            }
        }
  
        if (!swapped) // if no two elements were swapped, the array is sorted
            break;
  
        swapped = false;
  
        // decrease the end pointer since the end of the array is sorted
        --end;
  
        // bubble the smallest element to the start
        for (int i = end - 1; i >= start; --i) {
            if (arr[i] > arr[i + 1]) {
                swap(arr[i], arr[i+1]);
                swapped = true;
            }
        }
  
        // increase the start pointer since the start of the array is sorted
        ++start;
    }
}

This version of Bubble Sort can sometimes be faster for specific types of data, but it still retains a worst-case time complexity of $O(n^2)$.

7. Bubble Sort and Inversion Count

To further connect Bubble Sort to more advanced concepts in computer science, we can consider the problem of counting inversions in an array. An inversion is a pair of elements in an array that are out of order. Specifically, a pair $(i, j)$ is called an inversion if $i < j$ and $A[i]> A[j]$.

The number of inversions can be an indicator of how sorted or unsorted an array is. A sorted array has zero inversions, while a reversely sorted array has the maximum number of inversions, $\frac{n(n-1)}{2}$. The concept of inversions is often used in computer science, including in the calculation of the Kendall tau distance (a metric to measure the dissimilarity between two ranked lists) and in collaborative filtering (a method used by recommendation systems).

In Bubble Sort, every time we swap two elements, we essentially eliminate one inversion. Therefore, the number of swaps equals the number of inversions in the original array, which explains why Bubble Sort runs faster on nearly sorted data (few inversions) and slower on reversely sorted data (many inversions).

8. Unravelling the Complexity: The Bigger Picture

As our exploration of Bubble Sort comes to an end, it's essential to reflect on what we've learned. We delved into Bubble Sort, not just as an elementary sorting algorithm, but as a springboard into deeper waters of algorithm design, time-space complexity, and algorithmic optimization.

Bubble Sort might seem deceptively simple, but it is powerful. Its simplicity allows us to connect the abstract concept of sorting with concrete coding practices and to foster a deeper understanding of algorithmic complexity. Its basic idea can be adapted into more efficient algorithms such as QuickSort and MergeSort.

As we continue this journey, we'll next take these principles and explore more complex and efficient sorting algorithms. We'll see how they build upon the foundations laid by Bubble Sort and other simple sorting algorithms, taking advantage of advanced data structures and divide-and-conquer strategies to achieve superior efficiency.