Binary Search Tree & Traversals: CSU1051 - Shoolini U

Binary Search Tree

Executive Summary: Traversing the Binary Search Trees

Within the expansive landscape of Data Structures and Algorithms, Binary Search Trees (BSTs) emerge as an elegant and efficient structure that endows the power of logarithmic time complexity. BSTs, offering O(logn) lookup and insertion times, facilitate faster data access in comparison to linear data structures such as arrays and linked lists. We will traverse the realm of BSTs, exploring in-depth its construction, operations, types, time complexities, implementation in C++, and applications. Connecting BSTs with Object-Oriented Programming, we'll see how encapsulation can further the BSTs' efficiency. For those seeking to grasp the intricate workings of BSTs, be prepared for a comprehensive exploration that commences with the basics and gradually delves into its technical complexities, adequate for PhD scholars.

1. Introduction to Binary Search Trees

Imagine you have a colossal collection of books and are consistently struggling to retrieve a specific book. One way to solve this problem is by implementing a Binary Search Tree (BST), where each book corresponds to a node, and they are organized based on a specific parameter like the book's title. This organization allows quicker access, thereby solving the retrieval problem.

A Binary Search Tree is a node-based binary tree data structure where each node contains a key and two subtrees, the left and right. The key in each node must be greater than all keys stored in its left subtree and less than all keys in its right subtree. This property makes BSTs a sorted binary tree, allowing efficient in-order traversal.

2. Constructing Binary Search Trees

The construction of a BST follows specific rules, crucial for maintaining the property of sorted binary trees. A new key is always inserted at the leaf, ensuring that the key in a node remains larger than the keys in its left subtree and smaller than the keys in its right subtree. Let's explore this process with a greater technical insight.

2.1 Insertion in BSTs

When a new key is to be inserted, starting from the root, we move towards the left if the new key is less and right if more until we encounter a null. At this point, the new key is inserted. This recursive property of BSTs maintains the order property and facilitates logarithmic insertion time, significantly enhancing efficiency in large data sets.

// C++ code for insertion in BST
struct Node* insert(struct Node* node, int key) {
    /* If the tree is empty, assign a new node address to root */
    if (node == NULL) return newNode(key);

    /* Else, recur down the tree */
    if (key < node->key)
        node->left = insert(node->left, key);
    else if (key > node->key)
        node->right = insert(node->right, key);

    /* return the (unchanged) node pointer */
    return node;
}

3. Operations on Binary Search Trees

BSTs offer a multitude of operations such as Search, Insertion, Deletion, and Traversal, each uniquely tuned to utilize BST's properties. Let's probe these operations, unraveling the precise algorithms and complexity analysis involved.

3.1 Search Operation on BSTs

The search operation in a BST involves navigating from the root to either the left or the right subtree based on the value of the key. This operation employs the sorted binary tree property of BSTs, resulting in an average and worst-case time complexity of O(logn).

// C++ code for search operation in BST
struct Node* search(struct Node* root, int key)
{
    // Base Cases: root is null or key is present at root
    if (root == NULL || root->key == key)
       return root;

    // Key is greater than root's key
    if (root->key < key)
       return search(root->right, key);

    // Key is smaller than root's key
    return search(root->left, key);
}

3.2 Deletion Operation on BSTs

The deletion operation in a BST involves more complexity than the other operations. After locating the node to be deleted, we have to consider three cases: deletion of a leaf node, deletion of a node with one child, and deletion of a node with two children. For the third case, we need to find the in-order successor or predecessor to maintain the BST property after deletion.

// C++ code for deletion operation in BST
Node* minValueNode(Node* node) {
    Node* current = node;
 
    /* loop down to find the leftmost leaf */
    while (current && current->left != NULL)
        current = current->left;
 
    return current;
}

Node* deleteNode(Node* root, int key) {
    // base case
    if (root == NULL) return root;

    // If the key to be deleted is smaller than the root's key,
    // then it lies in left subtree
    if (key < root->key)
        root->left = deleteNode(root->left, key);

    // If the key to be deleted is greater than the root's key,
    // then it lies in right subtree
    else if (key > root->key)
        root->right = deleteNode(root->right, key);

    // if key is same as root's key, then This is the node to be deleted
    else {
        // node with only one child or no child
        if (root->left == NULL) {
            Node *temp = root->right;
            free(root);
            return temp;
        }
        else if (root->right == NULL) {
            Node *temp = root->left;
            free(root);
            return temp;
        }

        // node with two children: Get the inorder successor (smallest
        // in the right subtree)
        Node* temp = minValueNode(root->right);

        // Copy the inorder successor's content to this node
        root->key = temp->key;

        // Delete the inorder successor
        root->right = deleteNode(root->right, temp->key);
    }
    return root;
}

3.3 Traversal Methods on BSTs

Traversal methods such as In-Order, Pre-Order, and Post-Order are vital operations for BSTs, often used for printing contents, checking validity, or performing an operation on each node. These methods follow a left-root-right pattern for In-Order, root-left-right for Pre-Order, and left-right-root for Post-Order traversal.

// C++ code for traversal methods in BST
void printInorder(struct Node* node)
{
     if (node == NULL)
          return;

     /* first recur on left child */
     printInorder(node->left);

     /* then print the data of node */
     cout << node->data << " ";

     /* now recur on right child */
     printInorder(node->right);
}

void printPostorder(struct Node* node)
{
     if (node == NULL)
        return;

     // first recur on left subtree
     printPostorder(node->left);

     // then recur on right subtree
     printPostorder(node->right);

     // now deal with the node
     cout << node->data << " ";
}

void printPreorder(struct Node* node)
{
     if (node == NULL)
          return;

     /* first print data of node */
     cout << node->data << " ";

     /* then recur on left subtree */
     printPreorder(node->left);  

     /* now recur on right subtree */
     printPreorder(node->right);
}

4. Types of Binary Search Trees

While the standard Binary Search Tree remains highly useful, certain variations enhance the efficiency in specific scenarios. Let's explore these specialized versions: Balanced BSTs, AVL Trees, Red-Black Trees, and B-Trees.

4.1 Balanced Binary Search Trees

A Balanced Binary Search Tree is a BST where the height of the left and right subtrees of any node differ by at most one. This balance minimizes any skewing, ensuring that operations do not degrade to linear time, keeping it at logarithmic time complexity.

4.2 AVL Trees

AVL Tree, named after inventors Adelson-Velsky and Landis, is a self-balancing BST where the difference between heights of left and right subtrees (Balance Factor) is never more than one for all nodes. Whenever this balance is disrupted, a rotation operation is performed to restore the balance, ensuring optimal time complexity.

4.3 Red-Black Trees

Red-Black Tree is a type of self-balancing BST where every node contains an extra bit representing color, used to ensure the tree remains approximately balanced during updates. This balance helps the tree maintain its search and update times close to O(logn), making it efficient for various operations.

4.4 B-Trees

A B-Tree is a self-balancing search tree with variable but often large number of children per node. Primarily used for storage systems where read and write operations are expensive, B-Trees maintain a balance and a low height, significantly enhancing efficiency.

5. Binary Search Trees and Object-Oriented Programming

When considering the application of BSTs in Object-Oriented Programming (OOP) with C++, we can encapsulate a BST within a class, thereby abstracting its implementation and promoting code reusability and maintainability. This encapsulation allows us to easily extend the BST's functionality, facilitating seamless integration with other parts of a system, and paving the way for more advanced structures like Maps and Sets.

// C++ code for BST class
class BST {
   struct Node {
      int data;
      Node* left;
      Node* right;
   };

   Node* root;

   // private member functions
   Node* insert(int, Node*);
   Node* deleteNode(int, Node*);
   void inorder(Node*);
   // additional private member functions

public:
   BST();
   ~BST();

   // public member functions
   void insert(int);
   void remove(int);
   void display();
   // additional public member functions
};

6. Applications of Binary Search Trees

Binary Search Trees, with their sorted nature and logarithmic operation time complexity, find extensive applications in numerous domains. They are pivotal in database systems for indexing purposes, allowing rapid access, addition, and removal of data. In graphics, they aid in hidden surface removal and 3D rendering. In routers, they help with storing routing tables, facilitating rapid packet forwarding. In advanced algorithms like heapsort and quicksort, they contribute significantly to efficiency. Furthermore, BSTs form the backbone of several data structures and abstract data types (ADTs) such as associative arrays, priority queues, sets, and maps. In essence, BSTs are an integral part of data-driven industries and computer science education.

7. Conclusion

Binary Search Trees, striking a balance between functionality and efficiency, offer a practical and powerful approach to data storage and retrieval. With a solid understanding of BSTs, anyone can tap into their potential, propelling them a step ahead in their computer science journey. Whether it's for enhancing a software system's performance, implementing advanced algorithms, or cracking coding interviews, mastering BSTs proves beneficial. This comprehensive guide aimed to bridge the gap between the theoretical and practical aspects of BSTs, fostering a deeper understanding of this elegant data structure.