Tree Structures Lesson 7

Binary Trees & File Systems

intermediate ⏱️ 50 minutes 🌍 Organizing files and folders, database indexing, routing tables

We’ve explored linear data structures like arrays, linked lists, stacks, and queues. Now, we’re stepping into the world of non-linear data structures with Binary Trees. Trees are incredibly powerful for representing hierarchical relationships, and a perfect real-world analogy to understand them is the familiar structure of a computer’s File System.

Binary Tree Structure

A hierarchical data structure where each node has at most two children.

A: Root

B, C: Children of A, Internal Nodes

D, E, F: Leaf Nodes

Edges: Connections between nodes

🌳 What is a Tree (and Binary Tree)?

A tree is a non-linear data structure that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.

A Binary Tree is a special type of tree where each node has at most two children, typically referred to as the left child and the right child.

Key Terminology:

Root: The topmost node of the tree. It has no parent.
Node: Each element in the tree (contains data and pointers to children).
Parent: A node that has one or more children nodes.
Child: A node connected to a parent node moving away from the root.
Sibling: Nodes that share the same parent.
Leaf (External Node): A node that has no children.
Internal Node: A node with at least one child.
Edge: The link or connection between two nodes.
Path: A sequence of nodes and edges connecting one node to another.
Depth of a Node: The length of the path from the root to the node.
Height of a Node: The length of the longest path from the node to a leaf.
Height of a Tree: The height of its root node.

Analogy: A family tree, an organizational chart, or the table of contents in a book.

🌲 Types of Binary Trees

While all binary trees have at most two children per node, we categorize them further based on their structure:

Full Binary Tree: Every node has either 0 or 2 children. No node has only one child.
Complete Binary Tree: All levels are completely filled except possibly the last level, which is filled from left to right.
Perfect Binary Tree: All internal nodes have two children AND all leaf nodes are at the same level. (A perfect binary tree is always full and complete).
Skewed Binary Tree: A tree where every node has only one child (either always left or always right).

Why do these matter? The type of binary tree often impacts the efficiency of operations. For instance, balanced trees (like AVL trees or Red-Black trees, which are types of binary search trees) guarantee O(log N) operations, preventing the tree from degenerating into a linked list (which would lead to O(N) operations).

traversals: Visiting Nodes

Tree traversals are methods for visiting each node in a tree exactly once. The order in which nodes are visited defines the type of traversal. For binary trees, the most common traversals are:

In-order Traversal (Left -> Root -> Right):
- Visit the left subtree.
- Visit the root node.
- Visit the right subtree.
- Useful for: Retrieving elements from a Binary Search Tree in sorted order.
Pre-order Traversal (Root -> Left -> Right):
- Visit the root node.
- Visit the left subtree.
- Visit the right subtree.
- Useful for: Copying a tree, creating a prefix expression (Polish notation).
Post-order Traversal (Left -> Right -> Root):
- Visit the left subtree.
- Visit the right subtree.
- Visit the root node.
- Useful for: Deleting a tree, creating a postfix expression (Reverse Polish notation).

Binary Tree Node and Traversals (Recursive)


        
          
#include <stdio.h>
#include <stdlib.h>

typedef struct TreeNode {
  int data;
  struct TreeNode *left;
  struct TreeNode *right;
} TreeNode;

// Function to create a new tree node
TreeNode* create_node(int data) {
  TreeNode* newNode = (TreeNode*) malloc(sizeof(TreeNode));
  if (newNode == NULL) { /* Handle error */ exit(EXIT_FAILURE); }
  newNode->data = data;
  newNode->left = NULL;
  newNode->right = NULL;
  return newNode;
}

// In-order Traversal (Left -> Root -> Right)
void inorder_traversal(TreeNode* node) {
  if (node == NULL) return;
  inorder_traversal(node->left);
  printf("%d ", node->data);
  inorder_traversal(node->right);
}

// Pre-order Traversal (Root -> Left -> Right)
void preorder_traversal(TreeNode* node) {
  if (node == NULL) return;
  printf("%d ", node->data);
  preorder_traversal(node->left);
  preorder_traversal(node->right);
}

// Post-order Traversal (Left -> Right -> Root)
void postorder_traversal(TreeNode* node) {
  if (node == NULL) return;
  postorder_traversal(node->left);
  postorder_traversal(node->right);
  printf("%d ", node->data);
}

// Main for demonstration
int main() {
  // Example Tree:
  //      1
  //     / \
  //    2   3
  //   / \
  //  4   5
  TreeNode* root = create_node(1);
  root->left = create_node(2);
  root->right = create_node(3);
  root->left->left = create_node(4);
  root->left->right = create_node(5);

  printf("In-order traversal: ");
  inorder_traversal(root); // Expected: 4 2 5 1 3
  printf("\n");

  printf("Pre-order traversal: ");
  preorder_traversal(root); // Expected: 1 2 4 5 3
  printf("\n");

  printf("Post-order traversal: ");
  postorder_traversal(root); // Expected: 4 5 2 3 1
  printf("\n");

  // Free memory (simplified, recursive freeing needed for real apps)
  // For simplicity, not including full recursive free here
  // but in a real tree, you'd typically free in post-order.
  free(root->left->left);
  free(root->left->right);
  free(root->left);
  free(root->right);
  free(root);
  return 0;
}


        
          
class TreeNode:
  def __init__(self, data):
      self.data = data
      self.left = None
      self.right = None

# In-order Traversal (Left -> Root -> Right)
def inorder_traversal(node):
  if node is None:
      return []
  result = []
  result.extend(inorder_traversal(node.left))
  result.append(node.data)
  result.extend(inorder_traversal(node.right))
  return result

# Pre-order Traversal (Root -> Left -> Right)
def preorder_traversal(node):
  if node is None:
      return []
  result = [node.data]
  result.extend(preorder_traversal(node.left))
  result.extend(preorder_traversal(node.right))
  return result

# Post-order Traversal (Left -> Right -> Root)
def postorder_traversal(node):
  if node is None:
      return []
  result = []
  result.extend(postorder_traversal(node.left))
  result.extend(postorder_traversal(node.right))
  result.append(node.data)
  return result

# Main for demonstration
# Example Tree:
#      1
#     / \
#    2   3
#   / \
#  4   5
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

print(f"In-order traversal: {inorder_traversal(root)}") # Expected: [4, 2, 5, 1, 3]
print(f"Pre-order traversal: {preorder_traversal(root)}") # Expected: [1, 2, 4, 5, 3]
print(f"Post-order traversal: {postorder_traversal(root)}") # Expected: [4, 5, 2, 3, 1]


        
          
class TreeNode {
  constructor(data) {
      this.data = data;
      this.left = null;
      this.right = null;
  }
}

// In-order Traversal (Left -> Root -> Right)
function inorderTraversal(node, result = []) {
  if (node === null) return;
  inorderTraversal(node.left, result);
  result.push(node.data);
  inorderTraversal(node.right, result);
  return result;
}

// Pre-order Traversal (Root -> Left -> Right)
function preorderTraversal(node, result = []) {
  if (node === null) return;
  result.push(node.data);
  preorderTraversal(node.left, result);
  preorderTraversal(node.right, result);
  return result;
}

// Post-order Traversal (Left -> Right -> Root)
function postorderTraversal(node, result = []) {
  if (node === null) return;
  postorderTraversal(node.left, result);
  postorderTraversal(node.right, result);
  result.push(node.data);
  return result;
}

// Main for demonstration
// Example Tree:
//      1
//     / \
//    2   3
//   / \
//  4   5
const root = new TreeNode(1);
root.left = new TreeNode(2);
root.right = new TreeNode(3);
root.left.left = new TreeNode(4);
root.left.right = new TreeNode(5);

console.log("In-order traversal:", inorderTraversal(root));    // Expected: [4, 2, 5, 1, 3]
console.log("Pre-order traversal:", preorderTraversal(root));  // Expected: [1, 2, 4, 5, 3]
console.log("Post-order traversal:", postorderTraversal(root)); // Expected: [4, 5, 2, 3, 1]

Complexity (Tree Traversal):

Time Complexity: O(N) - Each node is visited exactly once.
Space Complexity: O(H) where H is the height of the tree. In the worst case (skewed tree), H can be N (like a linked list), so O(N). In the best case (balanced tree), H is O(log N), so O(log N). This is due to the recursive call stack.

Note: Recursion is a natural fit for tree traversals because of the self-similar (recursive) definition of a tree (a tree is a root connected to sub-trees). However, iterative versions using an explicit stack are also possible and can be necessary to avoid stack overflow for very deep trees.

📂 File Systems as Trees

A computer’s file system is a classic and intuitive example of a hierarchical tree structure.

Root Directory: / (Unix-like) or C:\ (Windows) acts as the root of the tree.
Directories/Folders: Act as internal nodes, containing files or other subdirectories.
Files: Act as leaf nodes, as they typically don’t contain other file system objects.
Paths: The sequence of directories from the root to a specific file or folder forms a path.

Why use a tree structure for file systems?

Organization: Provides a logical and intuitive way to group related files.
Efficient Lookup: While not always a perfect binary search tree, the hierarchical structure allows the operating system to quickly navigate to desired files without scanning every file on the disk.
Scalability: Can manage vast numbers of files and directories efficiently.
Access Control: Permissions can be set at directory levels and inherited, simplifying security.

📁 /
- 📁 home
  - 📁 user
    - 📁 documents
      - 📄 report.pdf
      - 📄 notes.txt
    - 📄 app.js
- 📁 etc
  - 📄 hosts
  - 📄 config.conf

Practical Usages (Beyond File Systems):

Database Indexing: B-trees and B+ trees (variations of trees) are used in databases for efficient data retrieval.
Compilers: Parse code into Abstract Syntax Trees (ASTs).
Networking: Routing algorithms often use tree-like structures.
Artificial Intelligence: Decision trees are used for classification and regression.
Expression Parsers: Represent mathematical expressions.

Note: While most file systems are conceptualized as trees, they can sometimes contain “hard links” or “symbolic links” which introduce cross-references that effectively turn parts of the tree into a graph structure. This adds complexity but also flexibility (e.g., shortcuts to files/folders).