Graph Structures Lesson 12

Graph Basics & Social Networks

intermediate ⏱️ 60 minutes 🌍 Social media friend connections, GPS routing, flight paths, recommendation systems

We’ve covered various linear and hierarchical data structures. Now, we embark on the most flexible and powerful data structure: Graphs. Graphs excel at modeling complex relationships and interconnections between entities, making them indispensable in fields from computer science to biology. A perfect real-world illustration is the intricate web of connections in Social Networks.

Undirected Graph

Vertices (nodes) connected by Edges (links) to model relationships.

Vertices: A, B, C, D, E

Edges: A—B, A—C, B—C, C—D, C—E, D—E

🌐 What is a Graph?

A Graph is a non-linear data structure consisting of a finite set of vertices (or nodes) and a finite set of edges (or links/connections) that connect pairs of vertices.

Vertex (Node): A fundamental entity in a graph. It can represent anything from a person, a city, a website, or a data point.
Edge (Link): A connection between two vertices. Edges can represent relationships like friendship, roads between cities, hyperlinks, or data flow.

Types of Graphs:

Undirected Graph: Edges have no direction. If an edge connects vertex A and vertex B, it means A is connected to B, and B is connected to A. (e.g., A-B is the same as B-A).
- Analogy: Friendships on Facebook (if A is friends with B, B is friends with A).
Directed Graph (Digraph): Edges have a direction. If an edge goes from vertex A to vertex B, it means A has a relationship to B, but not necessarily vice-versa. (e.g., A -> B is different from B -> A).
- Analogy: Following someone on Twitter (you follow them, they don’t necessarily follow you back).

Graph Properties:

Weighted Graph: Each edge has an associated value (weight or cost). (e.g., distance between cities, cost of a flight).
Unweighted Graph: All edges have the same implied weight (or no weight at all).
Cyclic Graph: Contains at least one cycle (a path that starts and ends at the same vertex).
Acyclic Graph: Contains no cycles. (e.g., a tree is a special type of acyclic graph).

🏗️ Graph Representations

How we store a graph in memory impacts the efficiency of different operations. The two most common ways are:

1. Adjacency Matrix

A 2D array (matrix) of size V x V (where V is the number of vertices).
matrix[i][j] = 1 (or weight) if there’s an edge from vertex i to vertex j.
matrix[i][j] = 0 (or infinity) if there’s no edge.
For an undirected graph, the matrix is symmetric (matrix[i][j] == matrix[j][i]).

2. Adjacency List

An array (or hash map) where each index (or key) represents a vertex.
Each entry in the array points to a list (e.g., linked list, dynamic array) of its adjacent vertices.
For an undirected graph, if A is connected to B, then B will be in A’s list, and A will be in B’s list.

Graph Representations

Visualizing a sample graph and its Adjacency Matrix and Adjacency List.

Sample Graph (undirected)

Adjacency Matrix (O(V²))

0 1 2

0 0 5 2

1 5 0 3

2 2 3 0

Efficient for dense graphs, check edge in O(1).

Adjacency List (O(V+E))

1(5) , 2(2)

0(5) , 2(3)

0(2) , 1(3)

Efficient for sparse graphs, check edge in O(deg(u)).

Complexity of Representations:

Operation	Adjacency Matrix	Adjacency List
Space	O(V^2)	O(V + E) (V vertices, E edges)
Add Edge	O(1)	O(1) (append to list)
Check Edge (u,v)	O(1)	O(deg(u)) (degree of u)
Get Neighbors	O(V) (iterate row)	O(deg(u))
Sparse Graphs	Inefficient (lots of 0s)	Efficient
Dense Graphs	Efficient	Efficient

Interesting Note: The choice between an adjacency matrix and an adjacency list often depends on the density of the graph (number of edges relative to possible edges).

For dense graphs (many edges, E approaches V^2), an adjacency matrix is suitable.
For sparse graphs (few edges, E is much smaller than V^2), an adjacency list is much more space-efficient and often faster for traversal-based algorithms. Social networks are typically very sparse.

traversals: Exploring a Graph

Graph traversals are algorithms for visiting all vertices in a graph systematically. The two most common are:

1. Breadth-First Search (BFS)

Starts at a given source vertex and explores all of its immediate neighbors first, before moving to the next level of neighbors.
Uses a Queue to manage vertices to visit next (FIFO).
Analogy: Ripples in a pond. The wave expands outwards evenly from the source.
Useful for: Finding the shortest path in an unweighted graph, finding all nodes within k distance.

2. Depth-First Search (DFS)

Starts at a given source vertex and explores as far as possible along each branch before backtracking.
Uses a Stack (implicitly via recursion, or explicitly) to keep track of vertices to visit next (LIFO).
Analogy: Exploring a maze. You go down one path as far as you can before hitting a dead end and backtracking.
Useful for: Detecting cycles, topological sorting, finding connected components.

Graph Representation (Adjacency List) & Traversals


        
          
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h> // For bool

#define MAX_VERTICES 5 // Example graph size

// Linked list node for adjacency list
typedef struct AdjNode {
  int dest;
  struct AdjNode* next;
} AdjNode;

// Structure for a graph (array of adjacency lists)
typedef struct Graph {
  AdjNode* adj[MAX_VERTICES];
  bool visited[MAX_VERTICES];
} Graph;

// Create a new adjacency list node
AdjNode* new_adj_node(int dest) {
  AdjNode* newNode = (AdjNode*) malloc(sizeof(AdjNode));
  if (newNode == NULL) { /* Handle error */ exit(EXIT_FAILURE); }
  newNode->dest = dest;
  newNode->next = NULL;
  return newNode;
}

// Initialize graph
void init_graph(Graph* graph) {
  for (int i = 0; i < MAX_VERTICES; i++) {
      graph->adj[i] = NULL;
      graph->visited[i] = false;
  }
}

// Add an edge to an undirected graph
void add_edge(Graph* graph, int src, int dest) {
  // Add edge from src to dest
  AdjNode* newNode = new_adj_node(dest);
  newNode->next = graph->adj[src];
  graph->adj[src] = newNode;

  // Add edge from dest to src (for undirected graph)
  newNode = new_adj_node(src);
  newNode->next = graph->adj[dest];
  graph->adj[dest] = newNode;
}

// --- DFS Traversal (Recursive) ---
void dfs(Graph* graph, int vertex) {
  graph->visited[vertex] = true;
  printf("%d ", vertex);

  AdjNode* temp = graph->adj[vertex];
  while (temp != NULL) {
      int adjVertex = temp->dest;
      if (!graph->visited[adjVertex]) {
          dfs(graph, adjVertex);
      }
      temp = temp->next;
  }
}

// --- BFS Traversal (using a simple queue simulation) ---
// For a real robust BFS in C, you'd use a proper queue data structure.
// This is a simplified direct array-based queue for illustration.
void bfs(Graph* graph, int startVertex) {
  for (int i = 0; i < MAX_VERTICES; i++) { // Reset visited for BFS
      graph->visited[i] = false;
  }

  int queue[MAX_VERTICES];
  int front = 0, rear = -1;

  queue[++rear] = startVertex; // Enqueue start vertex
  graph->visited[startVertex] = true;

  printf("BFS from %d: ", startVertex);
  while (front <= rear) {
      int currentVertex = queue[front++]; // Dequeue
      printf("%d ", currentVertex);

      AdjNode* temp = graph->adj[currentVertex];
      while (temp != NULL) {
          int adjVertex = temp->dest;
          if (!graph->visited[adjVertex]) {
              graph->visited[adjVertex] = true;
              queue[++rear] = adjVertex; // Enqueue
          }
          temp = temp->next;
      }
  }
  printf("\n");
}


// Main for demonstration
int main() {
  Graph graph;
  init_graph(&graph);

  // Create a graph (0-indexed)
  // 0 -- 1
  // |  / | \
  // 3 -- 2  4
  add_edge(&graph, 0, 1);
  add_edge(&graph, 0, 3);
  add_edge(&graph, 1, 2);
  add_edge(&graph, 1, 3);
  add_edge(&graph, 1, 4);
  add_edge(&graph, 2, 3);

  printf("DFS from 0: ");
  dfs(&graph, 0); // Expected: 0 1 2 3 4 (order may vary based on adj list insertion)
  printf("\n");

  bfs(&graph, 0); // Expected: 0 1 3 2 4

  // Free memory (simplified)
  for (int i = 0; i < MAX_VERTICES; i++) {
      AdjNode* current = graph.adj[i];
      while (current != NULL) {
          AdjNode* temp = current;
          current = current->next;
          free(temp);
      }
  }

  return 0;
}


        
          
from collections import deque

class Graph:
  def __init__(self, vertices):
      self.V = vertices
      self.adj = [[] for _ in range(vertices)] # Adjacency list
      self.visited = [False] * vertices

  def add_edge(self, u, v):
      self.adj[u].append(v)
      self.adj[v].append(u) # For undirected graph

  def reset_visited(self):
      self.visited = [False] * self.V

  # --- DFS Traversal (Recursive) ---
  def dfs(self, v):
      self.visited[v] = True
      print(v, end=" ")
      for neighbor in self.adj[v]:
          if not self.visited[neighbor]:
              self.dfs(neighbor)

  # --- BFS Traversal ---
  def bfs(self, start_vertex):
      self.reset_visited() # Ensure fresh visited array for BFS
      queue = deque()

      queue.append(start_vertex)
      self.visited[start_vertex] = True

      print(f"BFS from {start_vertex}: ", end="")
      while queue:
          s = queue.popleft() # Dequeue
          print(s, end=" ")

          for neighbor in self.adj[s]:
              if not self.visited[neighbor]:
                  self.visited[neighbor] = True
                  queue.append(neighbor) # Enqueue
      print()

# Main for demonstration
graph = Graph(5)
# 0 -- 1
# |  / | \
# 3 -- 2  4
graph.add_edge(0, 1)
graph.add_edge(0, 3)
graph.add_edge(1, 2)
graph.add_edge(1, 3)
graph.add_edge(1, 4)
graph.add_edge(2, 3)

graph.reset_visited()
print("DFS from 0: ", end="")
graph.dfs(0) # Expected: 0 1 2 3 4 (order may vary)
print()

graph.bfs(0) # Expected: 0 1 3 2 4 (order is consistent for BFS)


        
          
class Graph {
  constructor(vertices) {
      this.V = vertices;
      this.adj = new Array(vertices).fill(null).map(() => []); // Adjacency list
      this.visited = new Array(vertices).fill(false);
  }

  add_edge(u, v) {
      this.adj[u].push(v);
      this.adj[v].push(u); // For undirected graph
  }

  resetVisited() {
      this.visited = new Array(this.V).fill(false);
  }

  // --- DFS Traversal (Recursive) ---
  dfs(v) {
      this.visited[v] = true;
      for (const neighbor of this.adj[v]) {
          if (!this.visited[neighbor]) {
              this.dfs(neighbor);
          }
      }
  }

  // --- BFS Traversal ---
  bfs(start_vertex) {
      this.resetVisited(); // Ensure fresh visited array for BFS
      const queue = [];

      queue.push(start_vertex);
      this.visited[start_vertex] = true;

      while (queue.length > 0) {
          const s = queue.shift(); // Dequeue (O(N) for JS Array.shift, but common for concept)

          for (const neighbor of this.adj[s]) {
              if (!this.visited[neighbor]) {
                  this.visited[neighbor] = true;
                  queue.push(neighbor); // Enqueue
              }
          }
      }
      console.log(); // Newline at end
  }
}

// Main for demonstration
const graph = new Graph(5);
// 0 -- 1
// |  / | \
// 3 -- 2  4
graph.add_edge(0, 1);
graph.add_edge(0, 3);
graph.add_edge(1, 2);
graph.add_edge(1, 3);
graph.add_edge(1, 4);
graph.add_edge(2, 3);

graph.resetVisited();
process.stdout.write("DFS from 0: ");
graph.dfs(0);
console.log(); // Newline at end

graph.bfs(0);

Complexity (Graph Traversals - using Adjacency List):

BFS / DFS:
- Time Complexity: O(V + E) - We visit every vertex and every edge once.
- Space Complexity: O(V) - For the visited array and the queue (for BFS) or stack (for DFS recursive calls).

Note: Graphs are incredibly versatile. Many seemingly different problems, from network routing to game state analysis, can be modeled and solved using graph algorithms. The choice of traversal (BFS vs. DFS) depends on the problem: BFS is good for shortest paths (in unweighted graphs), while DFS is better for exploring a single path as deeply as possible.

Social networks are arguably the most intuitive and widespread real-world application of graph theory.

Vertices (Nodes): Represent individual users (people, accounts, profiles).
Edges (Links): Represent connections or relationships between users.
- Undirected Edge: A mutual relationship, like “friends” on Facebook. If A is friends with B, B is friends with A.
- Directed Edge: A one-way relationship, like “following” on Twitter or Instagram. If A follows B, B doesn’t necessarily follow A back.
Weights (Optional): Edges might have weights representing strength of connection, frequency of interaction, etc.

Graph Concepts in Social Networks:

Friend of a Friend (Shortest Path): “Six degrees of separation” is essentially finding the shortest path between two users in an unweighted graph. BFS is ideal for this.
Community Detection (Connected Components): Finding groups of users who are more densely connected to each other than to the rest of the network.
Influencers (Centrality): Identifying users who are highly connected or central to the network (various centrality measures exist).
Recommendation Systems: “People you might know” or “Products you might like” often leverage graph algorithms to analyze connections and common interests.
Spreading Information/Virality: Modeling how information (or a virus) spreads through a network.

Note: Analyzing vast social network graphs poses significant challenges due to their scale (billions of vertices and trillions of edges). Specialized graph databases (like Neo4j) and distributed graph processing frameworks (like Apache Giraph) are designed to handle such immense graph data. The insights gained from graph analysis in social networks are invaluable for everything from marketing to public health.