Back to blog
Dec 16, 2024
7 min read

Greedy Algorithms Guide

From basics to advanced techniques, this guide covers everything you need to know about greedy algorithms.

A greedy algorithm makes locally optimal choices at each step, hoping these choices lead to a globally optimal solution. While they don’t always yield the optimal solution, when they do work, they’re usually simpler and more efficient than other approaches.

When to Use Greedy Algorithms

Greedy algorithms are suitable when a problem has these properties:

  1. Greedy Choice Property: This is the heart of any greedy algorithm. When we make a choice, we make the one that looks best right now, without worrying about future consequences. Importantly, we never go back on our choices. Understanding when this approach works is key to mastering greedy algorithms.
  2. Optimal Substructure: This property means that if we break our problem into smaller pieces, the optimal solution to our main problem contains optimal solutions to these smaller subproblems. It’s like building a wall brick by brick, where each brick must be placed perfectly for the whole wall to be perfect.

Pattern 1: Scheduling/Interval Problems

When dealing with intervals, we typically need to decide:

  • What to sort by (start time, end time, or duration)
  • How to handle overlaps
  • What metric to optimize (number of intervals, coverage, etc.)
  1. Merging Overlapping Intervals
def merge_intervals(intervals):
    """
    Merges overlapping intervals to create non-overlapping intervals.
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(n) for the output array
    """
    if not intervals:
        return []
        
    # Sort by start time to process intervals in chronological order
    intervals.sort(key=lambda x: x[0])
    
    merged = [intervals[0]]
    
    for current in intervals[1:]:
        previous = merged[-1]
        
        # If current interval overlaps with previous, merge them
        if current[0] <= previous[1]:
            previous[1] = max(previous[1], current[1])
        else:
            merged.append(current)
    
    return merged
  1. Non-overlapping Intervals
def non_overlapping_intervals(intervals):
    """
    Finds the minimum number of intervals needed to remove to make all intervals non-overlapping.
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(1) since we're not using extra space
    """
    if not intervals:
        return 0
        
    # Sort by end time for optimal scheduling
    intervals.sort(key=lambda x: x[1])
    
    non_overlapping = 1
    end = intervals[0][1]
    
    for i in range(1, len(intervals)):
        if intervals[i][0] >= end:
            non_overlapping += 1
            end = intervals[i][1]
    
    return non_overlapping
  1. Meeting Rooms Required
def min_meeting_rooms(intervals):
    """
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(n) for the start and end times
    """
    if not intervals:
        return 0
        
    # Separate start and end times
    starts = sorted([i[0] for i in intervals])
    ends = sorted([i[1] for i in intervals])
    
    rooms = 0
    max_rooms = 0
    s = e = 0
    
    while s < len(intervals):
        if starts[s] < ends[e]:
            rooms += 1
            s += 1
        else:
            rooms -= 1
            e += 1
        max_rooms = max(max_rooms, rooms)
    
    return max_rooms

Pattern 2: Activity Selection

Key Concepts

Choose maximum number of activities that don’t conflict with each other.

def activity_selection(start, finish):
    """
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(n) for the selected activities
    """
    n = len(start)
    # Sort activities by finish time
    activities = sorted(zip(start, finish), key=lambda x: x[1])
    
    selected = [activities[0]]
    last_finish = activities[0][1]
    
    for i in range(1, n):
        if activities[i][0] >= last_finish:
            selected.append(activities[i])
            last_finish = activities[i][1]
    
    return selected

Pattern 3: Fractional Knapsack

Key Concepts

Unlike 0/1 knapsack, items can be broken into fractions to maximize value.

def fractional_knapsack(items, capacity):
    """
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(1) since we're not using extra space
    """
    # Sort items by value/weight ratio
    items.sort(key=lambda x: x[1]/x[0], reverse=True)
    total_value = 0
    
    for weight, value in items:
        if capacity >= weight:
            capacity -= weight
            total_value += value
        else:
            total_value += value * (capacity/weight)
            break
            
    return total_value

Pattern 4: Huffman Encoding

Key Concepts

Create optimal prefix codes for character encoding based on frequency.

class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

def build_huffman_tree(chars, freqs):
    """
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(n) for the nodes
    """
    nodes = [Node(c, f) for c, f in zip(chars, freqs)]
    while len(nodes) > 1:
        # Extract two minimum frequency nodes
        nodes.sort(key=lambda x: x.freq)
        left = nodes.pop(0)
        right = nodes.pop(0)
        
        # Create internal node
        internal = Node(None, left.freq + right.freq)
        internal.left = left
        internal.right = right
        nodes.append(internal)
    
    return nodes[0]

Pattern 5: Minimum Spanning Tree

Key Concepts

Find a subset of edges that connects all vertices with minimum total weight.

Kruskal’s Algorithm

class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]
    
    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def kruskal_mst(n, edges):
    """
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(n) for the union-find data structure
    """
    edges.sort(key=lambda x: x[2])  # Sort by weight
    uf = UnionFind(n)
    mst = []
    cost = 0
    
    for u, v, w in edges:
        if uf.union(u, v):
            mst.append((u, v))
            cost += w
            
    return mst, cost

Pattern 6: Job Sequencing with Deadlines

Key Concepts

Schedule jobs to maximize profit while meeting deadlines.

def job_sequencing(jobs):
    """
    Time Complexity: O(n^2) due to nested loops
    Space Complexity: O(n) for the slot array
    """
    # jobs: list of (profit, deadline) tuples
    jobs.sort(key=lambda x: x[0], reverse=True)
    max_deadline = max(job[1] for job in jobs)
    slot = [-1] * max_deadline
    profit = 0
    
    for job_profit, deadline in jobs:
        for j in range(deadline-1, -1, -1):
            if slot[j] == -1:
                slot[j] = job_profit
                profit += job_profit
                break
                
    return profit

Pattern 7: Egyptian Fraction

Key Concepts

Represent a fraction as sum of unique unit fractions.

def egyptian_fraction(nr, dr):
    """
    An egyptian fraction is a sum of distinct unit fractions.
    e.g. 4/7 = 1/2 + 1/14
    Time Complexity: O(dr) for the while loop
    Space Complexity: O(1) since we're not using extra space
    """
    result = []
    while nr != 0:
        unit_fraction = math.ceil(dr/nr)
        result.append(unit_fraction)
        nr = nr * unit_fraction - dr
        dr = dr * unit_fraction
        
    return result

Pattern 9: Platform/Train Problem

Key Concepts

Find minimum number of platforms needed for trains/minimum waiting time.

def min_platforms(arrivals, departures):
    """
    e.g. arrivals = [900, 940, 950, 1100, 1500, 1800]
         departures = [910, 1200, 1120, 1130, 1900, 2000]
    output: 3 (minimum platforms needed)
    Time Complexity: O(n log n) due to sorting
    Space Complexity: O(1) since we're not using extra space
    """
    arrivals.sort()
    departures.sort()
    
    platforms = 0
    max_platforms = 0
    i = j = 0
    
    while i < len(arrivals):
        if arrivals[i] < departures[j]:
            platforms += 1
            i += 1
        else:
            platforms -= 1
            j += 1
        max_platforms = max(max_platforms, platforms)
        
    return max_platforms

Pattern 10: Gas Station Circuit

Key Concepts

Find starting point to complete circuit with given gas and cost arrays.

def can_complete_circuit(gas, cost):
    if sum(gas) < sum(cost):
        return -1
        
    start = 0
    tank = 0
    
    for i in range(len(gas)):
        tank += gas[i] - cost[i]
        if tank < 0:
            start = i + 1
            tank = 0
            
    return start

Pattern 11: Load Balancing

Load balancing problems involve distributing work or resources across multiple units while minimizing some cost metric. This pattern is especially relevant in system design and scheduling.

def load_balance(tasks, k_workers):
    """
    Distributes tasks among k workers to minimize maximum load.
    Uses greedy approach of assigning each task to least loaded worker.
    e.g. tasks = [3, 1, 7, 2, 4, 5, 6], k_workers = 3
    output: 8 (maximum load among all workers)
    """
    # Initialize workers with zero load
    workers = [0] * k_workers
    
    # Sort tasks in descending order for better distribution
    tasks.sort(reverse=True)
    
    for task in tasks:
        # Find worker with minimum current load
        min_worker = workers.index(min(workers))
        workers[min_worker] += task
    
    return max(workers)  # Return maximum load across all workers

Pattern 12: Token/Coverage Problems

These problems involve optimizing the placement of elements to cover or satisfy certain conditions with minimum resources.

def min_tokens_to_cover(text, tokens):
    """
    Find minimum tokens needed to cover all characters in text.
    Uses greedy approach to choose token that covers maximum characters.
    e.g. text = "abcde", tokens = ["ab", "bc", "cd", "de"]
    output: 2 (minimum tokens needed to cover all characters)
    """
    tokens.sort(key=len, reverse=True)  # Sort by length for greedy choice
    covered = set()
    min_tokens = 0
    
    for token in tokens:
        if not set(token).issubset(covered):
            min_tokens += 1
            covered.update(token)
    
    return min_tokens

Pattern 13: Sliding Window with Greedy Decisions

This pattern combines the sliding window technique with greedy decision-making, particularly useful for substring and subarray problems that require optimization.

def max_vowels_substring(s, k):
    """
    Find substring of length k with maximum vowels using sliding window
    with greedy decisions.
    e.g. s = "abciiidef", k = 3
    output: 3 (maximum vowels in any substring of length 3)
    """
    vowels = set('aeiou')
    current_count = sum(1 for c in s[:k] if c in vowels)
    max_count = current_count
    
    for i in range(k, len(s)):
        # Greedily update window by removing first char and adding new char
        if s[i-k] in vowels:
            current_count -= 1
        if s[i] in vowels:
            current_count += 1
        max_count = max(max_count, current_count)
    
    return max_count

Pattern 14: Greedy Graph Traversal

Graph traversal problems often benefit from greedy approaches when we need to optimize paths or coverage. Let’s explore this with a visualization.

def max_vowels_substring(s, k):
    """
    Find substring of length k with maximum vowels using sliding window
    with greedy decisions.
    """
    vowels = set('aeiou')
    current_count = sum(1 for c in s[:k] if c in vowels)
    max_count = current_count
    
    for i in range(k, len(s)):
        # Greedily update window by removing first char and adding new char
        if s[i-k] in vowels:
            current_count -= 1
        if s[i] in vowels:
            current_count += 1
        max_count = max(max_count, current_count)
    
    return max_count

Pattern 14: Greedy Graph Traversal

Graph traversal problems often benefit from greedy approaches when we need to optimize paths or coverage. Let’s explore this with a visualization.

Visualization Suggestion

For graph algorithms like Dijkstra’s, we can use a dynamic visualization showing how the algorithm makes greedy choices:

graph TD
    A((1)) --> B((2))
    A --> C((3))
    B --> D((4))
    C --> D
    style A fill:#4CAF50
    style B fill:#81C784
    style C fill:#81C784
    style D fill:#C8E6C9
def optimized_dijkstra(graph, start):
    """
    Optimized implementation of Dijkstra's algorithm using a priority queue
    for greedy path selection.
    """
    from heapq import heappush, heappop
    distances = {vertex: float('infinity') for vertex in graph}
    distances[start] = 0
    pq = [(0, start)]
    
    while pq:
        current_distance, current_vertex = heappop(pq)
        
        # Skip if we've found a better path
        if current_distance > distances[current_vertex]:
            continue
            
        for neighbor, weight in graph[current_vertex].items():
            distance = current_distance + weight
            
            # Greedily update if we find a shorter path
            if distance < distances[neighbor]:
                distances[neighbor] = distance
                heappush(pq, (distance, neighbor))
    
    return distances

Pattern 15: Greedy String Construction

String manipulation problems often have efficient greedy solutions, especially when dealing with palindromes or pattern matching.

Visualization Suggestion

For string construction problems, we can visualize the process using a decision tree:

graph TD
    A["Initial String"] --> B["Choice 1"]
    A --> C["Choice 2"]
    B --> D["Result 1"]
    C --> E["Result 2"]
    style A fill:#E3F2FD
    style B fill:#90CAF9
    style C fill:#90CAF9
    style D fill:#1976D2
    style E fill:#1976D2
def construct_smallest_palindrome(s):
    """
    Greedily constructs the lexicographically smallest palindrome
    from a given string by making optimal character choices.
    """
    chars = list(s)
    left, right = 0, len(s) - 1
    
    while left < right:
        if chars[left] != chars[right]:
            # Greedily choose the smaller character
            if chars[left] < chars[right]:
                chars[right] = chars[left]
            else:
                chars[left] = chars[right]
        left += 1
        right -= 1
    
    return ''.join(chars)

Visual Thinking in Greedy Algorithms

When approaching greedy problems, visual thinking can be incredibly helpful. Here are some key visualization techniques:

1. Decision Trees

For problems with multiple choice points, draw a tree where each level represents a decision:

graph TD
    A[Start] --> B[Choice 1]
    A --> C[Choice 2]
    B --> D[Outcome 1]
    B --> E[Outcome 2]
    C --> F[Outcome 3]
    C --> G[Outcome 4]
    style A fill:#E8F5E9
    style B,C fill:#A5D6A7
    style D,E,F,G fill:#66BB6A

2. State Transition Diagrams

For problems involving state changes or transformations:

stateDiagram-v2
    [*] --> State1
    State1 --> State2: Action1
    State2 --> State3: Action2
    State3 --> [*]: Complete

3. Flow Networks

For problems involving resource allocation or flow:

graph LR
    A((Source)) --> B((Node1))
    A --> C((Node2))
    B --> D((Sink))
    C --> D
    style A fill:#E1F5FE
    style D fill:#B3E5FC

Implementation Tips with Visualization

When implementing greedy algorithms, consider these visualization-aided approaches:

  1. Draw the Problem State: Before coding, sketch out the initial state, possible transitions, and desired end state.

  2. Identify Decision Points: Mark critical junctures where greedy choices must be made.

  3. Track Progress Visually: Use visual markers to track algorithm progress, especially useful in debugging.

def visualize_greedy_progress(state, choices_made):
    """
    Helper function to visualize algorithm progress
    """
    visualization = []
    for step, (state, choice) in enumerate(choices_made):
        visualization.append(f"Step {step}: {state} -> Chose {choice}")
    return "\n".join(visualization)

Advanced Optimization Techniques

Understanding how to optimize greedy algorithms often benefits from visual analysis:

Memory Usage Optimization

Consider this visual representation of space complexity:

graph TD
    A[Input] --> B[Processing]
    B --> C[Output]
    subgraph Memory Usage
        D[Required]
        E[Optional]
        F[Temporary]
    end
    style Memory Usage fill:#E8F5E9

Time Complexity Visualization

Different greedy approaches can be compared visually:

graph LR
    A[Basic] --> B[O(n²)]
    C[Optimized] --> D[O(n log n)]
    E[Best] --> F[O(n)]
    style A,C,E fill:#E3F2FD
    style B,D,F fill:#BBDEFB

Common Pitfalls and Tips

  1. Always Verify Greedy Choice

    • Prove that local optimal choice leads to global optimal solution
    • Use counter-examples to disprove if necessary
  2. Consider Edge Cases

    • Empty input
    • Single element
    • All elements same
    • Negative numbers
    • Maximum/minimum possible values
  3. Optimization Techniques

    • Sort input when order doesn’t matter
    • Use heap for dynamic minimum/maximum
    • Use counting sort for limited range integers
    • Maintain running variables instead of arrays when possible
  4. Time Complexity Analysis

    • Most greedy algorithms run in O(n log n) due to initial sorting
    • Some can achieve O(n) with counting sort or heap
    • Space complexity often O(1) or O(n)

Problem-Solving Framework

  1. Identify if greedy approach is applicable

    • Can you make locally optimal choices?
    • Do these choices lead to global optimum?
  2. Determine the greedy choice

    • What’s the criteria for making each decision?
    • How to efficiently find the best choice?
  3. Prove correctness

    • Use exchange argument
    • Show that any better solution can be transformed to your solution
  4. Optimize implementation

    • Choose appropriate data structures
    • Consider space-time tradeoffs
    • Handle edge cases