Skip to content
Unverified — AI-generated content. Help verify this page

Heaps & Priority Queues

A heap is a complete binary tree that satisfies the heap property: every parent is smaller (min-heap) or larger (max-heap) than its children. A priority queue is the abstract data type; a heap is the most common implementation. Together, they solve a class of problems — Top-K, scheduling, median maintenance, merge K sorted streams — with elegant efficiency.

Heap Fundamentals

Structure

A heap is a complete binary tree stored as an array. For a node at index i (0-indexed):

  • Left child: 2i+1
  • Right child: 2i+2
  • Parent: (i1)/2

Array representation: [1, 3, 2, 7, 6, 5, 4]

Min-Heap vs Max-Heap

PropertyMin-HeapMax-Heap
RootSmallest elementLargest element
Parent vs childrenParent childrenParent children
Extract operationReturns minimumReturns maximum
Use casePriority queues, DijkstraMax priority, heap sort

Operations Complexity

OperationTime
Insert (push)O(logn)
Extract min/max (pop)O(logn)
Peek (get min/max)O(1)
Build heap from arrayO(n)
Search for arbitrary elementO(n)
Delete arbitrary elementO(n) (find) + O(logn) (remove)

Min-Heap Implementation

TypeScript:

typescript
class MinHeap {
  private heap: number[] = [];

  get size(): number {
    return this.heap.length;
  }

  peek(): number | undefined {
    return this.heap[0];
  }

  push(val: number): void {
    this.heap.push(val);
    this.bubbleUp(this.heap.length - 1);
  }

  pop(): number | undefined {
    if (this.heap.length === 0) return undefined;

    const min = this.heap[0];
    const last = this.heap.pop()!;

    if (this.heap.length > 0) {
      this.heap[0] = last;
      this.bubbleDown(0);
    }

    return min;
  }

  private bubbleUp(idx: number): void {
    while (idx > 0) {
      const parent = Math.floor((idx - 1) / 2);
      if (this.heap[parent] <= this.heap[idx]) break;
      [this.heap[parent], this.heap[idx]] = [this.heap[idx], this.heap[parent]];
      idx = parent;
    }
  }

  private bubbleDown(idx: number): void {
    const n = this.heap.length;

    while (true) {
      let smallest = idx;
      const left = 2 * idx + 1;
      const right = 2 * idx + 2;

      if (left < n && this.heap[left] < this.heap[smallest]) smallest = left;
      if (right < n && this.heap[right] < this.heap[smallest]) smallest = right;

      if (smallest === idx) break;

      [this.heap[idx], this.heap[smallest]] = [this.heap[smallest], this.heap[idx]];
      idx = smallest;
    }
  }
}

Python:

python
class MinHeap:
    def __init__(self):
        self.heap: list[int] = []

    def __len__(self) -> int:
        return len(self.heap)

    def peek(self) -> int:
        return self.heap[0]

    def push(self, val: int) -> None:
        self.heap.append(val)
        self._bubble_up(len(self.heap) - 1)

    def pop(self) -> int:
        if len(self.heap) == 1:
            return self.heap.pop()

        min_val = self.heap[0]
        self.heap[0] = self.heap.pop()
        self._bubble_down(0)
        return min_val

    def _bubble_up(self, idx: int) -> None:
        while idx > 0:
            parent = (idx - 1) // 2
            if self.heap[parent] <= self.heap[idx]:
                break
            self.heap[parent], self.heap[idx] = self.heap[idx], self.heap[parent]
            idx = parent

    def _bubble_down(self, idx: int) -> None:
        n = len(self.heap)
        while True:
            smallest = idx
            left = 2 * idx + 1
            right = 2 * idx + 2

            if left < n and self.heap[left] < self.heap[smallest]:
                smallest = left
            if right < n and self.heap[right] < self.heap[smallest]:
                smallest = right

            if smallest == idx:
                break

            self.heap[idx], self.heap[smallest] = self.heap[smallest], self.heap[idx]
            idx = smallest

Python's heapq

In practice, use Python's built-in heapq module — it's a min-heap implemented in C, much faster than a pure Python implementation.

python
import heapq

heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 2)
heapq.heappush(heap, 8)
smallest = heapq.heappop(heap)  # 2

# For a max-heap, negate values:
heapq.heappush(heap, -10)  # pushes 10 as max
largest = -heapq.heappop(heap)  # 10

Build Heap — Why O(n)?

Building a heap by inserting elements one by one is O(nlogn). But building bottom-up (heapifying from the last non-leaf to the root) is O(n).

Intuition: Most nodes are near the bottom of the tree and only need to bubble down a few levels. The sum of work across all levels is:

h=0lognn2h+1O(h)=O(n)h=0h2h+1=O(n)

Python:

python
def build_heap(arr: list[int]) -> None:
    """In-place heapify. O(n) time."""
    n = len(arr)
    # Start from last non-leaf node
    for i in range(n // 2 - 1, -1, -1):
        heapify_down(arr, n, i)

def heapify_down(arr: list[int], size: int, idx: int) -> None:
    smallest = idx
    left = 2 * idx + 1
    right = 2 * idx + 2

    if left < size and arr[left] < arr[smallest]:
        smallest = left
    if right < size and arr[right] < arr[smallest]:
        smallest = right

    if smallest != idx:
        arr[idx], arr[smallest] = arr[smallest], arr[idx]
        heapify_down(arr, size, smallest)

Top-K Problems

Kth Largest Element

Use a min-heap of size K. After processing all elements, the root is the Kth largest.

TypeScript:

typescript
function findKthLargest(nums: number[], k: number): number {
  const heap = new MinHeap();

  for (const num of nums) {
    heap.push(num);
    if (heap.size > k) {
      heap.pop(); // remove smallest — keeps K largest
    }
  }

  return heap.peek()!;
}

Python:

python
import heapq

def find_kth_largest(nums: list[int], k: int) -> int:
    # nlargest is optimized for this exact use case
    return heapq.nlargest(k, nums)[-1]

    # Or manually with a min-heap of size k:
    # heap = []
    # for num in nums:
    #     heapq.heappush(heap, num)
    #     if len(heap) > k:
    #         heapq.heappop(heap)
    # return heap[0]

Complexity: O(nlogk) time, O(k) space. When kn, this is much better than sorting (O(nlogn)).

Top-K Frequent Elements

Python:

python
from collections import Counter
import heapq

def top_k_frequent(nums: list[int], k: int) -> list[int]:
    count = Counter(nums)
    # Use a min-heap of size k based on frequency
    return heapq.nlargest(k, count.keys(), key=count.get)

TypeScript:

typescript
function topKFrequent(nums: number[], k: number): number[] {
  const freq = new Map<number, number>();
  for (const num of nums) {
    freq.set(num, (freq.get(num) || 0) + 1);
  }

  // Bucket sort approach — O(n)
  const buckets: number[][] = Array.from({ length: nums.length + 1 }, () => []);
  for (const [num, count] of freq) {
    buckets[count].push(num);
  }

  const result: number[] = [];
  for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
    result.push(...buckets[i]);
  }

  return result.slice(0, k);
}

Median Finding (Two Heaps)

Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps.

TypeScript:

typescript
class MedianFinder {
  private maxHeap: number[] = []; // lower half (negate for max behavior)
  private minHeap: number[] = []; // upper half

  addNum(num: number): void {
    // Always add to maxHeap first (store as negated)
    this.heapPush(this.maxHeap, -num);

    // Balance: max of lower half should be <= min of upper half
    this.heapPush(this.minHeap, -this.heapPop(this.maxHeap)!);

    // Keep sizes balanced (maxHeap can be 1 larger)
    if (this.minHeap.length > this.maxHeap.length) {
      this.heapPush(this.maxHeap, -this.heapPop(this.minHeap)!);
    }
  }

  findMedian(): number {
    if (this.maxHeap.length > this.minHeap.length) {
      return -this.maxHeap[0];
    }
    return (-this.maxHeap[0] + this.minHeap[0]) / 2;
  }

  private heapPush(heap: number[], val: number): void {
    heap.push(val);
    let i = heap.length - 1;
    while (i > 0) {
      const parent = Math.floor((i - 1) / 2);
      if (heap[parent] <= heap[i]) break;
      [heap[parent], heap[i]] = [heap[i], heap[parent]];
      i = parent;
    }
  }

  private heapPop(heap: number[]): number | undefined {
    if (heap.length === 0) return undefined;
    const top = heap[0];
    const last = heap.pop()!;
    if (heap.length > 0) {
      heap[0] = last;
      let i = 0;
      while (true) {
        let smallest = i;
        const l = 2 * i + 1, r = 2 * i + 2;
        if (l < heap.length && heap[l] < heap[smallest]) smallest = l;
        if (r < heap.length && heap[r] < heap[smallest]) smallest = r;
        if (smallest === i) break;
        [heap[i], heap[smallest]] = [heap[smallest], heap[i]];
        i = smallest;
      }
    }
    return top;
  }
}

Python:

python
import heapq

class MedianFinder:
    def __init__(self):
        self.max_heap: list[int] = []  # lower half (negated)
        self.min_heap: list[int] = []  # upper half

    def add_num(self, num: int) -> None:
        heapq.heappush(self.max_heap, -num)
        heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))

        if len(self.min_heap) > len(self.max_heap):
            heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))

    def find_median(self) -> float:
        if len(self.max_heap) > len(self.min_heap):
            return -self.max_heap[0]
        return (-self.max_heap[0] + self.min_heap[0]) / 2

Complexity: O(logn) per insertion, O(1) for median query.

Priority Queue Applications

Task Scheduling

Process tasks by priority, not arrival order.

Python:

python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Task:
    priority: int
    name: str = field(compare=False)

class TaskScheduler:
    def __init__(self):
        self.queue: list[Task] = []

    def add_task(self, name: str, priority: int) -> None:
        heapq.heappush(self.queue, Task(priority, name))

    def get_next(self) -> str | None:
        if self.queue:
            return heapq.heappop(self.queue).name
        return None

Merge K Sorted Streams

Merge K sorted iterators into a single sorted stream — used in database merge operations, log aggregation, and external sorting.

Python:

python
import heapq
from typing import Iterator

def merge_k_sorted(streams: list[Iterator[int]]) -> Iterator[int]:
    heap = []
    for i, stream in enumerate(streams):
        try:
            val = next(stream)
            heapq.heappush(heap, (val, i))
        except StopIteration:
            pass

    while heap:
        val, i = heapq.heappop(heap)
        yield val
        try:
            next_val = next(streams[i])
            heapq.heappush(heap, (next_val, i))
        except StopIteration:
            pass

Production Use Case

This pattern is how Kafka consumers merge partitions, how databases merge sorted runs during compaction, and how log aggregation systems combine logs from multiple sources. The O(NlogK) complexity makes it efficient even with hundreds of streams.

Dijkstra's Algorithm

The priority queue is the heart of Dijkstra. See Graphs: Dijkstra for the full implementation.

Heap vs Other Data Structures

NeedUseWhy
Get min/max quicklyHeapO(1) peek
Get min/max AND searchBalanced BSTO(logn) for all operations
Get min/max AND rank queriesBalanced BST or Segment TreeHeap doesn't support rank
Sort partially (Top-K)Heap of size KO(nlogk) vs O(nlogn)
FIFO with priorityPriority queue (heap)Natural fit
Median maintenanceTwo heapsO(logn) insert, O(1) query

Common Mistake

A heap does not store elements in sorted order. The heap property only guarantees the root is the min/max. The internal order is not fully sorted — iterating over the heap array does not give sorted output. To extract all elements in sorted order, you must pop n times: O(nlogn).

Advanced: Indexed Priority Queue

An indexed priority queue allows updating the priority of an existing element in O(logn). This is essential for Dijkstra's algorithm with decrease-key operations and for scheduling systems where priorities change.

Python (simplified):

python
class IndexedMinPQ:
    def __init__(self, max_size: int):
        self.max_size = max_size
        self.n = 0
        self.keys: list[float | None] = [None] * max_size
        self.pq = [0] * (max_size + 1)    # binary heap
        self.qp = [-1] * max_size          # inverse: qp[i] = position of i in pq

    def contains(self, i: int) -> bool:
        return self.qp[i] != -1

    def insert(self, i: int, key: float) -> None:
        self.n += 1
        self.qp[i] = self.n
        self.pq[self.n] = i
        self.keys[i] = key
        self._swim(self.n)

    def decrease_key(self, i: int, key: float) -> None:
        self.keys[i] = key
        self._swim(self.qp[i])

    def pop_min(self) -> int:
        min_idx = self.pq[1]
        self._swap(1, self.n)
        self.n -= 1
        self._sink(1)
        self.qp[min_idx] = -1
        self.keys[min_idx] = None
        return min_idx

    def _swim(self, k: int) -> None:
        while k > 1 and self.keys[self.pq[k // 2]] > self.keys[self.pq[k]]:
            self._swap(k, k // 2)
            k //= 2

    def _sink(self, k: int) -> None:
        while 2 * k <= self.n:
            j = 2 * k
            if j < self.n and self.keys[self.pq[j + 1]] < self.keys[self.pq[j]]:
                j += 1
            if self.keys[self.pq[k]] <= self.keys[self.pq[j]]:
                break
            self._swap(k, j)
            k = j

    def _swap(self, i: int, j: int) -> None:
        self.pq[i], self.pq[j] = self.pq[j], self.pq[i]
        self.qp[self.pq[i]] = i
        self.qp[self.pq[j]] = j

Practice Problems

ProblemPatternDifficulty
Kth Largest Element in a StreamMin-heap of size KEasy
Last Stone WeightMax-heap simulationEasy
Kth Largest Element in an ArrayMin-heap of size K or Quick SelectMedium
Top K Frequent ElementsHeap + frequency mapMedium
Find Median from Data StreamTwo heapsHard
Merge K Sorted ListsMin-heap + linked listHard
Sliding Window MedianTwo heaps + lazy deletionHard
Reorganize StringMax-heap + greedyMedium
Task SchedulerMax-heap + cooldownMedium
Smallest Range Covering Elements from K ListsMin-heapHard

Further Reading

Try It Yourself

Problem 1: Given the array [3, 1, 6, 5, 2, 4], build a min-heap. What does the resulting array look like?

Solution

Use bottom-up heapification starting from the last non-leaf node:

  • Start: [3, 1, 6, 5, 2, 4]
  • Heapify index 2 (value 6): children are 4. Swap 6 and 4 → [3, 1, 4, 5, 2, 6]
  • Heapify index 1 (value 1): children are 5, 2. 1 < both, no swap.
  • Heapify index 0 (value 3): children are 1, 4. Swap 3 and 1 → [1, 3, 4, 5, 2, 6]. Heapify index 1: children are 5, 2. Swap 3 and 2 → [1, 2, 4, 5, 3, 6]. Result: [1, 2, 4, 5, 3, 6]

Problem 2: Find the 3rd largest element in [7, 10, 4, 3, 20, 15] using a min-heap of size 3.

Solution

Process elements maintaining a min-heap of size 3:

  • 7 → heap=[7]
  • 10 → heap=[7, 10]
  • 4 → heap=[4, 10, 7]
  • 3 → 3 < heap[0]=4, skip (heap full, 3 is smaller than min)
  • 20 → push, pop min(4) → heap=[7, 10, 20]
  • 15 → push, pop min(7) → heap=[10, 20, 15] Answer: heap[0] = 10 (3rd largest)

Problem 3: Using the two-heap technique, find the median after inserting values [5, 15, 1, 3] one at a time.

Solution

Maintain max-heap (lower half) and min-heap (upper half):

  • Insert 5: maxHeap=[-5], minHeap=[] → median = 5
  • Insert 15: maxHeap=[-5], minHeap=[15] → median = (5+15)/2 = 10
  • Insert 1: maxHeap=[-5, -1], minHeap=[15] → rebalance: maxHeap=[-5], push 5 to min, then rebalance → maxHeap=[-5, -1], minHeap=[15] → median = 5
  • Insert 3: maxHeap=[-5, -3, -1], minHeap=[15] → rebalance → maxHeap=[-5, -3], minHeap=[5, 15] → wait, let me redo carefully. After all 4 elements [5, 15, 1, 3]: sorted = [1, 3, 5, 15], median = (3+5)/2 = 4

Problem 4: Merge three sorted arrays [1, 4, 7], [2, 5, 8], [3, 6, 9] using a min-heap.

Solution

Initialize heap with first element of each array: [(1,0,0), (2,1,0), (3,2,0)] (value, array_index, element_index).

  • Pop 1, push 4 from array 0. Result: [1]
  • Pop 2, push 5 from array 1. Result: [1, 2]
  • Pop 3, push 6 from array 2. Result: [1, 2, 3]
  • Pop 4, push 7 from array 0. Result: [1, 2, 3, 4]
  • Continue until all elements are extracted. Result: [1, 2, 3, 4, 5, 6, 7, 8, 9]. Time: O(NlogK) where N=9, K=3.

Problem 5: Why is building a heap from an array O(n) instead of O(nlogn)?

Solution

When building bottom-up, most nodes are near the bottom and need very few swaps:

  • n/2 leaf nodes: 0 swaps each
  • n/4 nodes at height 1: at most 1 swap each
  • n/8 nodes at height 2: at most 2 swaps each
  • Total work: h=0lognn2h+1h=O(n) The series converges because higher nodes (which require more work) are exponentially fewer in number.

Quick Quiz

1. What is the time complexity of extracting the minimum element from a min-heap?

  • a) O(1)
  • b) O(logn)
  • c) O(n)
  • d) O(nlogn)
Answer

b) O(logn) — The minimum is at the root (O(1) to access), but after removing it, the last element is moved to the root and must bubble down through at most O(logn) levels to restore the heap property.

2. For finding the K-th largest element, what type and size of heap should you use?

  • a) Max-heap of size n
  • b) Min-heap of size K
  • c) Max-heap of size K
  • d) Min-heap of size n
Answer

b) Min-heap of size K — Maintain a min-heap of the K largest elements seen so far. When the heap exceeds size K, remove the minimum. After processing all elements, the root is the K-th largest.

3. In the two-heap median finding technique, what type of heap stores the lower half?

  • a) Min-heap
  • b) Max-heap
  • c) Either works
  • d) A balanced BST
Answer

b) Max-heap — The lower half uses a max-heap so the maximum of the lower half (the median candidate) is at the root in O(1). The upper half uses a min-heap so the minimum of the upper half is also at the root.

4. Does a heap store elements in fully sorted order?

  • a) Yes, always
  • b) Only if it is a max-heap
  • c) No -- it only guarantees the root is the min/max
  • d) Only at the leaf level
Answer

c) No -- it only guarantees the root is the min/max — The heap property only ensures parent-child ordering, not sibling ordering. To get all elements in sorted order, you must extract them one by one (O(nlogn) total).

5. What is Python's heapq module -- a min-heap or max-heap?

  • a) Max-heap
  • b) Min-heap
  • c) Both
  • d) Neither
Answer

b) Min-heap — Python's heapq implements a min-heap only. To simulate a max-heap, negate all values before inserting and negate again when extracting.

"What I cannot create, I do not understand." — Richard Feynman