Heaps & Priority Queues
A heap is a complete binary tree that satisfies the heap property: every parent is smaller (min-heap) or larger (max-heap) than its children. A priority queue is the abstract data type; a heap is the most common implementation. Together, they solve a class of problems — Top-K, scheduling, median maintenance, merge K sorted streams — with elegant efficiency.
Heap Fundamentals
Structure
A heap is a complete binary tree stored as an array. For a node at index
- Left child:
- Right child:
- Parent:
Array representation: [1, 3, 2, 7, 6, 5, 4]
Min-Heap vs Max-Heap
| Property | Min-Heap | Max-Heap |
|---|---|---|
| Root | Smallest element | Largest element |
| Parent vs children | Parent | Parent |
| Extract operation | Returns minimum | Returns maximum |
| Use case | Priority queues, Dijkstra | Max priority, heap sort |
Operations Complexity
| Operation | Time |
|---|---|
| Insert (push) | |
| Extract min/max (pop) | |
| Peek (get min/max) | |
| Build heap from array | |
| Search for arbitrary element | |
| Delete arbitrary element |
Min-Heap Implementation
TypeScript:
class MinHeap {
private heap: number[] = [];
get size(): number {
return this.heap.length;
}
peek(): number | undefined {
return this.heap[0];
}
push(val: number): void {
this.heap.push(val);
this.bubbleUp(this.heap.length - 1);
}
pop(): number | undefined {
if (this.heap.length === 0) return undefined;
const min = this.heap[0];
const last = this.heap.pop()!;
if (this.heap.length > 0) {
this.heap[0] = last;
this.bubbleDown(0);
}
return min;
}
private bubbleUp(idx: number): void {
while (idx > 0) {
const parent = Math.floor((idx - 1) / 2);
if (this.heap[parent] <= this.heap[idx]) break;
[this.heap[parent], this.heap[idx]] = [this.heap[idx], this.heap[parent]];
idx = parent;
}
}
private bubbleDown(idx: number): void {
const n = this.heap.length;
while (true) {
let smallest = idx;
const left = 2 * idx + 1;
const right = 2 * idx + 2;
if (left < n && this.heap[left] < this.heap[smallest]) smallest = left;
if (right < n && this.heap[right] < this.heap[smallest]) smallest = right;
if (smallest === idx) break;
[this.heap[idx], this.heap[smallest]] = [this.heap[smallest], this.heap[idx]];
idx = smallest;
}
}
}Python:
class MinHeap:
def __init__(self):
self.heap: list[int] = []
def __len__(self) -> int:
return len(self.heap)
def peek(self) -> int:
return self.heap[0]
def push(self, val: int) -> None:
self.heap.append(val)
self._bubble_up(len(self.heap) - 1)
def pop(self) -> int:
if len(self.heap) == 1:
return self.heap.pop()
min_val = self.heap[0]
self.heap[0] = self.heap.pop()
self._bubble_down(0)
return min_val
def _bubble_up(self, idx: int) -> None:
while idx > 0:
parent = (idx - 1) // 2
if self.heap[parent] <= self.heap[idx]:
break
self.heap[parent], self.heap[idx] = self.heap[idx], self.heap[parent]
idx = parent
def _bubble_down(self, idx: int) -> None:
n = len(self.heap)
while True:
smallest = idx
left = 2 * idx + 1
right = 2 * idx + 2
if left < n and self.heap[left] < self.heap[smallest]:
smallest = left
if right < n and self.heap[right] < self.heap[smallest]:
smallest = right
if smallest == idx:
break
self.heap[idx], self.heap[smallest] = self.heap[smallest], self.heap[idx]
idx = smallestPython's heapq
In practice, use Python's built-in heapq module — it's a min-heap implemented in C, much faster than a pure Python implementation.
import heapq
heap = []
heapq.heappush(heap, 5)
heapq.heappush(heap, 2)
heapq.heappush(heap, 8)
smallest = heapq.heappop(heap) # 2
# For a max-heap, negate values:
heapq.heappush(heap, -10) # pushes 10 as max
largest = -heapq.heappop(heap) # 10Build Heap — Why ?
Building a heap by inserting elements one by one is
Intuition: Most nodes are near the bottom of the tree and only need to bubble down a few levels. The sum of work across all levels is:
Python:
def build_heap(arr: list[int]) -> None:
"""In-place heapify. O(n) time."""
n = len(arr)
# Start from last non-leaf node
for i in range(n // 2 - 1, -1, -1):
heapify_down(arr, n, i)
def heapify_down(arr: list[int], size: int, idx: int) -> None:
smallest = idx
left = 2 * idx + 1
right = 2 * idx + 2
if left < size and arr[left] < arr[smallest]:
smallest = left
if right < size and arr[right] < arr[smallest]:
smallest = right
if smallest != idx:
arr[idx], arr[smallest] = arr[smallest], arr[idx]
heapify_down(arr, size, smallest)Top-K Problems
Kth Largest Element
Use a min-heap of size K. After processing all elements, the root is the Kth largest.
TypeScript:
function findKthLargest(nums: number[], k: number): number {
const heap = new MinHeap();
for (const num of nums) {
heap.push(num);
if (heap.size > k) {
heap.pop(); // remove smallest — keeps K largest
}
}
return heap.peek()!;
}Python:
import heapq
def find_kth_largest(nums: list[int], k: int) -> int:
# nlargest is optimized for this exact use case
return heapq.nlargest(k, nums)[-1]
# Or manually with a min-heap of size k:
# heap = []
# for num in nums:
# heapq.heappush(heap, num)
# if len(heap) > k:
# heapq.heappop(heap)
# return heap[0]Complexity:
Top-K Frequent Elements
Python:
from collections import Counter
import heapq
def top_k_frequent(nums: list[int], k: int) -> list[int]:
count = Counter(nums)
# Use a min-heap of size k based on frequency
return heapq.nlargest(k, count.keys(), key=count.get)TypeScript:
function topKFrequent(nums: number[], k: number): number[] {
const freq = new Map<number, number>();
for (const num of nums) {
freq.set(num, (freq.get(num) || 0) + 1);
}
// Bucket sort approach — O(n)
const buckets: number[][] = Array.from({ length: nums.length + 1 }, () => []);
for (const [num, count] of freq) {
buckets[count].push(num);
}
const result: number[] = [];
for (let i = buckets.length - 1; i >= 0 && result.length < k; i--) {
result.push(...buckets[i]);
}
return result.slice(0, k);
}Median Finding (Two Heaps)
Maintain a max-heap for the lower half and a min-heap for the upper half. The median is always at the top of one or both heaps.
TypeScript:
class MedianFinder {
private maxHeap: number[] = []; // lower half (negate for max behavior)
private minHeap: number[] = []; // upper half
addNum(num: number): void {
// Always add to maxHeap first (store as negated)
this.heapPush(this.maxHeap, -num);
// Balance: max of lower half should be <= min of upper half
this.heapPush(this.minHeap, -this.heapPop(this.maxHeap)!);
// Keep sizes balanced (maxHeap can be 1 larger)
if (this.minHeap.length > this.maxHeap.length) {
this.heapPush(this.maxHeap, -this.heapPop(this.minHeap)!);
}
}
findMedian(): number {
if (this.maxHeap.length > this.minHeap.length) {
return -this.maxHeap[0];
}
return (-this.maxHeap[0] + this.minHeap[0]) / 2;
}
private heapPush(heap: number[], val: number): void {
heap.push(val);
let i = heap.length - 1;
while (i > 0) {
const parent = Math.floor((i - 1) / 2);
if (heap[parent] <= heap[i]) break;
[heap[parent], heap[i]] = [heap[i], heap[parent]];
i = parent;
}
}
private heapPop(heap: number[]): number | undefined {
if (heap.length === 0) return undefined;
const top = heap[0];
const last = heap.pop()!;
if (heap.length > 0) {
heap[0] = last;
let i = 0;
while (true) {
let smallest = i;
const l = 2 * i + 1, r = 2 * i + 2;
if (l < heap.length && heap[l] < heap[smallest]) smallest = l;
if (r < heap.length && heap[r] < heap[smallest]) smallest = r;
if (smallest === i) break;
[heap[i], heap[smallest]] = [heap[smallest], heap[i]];
i = smallest;
}
}
return top;
}
}Python:
import heapq
class MedianFinder:
def __init__(self):
self.max_heap: list[int] = [] # lower half (negated)
self.min_heap: list[int] = [] # upper half
def add_num(self, num: int) -> None:
heapq.heappush(self.max_heap, -num)
heapq.heappush(self.min_heap, -heapq.heappop(self.max_heap))
if len(self.min_heap) > len(self.max_heap):
heapq.heappush(self.max_heap, -heapq.heappop(self.min_heap))
def find_median(self) -> float:
if len(self.max_heap) > len(self.min_heap):
return -self.max_heap[0]
return (-self.max_heap[0] + self.min_heap[0]) / 2Complexity:
Priority Queue Applications
Task Scheduling
Process tasks by priority, not arrival order.
Python:
import heapq
from dataclasses import dataclass, field
@dataclass(order=True)
class Task:
priority: int
name: str = field(compare=False)
class TaskScheduler:
def __init__(self):
self.queue: list[Task] = []
def add_task(self, name: str, priority: int) -> None:
heapq.heappush(self.queue, Task(priority, name))
def get_next(self) -> str | None:
if self.queue:
return heapq.heappop(self.queue).name
return NoneMerge K Sorted Streams
Merge K sorted iterators into a single sorted stream — used in database merge operations, log aggregation, and external sorting.
Python:
import heapq
from typing import Iterator
def merge_k_sorted(streams: list[Iterator[int]]) -> Iterator[int]:
heap = []
for i, stream in enumerate(streams):
try:
val = next(stream)
heapq.heappush(heap, (val, i))
except StopIteration:
pass
while heap:
val, i = heapq.heappop(heap)
yield val
try:
next_val = next(streams[i])
heapq.heappush(heap, (next_val, i))
except StopIteration:
passProduction Use Case
This pattern is how Kafka consumers merge partitions, how databases merge sorted runs during compaction, and how log aggregation systems combine logs from multiple sources. The
Dijkstra's Algorithm
The priority queue is the heart of Dijkstra. See Graphs: Dijkstra for the full implementation.
Heap vs Other Data Structures
| Need | Use | Why |
|---|---|---|
| Get min/max quickly | Heap | |
| Get min/max AND search | Balanced BST | |
| Get min/max AND rank queries | Balanced BST or Segment Tree | Heap doesn't support rank |
| Sort partially (Top-K) | Heap of size K | |
| FIFO with priority | Priority queue (heap) | Natural fit |
| Median maintenance | Two heaps |
Common Mistake
A heap does not store elements in sorted order. The heap property only guarantees the root is the min/max. The internal order is not fully sorted — iterating over the heap array does not give sorted output. To extract all elements in sorted order, you must pop
Advanced: Indexed Priority Queue
An indexed priority queue allows updating the priority of an existing element in
Python (simplified):
class IndexedMinPQ:
def __init__(self, max_size: int):
self.max_size = max_size
self.n = 0
self.keys: list[float | None] = [None] * max_size
self.pq = [0] * (max_size + 1) # binary heap
self.qp = [-1] * max_size # inverse: qp[i] = position of i in pq
def contains(self, i: int) -> bool:
return self.qp[i] != -1
def insert(self, i: int, key: float) -> None:
self.n += 1
self.qp[i] = self.n
self.pq[self.n] = i
self.keys[i] = key
self._swim(self.n)
def decrease_key(self, i: int, key: float) -> None:
self.keys[i] = key
self._swim(self.qp[i])
def pop_min(self) -> int:
min_idx = self.pq[1]
self._swap(1, self.n)
self.n -= 1
self._sink(1)
self.qp[min_idx] = -1
self.keys[min_idx] = None
return min_idx
def _swim(self, k: int) -> None:
while k > 1 and self.keys[self.pq[k // 2]] > self.keys[self.pq[k]]:
self._swap(k, k // 2)
k //= 2
def _sink(self, k: int) -> None:
while 2 * k <= self.n:
j = 2 * k
if j < self.n and self.keys[self.pq[j + 1]] < self.keys[self.pq[j]]:
j += 1
if self.keys[self.pq[k]] <= self.keys[self.pq[j]]:
break
self._swap(k, j)
k = j
def _swap(self, i: int, j: int) -> None:
self.pq[i], self.pq[j] = self.pq[j], self.pq[i]
self.qp[self.pq[i]] = i
self.qp[self.pq[j]] = jPractice Problems
| Problem | Pattern | Difficulty |
|---|---|---|
| Kth Largest Element in a Stream | Min-heap of size K | Easy |
| Last Stone Weight | Max-heap simulation | Easy |
| Kth Largest Element in an Array | Min-heap of size K or Quick Select | Medium |
| Top K Frequent Elements | Heap + frequency map | Medium |
| Find Median from Data Stream | Two heaps | Hard |
| Merge K Sorted Lists | Min-heap + linked list | Hard |
| Sliding Window Median | Two heaps + lazy deletion | Hard |
| Reorganize String | Max-heap + greedy | Medium |
| Task Scheduler | Max-heap + cooldown | Medium |
| Smallest Range Covering Elements from K Lists | Min-heap | Hard |
Further Reading
- Sorting & Searching — heap sort algorithm
- Graphs — Dijkstra uses a priority queue
- Linked Lists — merge K sorted lists
- Trees — heaps are complete binary trees
Try It Yourself
Problem 1: Given the array [3, 1, 6, 5, 2, 4], build a min-heap. What does the resulting array look like?
Solution
Use bottom-up heapification starting from the last non-leaf node:
- Start: [3, 1, 6, 5, 2, 4]
- Heapify index 2 (value 6): children are 4. Swap 6 and 4 → [3, 1, 4, 5, 2, 6]
- Heapify index 1 (value 1): children are 5, 2. 1 < both, no swap.
- Heapify index 0 (value 3): children are 1, 4. Swap 3 and 1 → [1, 3, 4, 5, 2, 6]. Heapify index 1: children are 5, 2. Swap 3 and 2 → [1, 2, 4, 5, 3, 6]. Result: [1, 2, 4, 5, 3, 6]
Problem 2: Find the 3rd largest element in [7, 10, 4, 3, 20, 15] using a min-heap of size 3.
Solution
Process elements maintaining a min-heap of size 3:
- 7 → heap=[7]
- 10 → heap=[7, 10]
- 4 → heap=[4, 10, 7]
- 3 → 3 < heap[0]=4, skip (heap full, 3 is smaller than min)
- 20 → push, pop min(4) → heap=[7, 10, 20]
- 15 → push, pop min(7) → heap=[10, 20, 15] Answer: heap[0] = 10 (3rd largest)
Problem 3: Using the two-heap technique, find the median after inserting values [5, 15, 1, 3] one at a time.
Solution
Maintain max-heap (lower half) and min-heap (upper half):
- Insert 5: maxHeap=[-5], minHeap=[] → median = 5
- Insert 15: maxHeap=[-5], minHeap=[15] → median = (5+15)/2 = 10
- Insert 1: maxHeap=[-5, -1], minHeap=[15] → rebalance: maxHeap=[-5], push 5 to min, then rebalance → maxHeap=[-5, -1], minHeap=[15] → median = 5
- Insert 3: maxHeap=[-5, -3, -1], minHeap=[15] → rebalance → maxHeap=[-5, -3], minHeap=[5, 15] → wait, let me redo carefully. After all 4 elements [5, 15, 1, 3]: sorted = [1, 3, 5, 15], median = (3+5)/2 = 4
Problem 4: Merge three sorted arrays [1, 4, 7], [2, 5, 8], [3, 6, 9] using a min-heap.
Solution
Initialize heap with first element of each array: [(1,0,0), (2,1,0), (3,2,0)] (value, array_index, element_index).
- Pop 1, push 4 from array 0. Result: [1]
- Pop 2, push 5 from array 1. Result: [1, 2]
- Pop 3, push 6 from array 2. Result: [1, 2, 3]
- Pop 4, push 7 from array 0. Result: [1, 2, 3, 4]
- Continue until all elements are extracted. Result: [1, 2, 3, 4, 5, 6, 7, 8, 9]. Time:
where N=9, K=3.
Problem 5: Why is building a heap from an array
Solution
When building bottom-up, most nodes are near the bottom and need very few swaps:
leaf nodes: 0 swaps each nodes at height 1: at most 1 swap each nodes at height 2: at most 2 swaps each - Total work:
The series converges because higher nodes (which require more work) are exponentially fewer in number.
Quick Quiz
1. What is the time complexity of extracting the minimum element from a min-heap?
- a)
- b)
- c)
- d)
Answer
b)
2. For finding the K-th largest element, what type and size of heap should you use?
- a) Max-heap of size
- b) Min-heap of size
- c) Max-heap of size
- d) Min-heap of size
Answer
b) Min-heap of size
3. In the two-heap median finding technique, what type of heap stores the lower half?
- a) Min-heap
- b) Max-heap
- c) Either works
- d) A balanced BST
Answer
b) Max-heap — The lower half uses a max-heap so the maximum of the lower half (the median candidate) is at the root in
4. Does a heap store elements in fully sorted order?
- a) Yes, always
- b) Only if it is a max-heap
- c) No -- it only guarantees the root is the min/max
- d) Only at the leaf level
Answer
c) No -- it only guarantees the root is the min/max — The heap property only ensures parent-child ordering, not sibling ordering. To get all elements in sorted order, you must extract them one by one (
5. What is Python's heapq module -- a min-heap or max-heap?
- a) Max-heap
- b) Min-heap
- c) Both
- d) Neither
Answer
b) Min-heap — Python's heapq implements a min-heap only. To simulate a max-heap, negate all values before inserting and negate again when extracting.