Sorting & Searching
Sorting and searching are the workhorses of computer science. Every database query involves sorting or searching. Every recommendation engine ranks (sorts) results. Every time you call Array.prototype.sort() or list.sort(), you are running a carefully engineered sorting algorithm. Understanding these algorithms is not about reimplementing them — it is about knowing their trade-offs so you can choose the right tool, and about mastering binary search, which is arguably the most versatile technique in all of algorithms.
Sorting Algorithms Overview
| Algorithm | Best | Average | Worst | Space | Stable? | In-Place? |
|---|---|---|---|---|---|---|
| Bubble Sort | Yes | Yes | ||||
| Selection Sort | No | Yes | ||||
| Insertion Sort | Yes | Yes | ||||
| Merge Sort | Yes | No | ||||
| Quick Sort | No | Yes | ||||
| Heap Sort | No | Yes | ||||
| Counting Sort | Yes | No | ||||
| Radix Sort | Yes | No |
When to Use Which
- Nearly sorted data: Insertion sort (
best case) - General purpose: Merge sort (guaranteed
, stable) - In-place, general purpose: Quicksort (fastest in practice for random data)
- Integer data with known range: Counting sort or radix sort (
) - External sorting (data doesn't fit in memory): Merge sort (sequential access pattern)
- Priority queue operations: Heap sort or partial sorting with a heap
Merge Sort
Divide the array in half, recursively sort each half, then merge the sorted halves. The merge step is the key insight — merging two sorted arrays takes
TypeScript:
function mergeSort(arr: number[]): number[] {
if (arr.length <= 1) return arr;
const mid = Math.floor(arr.length / 2);
const left = mergeSort(arr.slice(0, mid));
const right = mergeSort(arr.slice(mid));
return merge(left, right);
}
function merge(left: number[], right: number[]): number[] {
const result: number[] = [];
let i = 0, j = 0;
while (i < left.length && j < right.length) {
if (left[i] <= right[j]) {
result.push(left[i++]);
} else {
result.push(right[j++]);
}
}
return result.concat(left.slice(i), right.slice(j));
}Python:
def merge_sort(arr: list[int]) -> list[int]:
if len(arr) <= 1:
return arr
mid = len(arr) // 2
left = merge_sort(arr[:mid])
right = merge_sort(arr[mid:])
return merge(left, right)
def merge(left: list[int], right: list[int]) -> list[int]:
result = []
i = j = 0
while i < len(left) and j < len(right):
if left[i] <= right[j]:
result.append(left[i])
i += 1
else:
result.append(right[j])
j += 1
result.extend(left[i:])
result.extend(right[j:])
return resultComplexity:
Why merge sort matters in production: It is the basis of external sorting (sorting data that doesn't fit in RAM). The merge step only needs sequential access, making it efficient for disk I/O. TimSort (Python's list.sort(), Java's Arrays.sort() for objects) is a hybrid of merge sort and insertion sort.
Quick Sort
Choose a pivot, partition the array so everything less than the pivot is on the left and everything greater is on the right, then recursively sort each side.
TypeScript:
function quickSort(arr: number[], lo = 0, hi = arr.length - 1): void {
if (lo >= hi) return;
const pivot = partition(arr, lo, hi);
quickSort(arr, lo, pivot - 1);
quickSort(arr, pivot + 1, hi);
}
function partition(arr: number[], lo: number, hi: number): number {
const pivot = arr[hi]; // last element as pivot
let i = lo - 1;
for (let j = lo; j < hi; j++) {
if (arr[j] <= pivot) {
i++;
[arr[i], arr[j]] = [arr[j], arr[i]];
}
}
[arr[i + 1], arr[hi]] = [arr[hi], arr[i + 1]];
return i + 1;
}Python:
import random
def quick_sort(arr: list[int], lo: int = 0, hi: int | None = None) -> None:
if hi is None:
hi = len(arr) - 1
if lo >= hi:
return
pivot_idx = partition(arr, lo, hi)
quick_sort(arr, lo, pivot_idx - 1)
quick_sort(arr, pivot_idx + 1, hi)
def partition(arr: list[int], lo: int, hi: int) -> int:
# Randomized pivot to avoid worst-case on sorted input
rand_idx = random.randint(lo, hi)
arr[rand_idx], arr[hi] = arr[hi], arr[rand_idx]
pivot = arr[hi]
i = lo - 1
for j in range(lo, hi):
if arr[j] <= pivot:
i += 1
arr[i], arr[j] = arr[j], arr[i]
arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
return i + 1Complexity:
WARNING
Quicksort's worst case occurs on already-sorted data when the pivot is the first or last element. Always use randomized pivot selection or median-of-three to avoid this in production.
Quick Select (Kth Smallest Element)
A modified quicksort that only recurses into the partition containing the target index. Expected
Python:
def quick_select(arr: list[int], k: int) -> int:
"""Find the kth smallest element (0-indexed)."""
lo, hi = 0, len(arr) - 1
while lo <= hi:
pivot_idx = partition(arr, lo, hi)
if pivot_idx == k:
return arr[k]
elif pivot_idx < k:
lo = pivot_idx + 1
else:
hi = pivot_idx - 1
return arr[lo]Complexity:
Heap Sort
Build a max-heap from the array, then repeatedly extract the maximum element.
TypeScript:
function heapSort(arr: number[]): void {
const n = arr.length;
// Build max-heap
for (let i = Math.floor(n / 2) - 1; i >= 0; i--) {
heapify(arr, n, i);
}
// Extract elements from heap one by one
for (let i = n - 1; i > 0; i--) {
[arr[0], arr[i]] = [arr[i], arr[0]]; // move max to end
heapify(arr, i, 0);
}
}
function heapify(arr: number[], heapSize: number, rootIdx: number): void {
let largest = rootIdx;
const left = 2 * rootIdx + 1;
const right = 2 * rootIdx + 2;
if (left < heapSize && arr[left] > arr[largest]) largest = left;
if (right < heapSize && arr[right] > arr[largest]) largest = right;
if (largest !== rootIdx) {
[arr[rootIdx], arr[largest]] = [arr[largest], arr[rootIdx]];
heapify(arr, heapSize, largest);
}
}Python:
def heap_sort(arr: list[int]) -> None:
n = len(arr)
for i in range(n // 2 - 1, -1, -1):
heapify(arr, n, i)
for i in range(n - 1, 0, -1):
arr[0], arr[i] = arr[i], arr[0]
heapify(arr, i, 0)
def heapify(arr: list[int], heap_size: int, root: int) -> None:
largest = root
left = 2 * root + 1
right = 2 * root + 2
if left < heap_size and arr[left] > arr[largest]:
largest = left
if right < heap_size and arr[right] > arr[largest]:
largest = right
if largest != root:
arr[root], arr[largest] = arr[largest], arr[root]
heapify(arr, heap_size, largest)Complexity:
Non-Comparison Sorts
These break the
Counting Sort
For integers in a known range
Python:
def counting_sort(arr: list[int]) -> list[int]:
if not arr:
return []
max_val = max(arr)
count = [0] * (max_val + 1)
for num in arr:
count[num] += 1
result = []
for val, cnt in enumerate(count):
result.extend([val] * cnt)
return resultComplexity:
Radix Sort
Sort by each digit position, from least significant to most significant, using a stable sort (like counting sort) at each level.
Python:
def radix_sort(arr: list[int]) -> list[int]:
if not arr:
return []
max_val = max(arr)
exp = 1
while max_val // exp > 0:
arr = counting_sort_by_digit(arr, exp)
exp *= 10
return arr
def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
n = len(arr)
output = [0] * n
count = [0] * 10
for num in arr:
digit = (num // exp) % 10
count[digit] += 1
for i in range(1, 10):
count[i] += count[i - 1]
for i in range(n - 1, -1, -1):
digit = (arr[i] // exp) % 10
output[count[digit] - 1] = arr[i]
count[digit] -= 1
return outputComplexity:
Binary Search
Binary search is the most important searching technique. It works on sorted data and eliminates half the search space on each step.
Standard Binary Search
TypeScript:
function binarySearch(nums: number[], target: number): number {
let lo = 0;
let hi = nums.length - 1;
while (lo <= hi) {
const mid = lo + Math.floor((hi - lo) / 2); // avoid overflow
if (nums[mid] === target) return mid;
if (nums[mid] < target) lo = mid + 1;
else hi = mid - 1;
}
return -1; // not found
}Python:
def binary_search(nums: list[int], target: int) -> int:
lo, hi = 0, len(nums) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2
if nums[mid] == target:
return mid
elif nums[mid] < target:
lo = mid + 1
else:
hi = mid - 1
return -1Complexity:
DANGER
Use lo + (hi - lo) // 2 instead of (lo + hi) // 2 to prevent integer overflow in languages with fixed-size integers (C, C++, Java). In Python and TypeScript this isn't strictly necessary, but it's a good habit.
Binary Search Variations
These variations are where binary search becomes powerful. The key insight: binary search doesn't just find a target — it finds a boundary.
Find Leftmost (First Occurrence)
TypeScript:
function findFirst(nums: number[], target: number): number {
let lo = 0, hi = nums.length - 1;
let result = -1;
while (lo <= hi) {
const mid = lo + Math.floor((hi - lo) / 2);
if (nums[mid] === target) {
result = mid;
hi = mid - 1; // keep searching left
} else if (nums[mid] < target) {
lo = mid + 1;
} else {
hi = mid - 1;
}
}
return result;
}Find Rightmost (Last Occurrence)
TypeScript:
function findLast(nums: number[], target: number): number {
let lo = 0, hi = nums.length - 1;
let result = -1;
while (lo <= hi) {
const mid = lo + Math.floor((hi - lo) / 2);
if (nums[mid] === target) {
result = mid;
lo = mid + 1; // keep searching right
} else if (nums[mid] < target) {
lo = mid + 1;
} else {
hi = mid - 1;
}
}
return result;
}Search in Rotated Sorted Array
Python:
def search_rotated(nums: list[int], target: int) -> int:
lo, hi = 0, len(nums) - 1
while lo <= hi:
mid = lo + (hi - lo) // 2
if nums[mid] == target:
return mid
# Determine which half is sorted
if nums[lo] <= nums[mid]: # left half is sorted
if nums[lo] <= target < nums[mid]:
hi = mid - 1
else:
lo = mid + 1
else: # right half is sorted
if nums[mid] < target <= nums[hi]:
lo = mid + 1
else:
hi = mid - 1
return -1Binary Search on Answer
When you can frame a problem as "what is the minimum/maximum value such that a condition is true?" — binary search on that value.
Example: Minimum Capacity to Ship Packages Within D Days
Python:
def ship_within_days(weights: list[int], days: int) -> int:
def can_ship(capacity: int) -> bool:
current_load = 0
needed_days = 1
for w in weights:
if current_load + w > capacity:
needed_days += 1
current_load = 0
current_load += w
return needed_days <= days
lo = max(weights) # min possible capacity
hi = sum(weights) # max possible capacity
while lo < hi:
mid = lo + (hi - lo) // 2
if can_ship(mid):
hi = mid
else:
lo = mid + 1
return loBinary Search on Answer
This pattern appears constantly in interviews and competitive programming. The framework:
- Define a predicate function
condition(x) -> bool - Find the boundary where the predicate flips from False to True (or vice versa)
- Set
loandhito the valid range - Binary search for the boundary
If condition(mid) is True and you want the minimum valid value: hi = mid If condition(mid) is False and you want the minimum valid value: lo = mid + 1
Sorting in Practice
Language Built-in Sorts
| Language | Algorithm | Stable? |
|---|---|---|
Python list.sort() | TimSort (merge + insertion) | Yes |
JavaScript Array.sort() | TimSort (V8), varies by engine | Depends on engine |
Java Arrays.sort() (primitives) | Dual-pivot quicksort | No |
Java Arrays.sort() (objects) | TimSort | Yes |
C++ std::sort() | Introsort (quick + heap + insertion) | No |
C++ std::stable_sort() | Merge sort | Yes |
Custom Comparators
TypeScript:
// Sort objects by multiple keys
const users = [
{ name: "Alice", age: 30 },
{ name: "Bob", age: 25 },
{ name: "Charlie", age: 30 },
];
users.sort((a, b) => {
if (a.age !== b.age) return a.age - b.age; // ascending age
return a.name.localeCompare(b.name); // then alphabetical
});Python:
from functools import cmp_to_key
# Sort by multiple criteria
users = [("Alice", 30), ("Bob", 25), ("Charlie", 30)]
users.sort(key=lambda u: (u[1], u[0])) # (age, name) tuple comparison
# Custom comparator for complex ordering
def compare_versions(v1: str, v2: str) -> int:
parts1 = list(map(int, v1.split(".")))
parts2 = list(map(int, v2.split(".")))
for p1, p2 in zip(parts1, parts2):
if p1 != p2:
return p1 - p2
return len(parts1) - len(parts2)
versions = ["1.2.3", "1.0.1", "1.2.1"]
versions.sort(key=cmp_to_key(compare_versions))The Lower Bound
Any comparison-based sorting algorithm must make at least
This is why counting sort and radix sort are faster — they bypass comparisons by using the structure of the data itself.
Practice Problems
| Problem | Technique | Difficulty |
|---|---|---|
| Merge Sorted Array | Merge from back | Easy |
| Sort Colors (Dutch National Flag) | Three-way partition | Medium |
| Kth Largest Element | Quick select / heap | Medium |
| Search in Rotated Sorted Array | Modified binary search | Medium |
| Find Peak Element | Binary search | Medium |
| Search a 2D Matrix | Binary search | Medium |
| Median of Two Sorted Arrays | Binary search | Hard |
| Count of Smaller Numbers After Self | Merge sort + count | Hard |
| Find Minimum in Rotated Sorted Array | Binary search | Medium |
| Capacity To Ship Packages | Binary search on answer | Medium |
Further Reading
- Heaps & Priority Queues — heap sort and partial sorting
- Arrays & Strings — two pointers on sorted arrays
- Dynamic Programming — LIS uses binary search optimization
- Database Query Planning — how databases choose sort algorithms
Try It Yourself
Problem 1: Sort the array [5, 3, 8, 4, 2] using merge sort. Show the divide and merge steps.
Solution
Divide:
- [5, 3, 8, 4, 2] → [5, 3] and [8, 4, 2]
- [5, 3] → [5] and [3]
- [8, 4, 2] → [8] and [4, 2] → [4] and [2]
Merge:
- Merge [5] and [3] → [3, 5]
- Merge [4] and [2] → [2, 4]
- Merge [8] and [2, 4] → [2, 4, 8]
- Merge [3, 5] and [2, 4, 8] → [2, 3, 4, 5, 8]
Problem 2: Use binary search to find target 7 in [1, 3, 5, 7, 9, 11, 13].
Solution
- lo=0, hi=6, mid=3. nums[3]=7 == target. Answer: Found at index 3 in 1 step.
Problem 3: Search for target 5 in the rotated sorted array [4, 5, 6, 7, 0, 1, 2].
Solution
Modified binary search:
- lo=0, hi=6, mid=3. nums[3]=7 != 5.
- Left half [4,5,6,7] is sorted (nums[0]=4 <= nums[3]=7). Is 4 <= 5 < 7? Yes → search left: hi=2.
- lo=0, hi=2, mid=1. nums[1]=5 == target. Answer: Found at index 1.
Problem 4: Find the Kth largest element (K=2) in [3, 2, 1, 5, 6, 4] without fully sorting.
Solution
Use a min-heap of size K=2:
- Process 3: heap=[3]
- Process 2: heap=[2,3]
- Process 1: 1 < heap[0]=2, skip (or push and pop)
- Process 5: push, pop min → heap=[3,5]
- Process 6: push, pop min → heap=[5,6]
- Process 4: 4 < heap[0]=5, skip Answer: heap[0] = 5 (2nd largest element)
Problem 5: Given weights [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] and D=5 days, find the minimum ship capacity to ship all packages within D days using binary search on the answer.
Solution
- lo = max(weights) = 10, hi = sum(weights) = 55
- Binary search: mid=32 → can ship in 2 days (yes), hi=32
- mid=21 → can ship in 3 days (yes), hi=21
- mid=15 → can ship in 5 days (yes), hi=15
- mid=12 → need 6 days > 5 (no), lo=13
- mid=14 → can ship in 5 days (yes), hi=14
- mid=13 → need 6 days (no), lo=14 Answer: Minimum capacity = 15
Quick Quiz
1. What is the best sorting algorithm for nearly-sorted data?
- a) Quick sort
- b) Merge sort
- c) Insertion sort
- d) Heap sort
Answer
c) Insertion sort — Insertion sort has
2. Why does quicksort have
- a) The merge step is inefficient
- b) The pivot selection consistently picks the minimum or maximum element, creating unbalanced partitions
- c) It uses too much memory
- d) It cannot handle duplicate elements
Answer
b) The pivot selection consistently picks the minimum or maximum element, creating unbalanced partitions — When the pivot is always the smallest or largest element (e.g., sorted input with first/last pivot), one partition has
3. What is the theoretical lower bound for comparison-based sorting?
- a)
- b)
- c)
- d)
Answer
b)
4. In binary search, why should you compute mid as lo + (hi - lo) / 2 instead of (lo + hi) / 2?
- a) It is faster to compute
- b) It prevents integer overflow when
lo + hiexceeds the maximum integer value - c) It gives a different result
- d) It works better with floating-point numbers
Answer
b) It prevents integer overflow when lo + hi exceeds the maximum integer value — In languages with fixed-size integers (C, C++, Java), lo + hi can overflow. Subtracting first avoids this. In Python and JavaScript this is not strictly necessary, but it is a good habit.
5. When is counting sort preferred over comparison-based sorts?
- a) When the data is floating-point
- b) When the data consists of integers in a known, small range
- c) When stability is not required
- d) When memory is extremely limited
Answer
b) When the data consists of integers in a known, small range — Counting sort runs in