Skip to content
Unverified — AI-generated content. Help verify this page

Sorting & Searching

Sorting and searching are the workhorses of computer science. Every database query involves sorting or searching. Every recommendation engine ranks (sorts) results. Every time you call Array.prototype.sort() or list.sort(), you are running a carefully engineered sorting algorithm. Understanding these algorithms is not about reimplementing them — it is about knowing their trade-offs so you can choose the right tool, and about mastering binary search, which is arguably the most versatile technique in all of algorithms.

Sorting Algorithms Overview

AlgorithmBestAverageWorstSpaceStable?In-Place?
Bubble SortO(n)O(n2)O(n2)O(1)YesYes
Selection SortO(n2)O(n2)O(n2)O(1)NoYes
Insertion SortO(n)O(n2)O(n2)O(1)YesYes
Merge SortO(nlogn)O(nlogn)O(nlogn)O(n)YesNo
Quick SortO(nlogn)O(nlogn)O(n2)O(logn)NoYes
Heap SortO(nlogn)O(nlogn)O(nlogn)O(1)NoYes
Counting SortO(n+k)O(n+k)O(n+k)O(k)YesNo
Radix SortO(d(n+k))O(d(n+k))O(d(n+k))O(n+k)YesNo

k = range of values, d = number of digits.

When to Use Which

  • Nearly sorted data: Insertion sort (O(n) best case)
  • General purpose: Merge sort (guaranteed O(nlogn), stable)
  • In-place, general purpose: Quicksort (fastest in practice for random data)
  • Integer data with known range: Counting sort or radix sort (O(n))
  • External sorting (data doesn't fit in memory): Merge sort (sequential access pattern)
  • Priority queue operations: Heap sort or partial sorting with a heap

Merge Sort

Divide the array in half, recursively sort each half, then merge the sorted halves. The merge step is the key insight — merging two sorted arrays takes O(n) time.

TypeScript:

typescript
function mergeSort(arr: number[]): number[] {
  if (arr.length <= 1) return arr;

  const mid = Math.floor(arr.length / 2);
  const left = mergeSort(arr.slice(0, mid));
  const right = mergeSort(arr.slice(mid));

  return merge(left, right);
}

function merge(left: number[], right: number[]): number[] {
  const result: number[] = [];
  let i = 0, j = 0;

  while (i < left.length && j < right.length) {
    if (left[i] <= right[j]) {
      result.push(left[i++]);
    } else {
      result.push(right[j++]);
    }
  }

  return result.concat(left.slice(i), right.slice(j));
}

Python:

python
def merge_sort(arr: list[int]) -> list[int]:
    if len(arr) <= 1:
        return arr

    mid = len(arr) // 2
    left = merge_sort(arr[:mid])
    right = merge_sort(arr[mid:])

    return merge(left, right)

def merge(left: list[int], right: list[int]) -> list[int]:
    result = []
    i = j = 0

    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1

    result.extend(left[i:])
    result.extend(right[j:])
    return result

Complexity: O(nlogn) time always. O(n) space for the temporary arrays. Stable.

Why merge sort matters in production: It is the basis of external sorting (sorting data that doesn't fit in RAM). The merge step only needs sequential access, making it efficient for disk I/O. TimSort (Python's list.sort(), Java's Arrays.sort() for objects) is a hybrid of merge sort and insertion sort.

Quick Sort

Choose a pivot, partition the array so everything less than the pivot is on the left and everything greater is on the right, then recursively sort each side.

TypeScript:

typescript
function quickSort(arr: number[], lo = 0, hi = arr.length - 1): void {
  if (lo >= hi) return;

  const pivot = partition(arr, lo, hi);
  quickSort(arr, lo, pivot - 1);
  quickSort(arr, pivot + 1, hi);
}

function partition(arr: number[], lo: number, hi: number): number {
  const pivot = arr[hi]; // last element as pivot
  let i = lo - 1;

  for (let j = lo; j < hi; j++) {
    if (arr[j] <= pivot) {
      i++;
      [arr[i], arr[j]] = [arr[j], arr[i]];
    }
  }

  [arr[i + 1], arr[hi]] = [arr[hi], arr[i + 1]];
  return i + 1;
}

Python:

python
import random

def quick_sort(arr: list[int], lo: int = 0, hi: int | None = None) -> None:
    if hi is None:
        hi = len(arr) - 1
    if lo >= hi:
        return

    pivot_idx = partition(arr, lo, hi)
    quick_sort(arr, lo, pivot_idx - 1)
    quick_sort(arr, pivot_idx + 1, hi)

def partition(arr: list[int], lo: int, hi: int) -> int:
    # Randomized pivot to avoid worst-case on sorted input
    rand_idx = random.randint(lo, hi)
    arr[rand_idx], arr[hi] = arr[hi], arr[rand_idx]

    pivot = arr[hi]
    i = lo - 1

    for j in range(lo, hi):
        if arr[j] <= pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]

    arr[i + 1], arr[hi] = arr[hi], arr[i + 1]
    return i + 1

Complexity: O(nlogn) average, O(n2) worst case (when pivot is always min/max). O(logn) stack space. Not stable.

WARNING

Quicksort's worst case occurs on already-sorted data when the pivot is the first or last element. Always use randomized pivot selection or median-of-three to avoid this in production.

Quick Select (Kth Smallest Element)

A modified quicksort that only recurses into the partition containing the target index. Expected O(n) time.

Python:

python
def quick_select(arr: list[int], k: int) -> int:
    """Find the kth smallest element (0-indexed)."""
    lo, hi = 0, len(arr) - 1

    while lo <= hi:
        pivot_idx = partition(arr, lo, hi)
        if pivot_idx == k:
            return arr[k]
        elif pivot_idx < k:
            lo = pivot_idx + 1
        else:
            hi = pivot_idx - 1

    return arr[lo]

Complexity: O(n) expected, O(n2) worst case. Use median-of-medians for guaranteed O(n).

Heap Sort

Build a max-heap from the array, then repeatedly extract the maximum element.

TypeScript:

typescript
function heapSort(arr: number[]): void {
  const n = arr.length;

  // Build max-heap
  for (let i = Math.floor(n / 2) - 1; i >= 0; i--) {
    heapify(arr, n, i);
  }

  // Extract elements from heap one by one
  for (let i = n - 1; i > 0; i--) {
    [arr[0], arr[i]] = [arr[i], arr[0]]; // move max to end
    heapify(arr, i, 0);
  }
}

function heapify(arr: number[], heapSize: number, rootIdx: number): void {
  let largest = rootIdx;
  const left = 2 * rootIdx + 1;
  const right = 2 * rootIdx + 2;

  if (left < heapSize && arr[left] > arr[largest]) largest = left;
  if (right < heapSize && arr[right] > arr[largest]) largest = right;

  if (largest !== rootIdx) {
    [arr[rootIdx], arr[largest]] = [arr[largest], arr[rootIdx]];
    heapify(arr, heapSize, largest);
  }
}

Python:

python
def heap_sort(arr: list[int]) -> None:
    n = len(arr)

    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)

    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]
        heapify(arr, i, 0)

def heapify(arr: list[int], heap_size: int, root: int) -> None:
    largest = root
    left = 2 * root + 1
    right = 2 * root + 2

    if left < heap_size and arr[left] > arr[largest]:
        largest = left
    if right < heap_size and arr[right] > arr[largest]:
        largest = right

    if largest != root:
        arr[root], arr[largest] = arr[largest], arr[root]
        heapify(arr, heap_size, largest)

Complexity: O(nlogn) always. O(1) space. Not stable. See Heaps & Priority Queues for more.

Non-Comparison Sorts

These break the Ω(nlogn) lower bound for comparison-based sorting by exploiting properties of the data.

Counting Sort

For integers in a known range [0,k]. Count occurrences, then reconstruct.

Python:

python
def counting_sort(arr: list[int]) -> list[int]:
    if not arr:
        return []

    max_val = max(arr)
    count = [0] * (max_val + 1)

    for num in arr:
        count[num] += 1

    result = []
    for val, cnt in enumerate(count):
        result.extend([val] * cnt)

    return result

Complexity: O(n+k) time and space. Efficient when k=O(n).

Radix Sort

Sort by each digit position, from least significant to most significant, using a stable sort (like counting sort) at each level.

Python:

python
def radix_sort(arr: list[int]) -> list[int]:
    if not arr:
        return []

    max_val = max(arr)
    exp = 1

    while max_val // exp > 0:
        arr = counting_sort_by_digit(arr, exp)
        exp *= 10

    return arr

def counting_sort_by_digit(arr: list[int], exp: int) -> list[int]:
    n = len(arr)
    output = [0] * n
    count = [0] * 10

    for num in arr:
        digit = (num // exp) % 10
        count[digit] += 1

    for i in range(1, 10):
        count[i] += count[i - 1]

    for i in range(n - 1, -1, -1):
        digit = (arr[i] // exp) % 10
        output[count[digit] - 1] = arr[i]
        count[digit] -= 1

    return output

Complexity: O(d(n+k)) where d is the number of digits and k is the base (10). For 32-bit integers: O(n).

Binary search is the most important searching technique. It works on sorted data and eliminates half the search space on each step.

TypeScript:

typescript
function binarySearch(nums: number[], target: number): number {
  let lo = 0;
  let hi = nums.length - 1;

  while (lo <= hi) {
    const mid = lo + Math.floor((hi - lo) / 2); // avoid overflow
    if (nums[mid] === target) return mid;
    if (nums[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }

  return -1; // not found
}

Python:

python
def binary_search(nums: list[int], target: int) -> int:
    lo, hi = 0, len(nums) - 1

    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1

    return -1

Complexity: O(logn) time, O(1) space.

DANGER

Use lo + (hi - lo) // 2 instead of (lo + hi) // 2 to prevent integer overflow in languages with fixed-size integers (C, C++, Java). In Python and TypeScript this isn't strictly necessary, but it's a good habit.

Binary Search Variations

These variations are where binary search becomes powerful. The key insight: binary search doesn't just find a target — it finds a boundary.

Find Leftmost (First Occurrence)

TypeScript:

typescript
function findFirst(nums: number[], target: number): number {
  let lo = 0, hi = nums.length - 1;
  let result = -1;

  while (lo <= hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (nums[mid] === target) {
      result = mid;
      hi = mid - 1; // keep searching left
    } else if (nums[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }

  return result;
}

Find Rightmost (Last Occurrence)

TypeScript:

typescript
function findLast(nums: number[], target: number): number {
  let lo = 0, hi = nums.length - 1;
  let result = -1;

  while (lo <= hi) {
    const mid = lo + Math.floor((hi - lo) / 2);
    if (nums[mid] === target) {
      result = mid;
      lo = mid + 1; // keep searching right
    } else if (nums[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }

  return result;
}

Search in Rotated Sorted Array

Python:

python
def search_rotated(nums: list[int], target: int) -> int:
    lo, hi = 0, len(nums) - 1

    while lo <= hi:
        mid = lo + (hi - lo) // 2
        if nums[mid] == target:
            return mid

        # Determine which half is sorted
        if nums[lo] <= nums[mid]:  # left half is sorted
            if nums[lo] <= target < nums[mid]:
                hi = mid - 1
            else:
                lo = mid + 1
        else:  # right half is sorted
            if nums[mid] < target <= nums[hi]:
                lo = mid + 1
            else:
                hi = mid - 1

    return -1

Binary Search on Answer

When you can frame a problem as "what is the minimum/maximum value such that a condition is true?" — binary search on that value.

Example: Minimum Capacity to Ship Packages Within D Days

Python:

python
def ship_within_days(weights: list[int], days: int) -> int:
    def can_ship(capacity: int) -> bool:
        current_load = 0
        needed_days = 1
        for w in weights:
            if current_load + w > capacity:
                needed_days += 1
                current_load = 0
            current_load += w
        return needed_days <= days

    lo = max(weights)       # min possible capacity
    hi = sum(weights)       # max possible capacity

    while lo < hi:
        mid = lo + (hi - lo) // 2
        if can_ship(mid):
            hi = mid
        else:
            lo = mid + 1

    return lo

Binary Search on Answer

This pattern appears constantly in interviews and competitive programming. The framework:

  1. Define a predicate function condition(x) -> bool
  2. Find the boundary where the predicate flips from False to True (or vice versa)
  3. Set lo and hi to the valid range
  4. Binary search for the boundary

If condition(mid) is True and you want the minimum valid value: hi = mid If condition(mid) is False and you want the minimum valid value: lo = mid + 1

Sorting in Practice

Language Built-in Sorts

LanguageAlgorithmStable?
Python list.sort()TimSort (merge + insertion)Yes
JavaScript Array.sort()TimSort (V8), varies by engineDepends on engine
Java Arrays.sort() (primitives)Dual-pivot quicksortNo
Java Arrays.sort() (objects)TimSortYes
C++ std::sort()Introsort (quick + heap + insertion)No
C++ std::stable_sort()Merge sortYes

Custom Comparators

TypeScript:

typescript
// Sort objects by multiple keys
const users = [
  { name: "Alice", age: 30 },
  { name: "Bob", age: 25 },
  { name: "Charlie", age: 30 },
];

users.sort((a, b) => {
  if (a.age !== b.age) return a.age - b.age; // ascending age
  return a.name.localeCompare(b.name);         // then alphabetical
});

Python:

python
from functools import cmp_to_key

# Sort by multiple criteria
users = [("Alice", 30), ("Bob", 25), ("Charlie", 30)]
users.sort(key=lambda u: (u[1], u[0]))  # (age, name) tuple comparison

# Custom comparator for complex ordering
def compare_versions(v1: str, v2: str) -> int:
    parts1 = list(map(int, v1.split(".")))
    parts2 = list(map(int, v2.split(".")))
    for p1, p2 in zip(parts1, parts2):
        if p1 != p2:
            return p1 - p2
    return len(parts1) - len(parts2)

versions = ["1.2.3", "1.0.1", "1.2.1"]
versions.sort(key=cmp_to_key(compare_versions))

The Lower Bound

Any comparison-based sorting algorithm must make at least Ω(nlogn) comparisons in the worst case. The proof uses a decision tree argument: there are n! possible orderings, and each comparison can only halve the remaining possibilities.

Height of decision treelog2(n!)=Θ(nlogn)

This is why counting sort and radix sort are faster — they bypass comparisons by using the structure of the data itself.

Practice Problems

ProblemTechniqueDifficulty
Merge Sorted ArrayMerge from backEasy
Sort Colors (Dutch National Flag)Three-way partitionMedium
Kth Largest ElementQuick select / heapMedium
Search in Rotated Sorted ArrayModified binary searchMedium
Find Peak ElementBinary searchMedium
Search a 2D MatrixBinary searchMedium
Median of Two Sorted ArraysBinary searchHard
Count of Smaller Numbers After SelfMerge sort + countHard
Find Minimum in Rotated Sorted ArrayBinary searchMedium
Capacity To Ship PackagesBinary search on answerMedium

Further Reading

Try It Yourself

Problem 1: Sort the array [5, 3, 8, 4, 2] using merge sort. Show the divide and merge steps.

Solution

Divide:

  • [5, 3, 8, 4, 2] → [5, 3] and [8, 4, 2]
  • [5, 3] → [5] and [3]
  • [8, 4, 2] → [8] and [4, 2] → [4] and [2]

Merge:

  • Merge [5] and [3] → [3, 5]
  • Merge [4] and [2] → [2, 4]
  • Merge [8] and [2, 4] → [2, 4, 8]
  • Merge [3, 5] and [2, 4, 8] → [2, 3, 4, 5, 8]

Problem 2: Use binary search to find target 7 in [1, 3, 5, 7, 9, 11, 13].

Solution
  • lo=0, hi=6, mid=3. nums[3]=7 == target. Answer: Found at index 3 in 1 step.

Problem 3: Search for target 5 in the rotated sorted array [4, 5, 6, 7, 0, 1, 2].

Solution

Modified binary search:

  • lo=0, hi=6, mid=3. nums[3]=7 != 5.
  • Left half [4,5,6,7] is sorted (nums[0]=4 <= nums[3]=7). Is 4 <= 5 < 7? Yes → search left: hi=2.
  • lo=0, hi=2, mid=1. nums[1]=5 == target. Answer: Found at index 1.

Problem 4: Find the Kth largest element (K=2) in [3, 2, 1, 5, 6, 4] without fully sorting.

Solution

Use a min-heap of size K=2:

  • Process 3: heap=[3]
  • Process 2: heap=[2,3]
  • Process 1: 1 < heap[0]=2, skip (or push and pop)
  • Process 5: push, pop min → heap=[3,5]
  • Process 6: push, pop min → heap=[5,6]
  • Process 4: 4 < heap[0]=5, skip Answer: heap[0] = 5 (2nd largest element)

Problem 5: Given weights [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] and D=5 days, find the minimum ship capacity to ship all packages within D days using binary search on the answer.

Solution
  • lo = max(weights) = 10, hi = sum(weights) = 55
  • Binary search: mid=32 → can ship in 2 days (yes), hi=32
  • mid=21 → can ship in 3 days (yes), hi=21
  • mid=15 → can ship in 5 days (yes), hi=15
  • mid=12 → need 6 days > 5 (no), lo=13
  • mid=14 → can ship in 5 days (yes), hi=14
  • mid=13 → need 6 days (no), lo=14 Answer: Minimum capacity = 15

Quick Quiz

1. What is the best sorting algorithm for nearly-sorted data?

  • a) Quick sort
  • b) Merge sort
  • c) Insertion sort
  • d) Heap sort
Answer

c) Insertion sort — Insertion sort has O(n) best-case performance when data is nearly sorted, as elements only need to shift a few positions. All comparison-based O(nlogn) sorts do unnecessary work on nearly-sorted input.

2. Why does quicksort have O(n2) worst-case time complexity?

  • a) The merge step is inefficient
  • b) The pivot selection consistently picks the minimum or maximum element, creating unbalanced partitions
  • c) It uses too much memory
  • d) It cannot handle duplicate elements
Answer

b) The pivot selection consistently picks the minimum or maximum element, creating unbalanced partitions — When the pivot is always the smallest or largest element (e.g., sorted input with first/last pivot), one partition has n1 elements and the other has 0, giving T(n)=T(n1)+O(n)=O(n2).

3. What is the theoretical lower bound for comparison-based sorting?

  • a) O(n)
  • b) O(nlogn)
  • c) O(n2)
  • d) O(logn)
Answer

b) O(nlogn) — The decision tree argument shows that distinguishing between n! possible orderings requires at least log2(n!)=Θ(nlogn) comparisons.

4. In binary search, why should you compute mid as lo + (hi - lo) / 2 instead of (lo + hi) / 2?

  • a) It is faster to compute
  • b) It prevents integer overflow when lo + hi exceeds the maximum integer value
  • c) It gives a different result
  • d) It works better with floating-point numbers
Answer

b) It prevents integer overflow when lo + hi exceeds the maximum integer value — In languages with fixed-size integers (C, C++, Java), lo + hi can overflow. Subtracting first avoids this. In Python and JavaScript this is not strictly necessary, but it is a good habit.

5. When is counting sort preferred over comparison-based sorts?

  • a) When the data is floating-point
  • b) When the data consists of integers in a known, small range
  • c) When stability is not required
  • d) When memory is extremely limited
Answer

b) When the data consists of integers in a known, small range — Counting sort runs in O(n+k) where k is the range of values. When k=O(n), it is linear, beating the O(nlogn) lower bound of comparison sorts.

"What I cannot create, I do not understand." — Richard Feynman