Skip to content
Unverified — AI-generated content. Help verify this page
Backend Engineer0%

Isolation Levels

Isolation is the "I" in ACID, and it is by far the hardest guarantee to understand correctly. Most developers assume that their database handles concurrency "correctly" — and it does, according to the isolation level you chose. The problem is that the default isolation level in most databases permits anomalies that violate the assumptions your application code is silently making.

This page covers every anomaly type with concrete examples, explains exactly what each isolation level guarantees (and what it does not), and digs into how PostgreSQL and MySQL actually implement these levels internally — because the ANSI SQL names are a lie. What MySQL calls REPEATABLE READ is not what the ANSI spec defines, and what PostgreSQL calls SERIALIZABLE is implemented completely differently from what MySQL calls SERIALIZABLE.

If you have ever had a bug caused by concurrent transactions and could not reproduce it in development, this page will explain why.

Why Isolation Matters

Consider a simple banking application. Two transactions execute concurrently:

Transaction A: Transfer $100 from Account 1 to Account 2
  1. Read balance of Account 1 → $500
  2. Read balance of Account 2 → $300
  3. Write Account 1 = $400
  4. Write Account 2 = $400

Transaction B: Transfer $50 from Account 1 to Account 3
  1. Read balance of Account 1 → $500
  2. Read balance of Account 3 → $200
  3. Write Account 1 = $450
  4. Write Account 3 = $250

If these transactions interleave naively, both read Account 1's balance as $500. Transaction A writes $400, then Transaction B writes $450. The result: Account 1 has $450, but $150 was transferred out. You just created $50 out of thin air.

This is a lost update anomaly. The database's isolation level determines whether this can happen.

The Fundamental Tension

Higher isolation means fewer anomalies but lower throughput. The database must coordinate between concurrent transactions, and coordination has a cost:

                    Anomalies Prevented
                    ───────────────────→
   READ             READ          REPEATABLE     SERIALIZABLE
   UNCOMMITTED      COMMITTED     READ

   ←───────────────────
   Throughput / Concurrency

   Every step up in isolation costs you performance.
   The question is: which anomalies can your application tolerate?

The Anomaly Taxonomy

Before understanding isolation levels, you must understand every type of anomaly that can occur when transactions run concurrently. There are six distinct anomaly types, and most developers only know about three of them.

Anomaly 1: Dirty Read

A dirty read occurs when a transaction reads data written by another transaction that has not yet committed. If that other transaction rolls back, the first transaction has read data that never officially existed.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN
UPDATE accounts
  SET balance = 0
  WHERE id = 1
                                 BEGIN
                                 SELECT balance
                                   FROM accounts
                                   WHERE id = 1
                                 → Returns 0 (DIRTY!)

ROLLBACK
                                 -- Transaction B now has
                                 -- a value that never existed
                                 -- It might deny a legitimate
                                 -- purchase based on $0 balance

Real-world impact: An e-commerce system reads a product's price as $0 because another transaction is in the middle of a price update. The order completes at $0 before the update transaction commits. Alternatively, the update transaction rolls back, and the price was never $0 — but the customer got free goods.

Severity: Critical. Dirty reads can cause data corruption. No production database should allow them.

Anomaly 2: Non-Repeatable Read (Fuzzy Read)

A non-repeatable read occurs when a transaction reads the same row twice and gets different values because another transaction modified and committed the row between the two reads.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN
SELECT balance FROM accounts
  WHERE id = 1
→ Returns $500

                                 BEGIN
                                 UPDATE accounts
                                   SET balance = 200
                                   WHERE id = 1
                                 COMMIT

SELECT balance FROM accounts
  WHERE id = 1
→ Returns $200 (DIFFERENT!)

COMMIT

Real-world impact: A reporting transaction reads a user's account balance at the start ($500), performs some calculations assuming $500, then reads it again to verify before inserting into a report. The second read returns $200. The report now contains inconsistent data — calculations based on $500 but a final balance of $200.

Severity: High. Any transaction that reads the same data twice and expects consistent results will break.

Anomaly 3: Phantom Read

A phantom read occurs when a transaction re-executes a query that returns a set of rows and finds that the set has changed because another transaction inserted or deleted rows that match the query's predicate.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN
SELECT COUNT(*) FROM employees
  WHERE department = 'Engineering'
→ Returns 5

                                 BEGIN
                                 INSERT INTO employees
                                   (name, department)
                                   VALUES ('New Hire', 'Engineering')
                                 COMMIT

SELECT COUNT(*) FROM employees
  WHERE department = 'Engineering'
→ Returns 6 (PHANTOM!)

COMMIT

The difference from a non-repeatable read: it is not about an existing row changing, but about new rows appearing (or existing rows disappearing) from a query's result set.

Real-world impact: A payroll system queries all employees in a department to calculate total salary expenses, then queries again to allocate the budget. Between the two queries, a new employee was added. The budget allocation is wrong because it includes the new employee's salary but the total was calculated without it.

Severity: Medium-high. Any transaction that queries a set of rows and makes decisions based on the set membership can be affected.

Anomaly 4: Lost Update

A lost update occurs when two transactions read the same row, both make modifications based on what they read, and the second write overwrites the first — causing the first transaction's update to be silently lost.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN                            BEGIN
SELECT balance FROM accounts     SELECT balance FROM accounts
  WHERE id = 1                     WHERE id = 1
→ Returns $1000                  → Returns $1000

-- Calculate: 1000 - 100 = 900
UPDATE accounts
  SET balance = 900
  WHERE id = 1

                                 -- Calculate: 1000 + 200 = 1200
                                 UPDATE accounts
                                   SET balance = 1200
                                   WHERE id = 1

COMMIT                           COMMIT

The final balance is $1200. Transaction A's withdrawal of $100 was silently lost. The correct balance should be $1100 ($1000 - $100 + $200).

Real-world impact: Two users editing the same wiki page simultaneously. User A saves their changes, then User B saves — overwriting everything User A did. Or: two API servers processing concurrent requests to update a counter, both reading the same value, both incrementing by 1, resulting in an increment of 1 instead of 2.

Severity: Critical. Lost updates corrupt data silently. They are particularly dangerous because they are hard to detect — the data looks valid, it is just wrong.

Anomaly 5: Read Skew

A read skew occurs when a transaction reads two related pieces of data at different points in time, seeing an inconsistent state because another transaction modified them in between.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN
SELECT balance FROM accounts
  WHERE id = 1
→ Returns $500
                                 BEGIN
                                 -- Transfer $100 from Account 1 to Account 2
                                 UPDATE accounts SET balance = 400
                                   WHERE id = 1
                                 UPDATE accounts SET balance = 400
                                   WHERE id = 2
                                 COMMIT

SELECT balance FROM accounts
  WHERE id = 2
→ Returns $400

-- Transaction A sees:
-- Account 1: $500, Account 2: $400
-- Total: $900 (should be $800)
-- INCONSISTENT SNAPSHOT!

Real-world impact: A backup process reads Table A, then reads Table B. Between the two reads, a transaction moves data from Table A to Table B. The backup now has data duplicated or missing — restoring from this backup will produce an inconsistent database.

Severity: High. Read skew can cause incorrect analytics, inconsistent backups, and broken business logic that depends on relationships between data.

Anomaly 6: Write Skew

A write skew occurs when two transactions each read something, make a decision based on what they read, and then write to different rows — but the combination of both writes violates an invariant that each transaction individually checked.

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN                            BEGIN
SELECT COUNT(*) FROM doctors     SELECT COUNT(*) FROM doctors
  WHERE on_call = true             WHERE on_call = true
  AND shift = 'night'              AND shift = 'night'
→ Returns 2                      → Returns 2

-- "2 doctors on call, safe for
--  me to go off call"
UPDATE doctors SET on_call = false
  WHERE id = 'alice'
  AND shift = 'night'

                                 -- "2 doctors on call, safe for
                                 --  me to go off call"
                                 UPDATE doctors SET on_call = false
                                   WHERE id = 'bob'
                                   AND shift = 'night'

COMMIT                           COMMIT

-- Result: 0 doctors on call for the night shift!
-- Invariant violated: there must always be >= 1 doctor on call

Write skew is particularly insidious because each transaction individually appears correct — it checked the invariant and it held. The problem only manifests when both transactions commit.

Real-world impact: Meeting room double-booking. Both transactions check "is the room free at 2 PM?", both see "yes", and both insert a booking. Or: both transactions check "is the username available?", both see "yes", and both create the account.

Severity: Critical. Write skew is the most subtle anomaly and the hardest to prevent. It is also the most commonly encountered in real production systems because most databases use Snapshot Isolation by default, which prevents everything except write skew.

The Four ANSI SQL Isolation Levels

The SQL standard (SQL-92) defines four isolation levels. Each prevents a different set of anomalies. Here is the official definition:

Isolation LevelDirty ReadNon-Repeatable ReadPhantom Read
READ UNCOMMITTEDPossiblePossiblePossible
READ COMMITTEDPreventedPossiblePossible
REPEATABLE READPreventedPreventedPossible
SERIALIZABLEPreventedPreventedPrevented

But this table is dangerously incomplete. It only mentions three anomaly types and says nothing about lost updates, read skew, or write skew. It also says nothing about how the levels are implemented — and the implementation determines the real behavior.

The ANSI Table Is Misleading

The ANSI standard defines isolation levels purely in terms of the three anomalies above. It says nothing about write skew, lost update, or read skew. It also defines the levels as minimum requirements — a database can provide stronger guarantees than the level name implies. This is exactly what PostgreSQL and MySQL do.

Level 1: READ UNCOMMITTED

What it means: Transactions can read uncommitted changes from other transactions.

What it prevents: Nothing. All anomalies are possible.

How it is implemented:

  • Lock-based: No shared locks are acquired for reads. Writes acquire exclusive locks that are held until commit. Reads do not wait for any locks.
  • MVCC-based: In theory, transactions would read the latest version regardless of commit status. In practice, PostgreSQL does not implement READ UNCOMMITTED at all — it silently upgrades to READ COMMITTED.

Performance: Fastest possible. No coordination overhead for reads.

When to use: Almost never. The only legitimate use case is for approximate monitoring queries on extremely high-throughput systems where reading slightly wrong data is acceptable (e.g., "roughly how many active sessions right now?"). Even then, READ COMMITTED is almost always a better choice.

sql
-- PostgreSQL: silently upgrades to READ COMMITTED
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

-- MySQL: actually provides READ UNCOMMITTED
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- This will let you read uncommitted rows from other transactions

Level 2: READ COMMITTED

What it means: Transactions can only read data that has been committed. Each SQL statement within the transaction sees a fresh snapshot — but different statements can see different snapshots.

What it prevents: Dirty reads.

What it allows: Non-repeatable reads, phantom reads, lost updates, read skew, write skew.

How it is implemented:

  • Lock-based (SQL Server default): Shared locks are acquired for reads but released immediately after the read (not held until commit). Writes acquire exclusive locks held until commit.
  • MVCC-based (PostgreSQL default, Oracle default): Each statement gets a new snapshot reflecting all transactions committed before the statement started. Different statements in the same transaction can see different committed data.

Performance: Very good. Reads rarely block. The per-statement snapshot means writers never block readers and readers never block writers.

The per-statement snapshot behavior is key:

Transaction A                    Transaction B
─────────────                    ─────────────
BEGIN (snapshot S1)

SELECT * FROM orders
  WHERE status = 'pending'
  -- Uses snapshot from THIS statement
  -- Sees 10 pending orders

                                 BEGIN
                                 UPDATE orders SET status = 'shipped'
                                   WHERE id = 42
                                 COMMIT

SELECT * FROM orders
  WHERE status = 'pending'
  -- Uses NEW snapshot from THIS statement
  -- Sees 9 pending orders (42 is now shipped)
  -- NON-REPEATABLE READ!

COMMIT

When to use: READ COMMITTED is the default in PostgreSQL and Oracle for good reason — it provides a practical balance between consistency and performance for most OLTP workloads. It is suitable when:

  • Individual statements need to see committed data
  • You do not need repeatable reads within a transaction
  • Your transactions are short-lived (typical web request patterns)
  • You handle concurrency conflicts at the application level (optimistic locking)

Level 3: REPEATABLE READ

What it means: Once a transaction reads a row, it will see the same value for that row throughout the entire transaction, even if other transactions modify and commit changes to it.

What it prevents: Dirty reads, non-repeatable reads, lost updates (in most implementations).

What it allows (per ANSI spec): Phantom reads, write skew, read skew.

How it is implemented — and here is where it gets interesting:

  • Lock-based (SQL Server): Shared locks acquired for reads are held until the end of the transaction (not released after the read like in READ COMMITTED). This physically prevents other transactions from modifying read rows. But gaps between rows are not locked, so phantom inserts are possible.

  • MVCC-based (PostgreSQL, MySQL InnoDB): The transaction gets a single snapshot at the start (at the time of the first query in PostgreSQL, or at the start of the transaction in MySQL). All reads see this snapshot, regardless of what other transactions commit. This naturally prevents non-repeatable reads and read skew.

The critical difference between lock-based and MVCC-based REPEATABLE READ:

Lock-based REPEATABLE READ literally prevents other transactions from modifying rows you have read. MVCC-based REPEATABLE READ allows other transactions to modify those rows — your transaction just does not see the changes because it reads from its frozen snapshot.

This has a profound implication for writes:

sql
-- Under MVCC REPEATABLE READ:
BEGIN;  -- Snapshot taken here

SELECT balance FROM accounts WHERE id = 1;
-- Returns $500 (from snapshot)

-- Meanwhile, another transaction changed balance to $300 and committed.

UPDATE accounts SET balance = balance - 100 WHERE id = 1;
-- PostgreSQL: ERROR! Detects the conflict, aborts the transaction
-- MySQL: Silently uses the LATEST committed value ($300),
--        sets balance = $200. YOUR SNAPSHOT IS IGNORED FOR WRITES.

MySQL's REPEATABLE READ Write Behavior

MySQL InnoDB's REPEATABLE READ reads from the snapshot but writes against the latest committed data. This means a SELECT sees one value and an UPDATE operates on a different value. This is a source of subtle bugs. PostgreSQL detects this conflict and aborts the transaction.

Level 4: SERIALIZABLE

What it means: Transactions execute as if they were serial — one after another, with no concurrency. The result of concurrent execution is equivalent to some serial ordering of the transactions.

What it prevents: All anomalies. Dirty reads, non-repeatable reads, phantom reads, lost updates, read skew, write skew — everything.

How it is implemented:

  • Lock-based (MySQL InnoDB, SQL Server): All reads acquire shared locks that are held until commit. Range locks or predicate locks prevent phantom insertions. Writes acquire exclusive locks held until commit. This is essentially two-phase locking (2PL).

  • SSI-based (PostgreSQL): PostgreSQL uses Serializable Snapshot Isolation (SSI), which is a fundamentally different approach. It runs transactions on snapshots (like REPEATABLE READ) but tracks read-write dependencies to detect potential serialization anomalies. If a dangerous pattern is detected, one of the conflicting transactions is aborted. More on this below.

Performance: Lowest throughput due to maximum coordination. Lock-based serializable can cause significant contention and deadlocks. SSI-based serializable (PostgreSQL) has better concurrency but more transaction aborts.

sql
-- PostgreSQL SERIALIZABLE with SSI
BEGIN ISOLATION LEVEL SERIALIZABLE;

SELECT COUNT(*) FROM doctors
  WHERE on_call = true AND shift = 'night';
-- Returns 2

-- PostgreSQL tracks: this transaction read the set of rows
-- matching (on_call = true AND shift = 'night')

UPDATE doctors SET on_call = false WHERE id = 'alice';
-- PostgreSQL tracks: this transaction wrote to a row
-- that matches another transaction's read predicate

COMMIT;
-- If another concurrent SERIALIZABLE transaction also went off call,
-- PostgreSQL detects the rw-conflict cycle and aborts one of them:
-- ERROR: could not serialize access due to read/write dependencies

When to use: Financial transactions, inventory management, any system where invariants must be maintained across concurrent transactions. Use it when write skew would violate a critical business rule.

Anomaly Prevention Matrix — Complete

Here is the complete matrix showing which anomalies are prevented at each level, including the ones the ANSI spec forgot:

AnomalyREAD UNCOMMITTEDREAD COMMITTEDREPEATABLE READ (MVCC)SERIALIZABLE
Dirty ReadPossiblePreventedPreventedPrevented
Non-Repeatable ReadPossiblePossiblePreventedPrevented
Phantom ReadPossiblePossibleDepends*Prevented
Lost UpdatePossiblePossiblePrevented (mostly)Prevented
Read SkewPossiblePossiblePreventedPrevented
Write SkewPossiblePossiblePossiblePrevented

*Phantoms: PostgreSQL's REPEATABLE READ prevents phantoms via snapshot isolation. MySQL's REPEATABLE READ prevents phantoms via gap locks for locking reads but allows them for consistent (snapshot) reads. ANSI spec says REPEATABLE READ allows phantoms.

PostgreSQL Isolation Implementation

PostgreSQL implements isolation exclusively through MVCC — it never uses lock-based isolation for these levels. Understanding PostgreSQL's specific implementation is critical because it behaves differently from both the ANSI spec and other databases.

PostgreSQL's MVCC Architecture

Every row in PostgreSQL has hidden system columns:

| xmin | xmax | cmin | cmax | actual_data... |
  • xmin: The transaction ID that inserted this row version
  • xmax: The transaction ID that deleted/updated this row version (0 if still live)
  • cmin/cmax: Command ID within the transaction (for seeing your own uncommitted changes)

When a row is updated, PostgreSQL does not modify the existing row. It inserts a new row version with a new xmin and marks the old version's xmax. Both versions exist simultaneously. Which version a transaction sees depends on its snapshot.

Snapshot Mechanics

A snapshot in PostgreSQL contains:

Snapshot = {
  xmin: 100,          -- Oldest active transaction at snapshot time
  xmax: 105,          -- Next transaction ID to be assigned at snapshot time
  xip:  [101, 103]    -- Transaction IDs that were active (in-progress) at snapshot time
}

Visibility rule: A row version is visible to a transaction with snapshot S if:

  1. The row's xmin is committed AND xmin < S.xmin, OR
  2. The row's xmin is committed AND S.xmin <= xmin < S.xmax AND xmin is NOT in S.xip
  3. AND the row's xmax is 0 (not deleted), OR xmax is not committed, OR xmax is a transaction that was active in S

In plain English: "Show me rows created by transactions that were committed before my snapshot was taken, and not deleted by any committed transaction."

READ COMMITTED in PostgreSQL

Each SQL statement takes a new snapshot. The snapshot includes all transactions committed before the statement started.

sql
BEGIN;

-- Statement 1: snapshot at time T1
SELECT * FROM products WHERE price > 100;
-- Sees all rows committed before T1

-- Another transaction commits a price change at T2

-- Statement 2: snapshot at time T3 (T3 > T2)
SELECT * FROM products WHERE price > 100;
-- Sees different results! Includes the committed change.

COMMIT;

PostgreSQL-specific behavior for UPDATE under READ COMMITTED:

When an UPDATE statement finds that a target row has been modified by a concurrent transaction, it does not simply use its statement snapshot. Instead:

  1. The UPDATE waits for the concurrent transaction to commit or abort
  2. If the concurrent transaction committed, the UPDATE re-evaluates the WHERE clause against the new row version
  3. If the row still matches the WHERE clause, it updates the new version
  4. If the row no longer matches, it skips the row

This "re-evaluation" behavior is crucial and surprising:

sql
-- Transaction A                    -- Transaction B
BEGIN;                               BEGIN;
                                     UPDATE accounts SET balance = 0
                                       WHERE id = 1;
UPDATE accounts SET balance = balance + 100
  WHERE id = 1 AND balance > 50;
-- BLOCKS waiting for Transaction B

                                     COMMIT;  -- balance is now 0

-- Re-evaluates WHERE clause: balance > 50?
-- 0 > 50? FALSE. Skips the row!
-- UPDATE 0 (no rows updated)

COMMIT;

REPEATABLE READ in PostgreSQL

A single snapshot is taken at the time of the first query in the transaction. All subsequent reads use this snapshot.

sql
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- First query: snapshot taken HERE
SELECT * FROM orders WHERE status = 'pending';
-- Returns 10 rows

-- Another transaction inserts a new pending order and commits

SELECT * FROM orders WHERE status = 'pending';
-- Still returns 10 rows (same snapshot!)

-- Another transaction updates order #5 status to 'shipped' and commits

SELECT * FROM orders WHERE id = 5;
-- Still shows status = 'pending' (same snapshot!)

COMMIT;

Write conflict detection:

Unlike MySQL, PostgreSQL's REPEATABLE READ will abort a transaction if it tries to update a row that has been modified by a concurrent committed transaction:

sql
BEGIN ISOLATION LEVEL REPEATABLE READ;

SELECT * FROM accounts WHERE id = 1;
-- Returns balance = $500

-- Concurrent transaction updates balance to $300 and commits

UPDATE accounts SET balance = 400 WHERE id = 1;
-- ERROR: could not serialize access due to concurrent update
-- Transaction is aborted. You must retry.

This is a significant safety feature. PostgreSQL refuses to let you update based on stale snapshot data.

SERIALIZABLE in PostgreSQL — SSI

PostgreSQL's SERIALIZABLE level uses Serializable Snapshot Isolation (SSI), a groundbreaking algorithm based on the research of Michael Cahill, Uwe Rohm, and Alan Fekete (2008).

SSI works by:

  1. Running transactions on snapshots (like REPEATABLE READ)
  2. Tracking read-write dependencies between concurrent transactions
  3. Detecting "dangerous structures" — patterns that could lead to serialization anomalies
  4. Aborting one transaction when a dangerous structure is detected

The dangerous structure SSI looks for is a cycle in the serialization graph containing two consecutive rw-dependency edges (also called "rw-antidependency" or "pivot" edges).

What is an rw-dependency? Transaction T1 has an rw-dependency on T2 if T1 read a version of a row that T2 later wrote (updated or deleted), or if T1 read a set of rows matching a predicate and T2 later inserted a row matching that predicate.

The dangerous structure:

  T1 ──rw──→ T2 ──rw──→ T3

  T1 read something that T2 overwrote.
  T2 read something that T3 overwrote.
  If T3 == T1, this forms a cycle.

  T1 ──rw──→ T2 ──rw──→ T1   ← CYCLE! One must abort.

Concrete example — write skew detection:

sql
-- Transaction A (SERIALIZABLE)
BEGIN;
SELECT * FROM doctors WHERE on_call = true AND shift = 'night';
-- Reads: Alice (on_call=true), Bob (on_call=true)
-- PostgreSQL records: T_A read the predicate (on_call=true AND shift='night')

UPDATE doctors SET on_call = false WHERE id = 'alice';
-- PostgreSQL records: T_A wrote to the row for Alice

-- Transaction B (SERIALIZABLE, concurrent)
BEGIN;
SELECT * FROM doctors WHERE on_call = true AND shift = 'night';
-- Reads: Alice (on_call=true), Bob (on_call=true)
-- From B's snapshot (taken before A committed), Alice is still on call
-- PostgreSQL records: T_B read the predicate (on_call=true AND shift='night')

UPDATE doctors SET on_call = false WHERE id = 'bob';
-- PostgreSQL records: T_B wrote to the row for Bob

-- Now:
-- T_A read rows matching the predicate → T_B wrote a row matching the predicate (Bob)
--   This is an rw-dependency: T_A ──rw──→ T_B
-- T_B read rows matching the predicate → T_A wrote a row matching the predicate (Alice)
--   This is an rw-dependency: T_B ──rw──→ T_A

-- Cycle detected: T_A ──rw──→ T_B ──rw──→ T_A

COMMIT; -- For Transaction A: succeeds
COMMIT; -- For Transaction B: ERROR: could not serialize access
         -- due to read/write dependencies among transactions

SSI implementation details in PostgreSQL:

PostgreSQL tracks dependencies using two types of locks:

  1. SIRead locks (predicate locks): Lightweight locks that do not block. They record what each transaction has read — at the tuple, page, or relation level. They are not traditional locks; they are markers for dependency tracking.

  2. SIREAD lock promotion: To control memory usage, PostgreSQL promotes fine-grained locks to coarser ones:

    • Tuple-level → Page-level (when too many tuples on one page are locked)
    • Page-level → Relation-level (when too many pages in one relation are locked)
    • This can cause false positives: transactions might be aborted unnecessarily because the coarser lock indicates a conflict that does not actually exist.
  3. Conflict detection: When a transaction writes, PostgreSQL checks for SIRead locks held by other transactions on the affected data. If found, it records an rw-dependency. If the dependency graph contains the dangerous structure, one transaction is chosen as the victim and aborted.

sql
-- Tuning SSI behavior in PostgreSQL:
-- Increase if you're getting too many false-positive serialization failures:
SET max_pred_locks_per_transaction = 64;       -- default
SET max_pred_locks_per_relation = -2;          -- default: -2 means (max_pred_locks / -value)
SET max_pred_locks_per_page = 2;               -- default

-- Monitor serialization failures:
SELECT count(*) FROM pg_stat_database
  WHERE datname = 'mydb'
  AND conflicts > 0;

MySQL InnoDB Isolation Implementation

MySQL InnoDB takes a fundamentally different approach from PostgreSQL. While it uses MVCC for consistent reads, it relies heavily on locking for write operations and uses a unique combination of gap locks and next-key locks.

InnoDB's MVCC: Undo Log Based

Unlike PostgreSQL (which stores multiple row versions in the main table), InnoDB stores only the latest version in the clustered index and uses the undo log to reconstruct older versions:

Clustered Index (latest version):
  Row: id=1, balance=300, trx_id=150, roll_ptr → undo_log

Undo Log:
  [trx_id=150]: balance was 500 before this update
  [trx_id=140]: balance was 400 before this update
  [trx_id=130]: this was the INSERT (balance=400)

When a transaction with an older snapshot needs to read this row, InnoDB follows the undo log chain backwards to reconstruct the version that was current at the snapshot's point in time.

Implication: Long-running transactions in InnoDB prevent undo log purging, causing the undo tablespace to grow. In PostgreSQL, long-running transactions prevent dead tuple cleanup (VACUUM), causing table bloat. Different mechanisms, same operational problem.

READ COMMITTED in InnoDB

InnoDB's READ COMMITTED behaves similarly to PostgreSQL's: each statement gets a fresh snapshot.

But there is a critical difference in locking behavior:

sql
-- Under READ COMMITTED in InnoDB:
UPDATE employees SET salary = salary * 1.1
  WHERE department = 'Engineering';

-- InnoDB behavior:
-- 1. Scans the table (or uses an index)
-- 2. Locks each row it examines
-- 3. For rows that DON'T match the WHERE clause,
--    InnoDB releases the lock IMMEDIATELY
-- 4. For rows that DO match, the lock is held until COMMIT

-- Under REPEATABLE READ in InnoDB:
-- Same UPDATE, but InnoDB holds locks on ALL examined rows,
-- even those that don't match the WHERE clause.
-- AND it adds gap locks to prevent phantom inserts.

REPEATABLE READ in InnoDB — Gap Locks and Next-Key Locks

This is where MySQL gets really interesting. InnoDB's REPEATABLE READ actually prevents phantom reads for locking operations — which goes beyond the ANSI spec.

InnoDB's lock types:

  1. Record lock: Locks a single index record
  2. Gap lock: Locks the gap between index records (prevents insertions)
  3. Next-key lock: Record lock + gap lock on the gap before the record
Index records: 10, 20, 30, 40

Record locks:    [10] [20] [30] [40]   ← Lock individual records
Gap locks:      (,10)(10,20)(20,30)(30,40)(40,)  ← Lock gaps
Next-key locks: (,10] (10,20] (20,30] (30,40]     ← Gap + record

How gap locks prevent phantoms:

sql
-- Transaction A
BEGIN;
SELECT * FROM orders WHERE amount BETWEEN 100 AND 200 FOR UPDATE;
-- InnoDB places next-key locks on the range [100, 200]
-- This locks:
--   1. All existing rows with amount between 100 and 200
--   2. The GAPS between those rows
--   3. The gap before 100 and after 200 (to the next indexed value)

-- Transaction B
INSERT INTO orders (amount) VALUES (150);
-- BLOCKS! The gap lock prevents insertion into the locked range.
-- Transaction B must wait for Transaction A to commit or abort.

The catch — consistent reads vs locking reads:

InnoDB's REPEATABLE READ has split behavior:

  • Consistent reads (plain SELECT): Use the snapshot. Phantoms are not visible (because you are reading from a fixed point in time), but the phantom row exists in the actual table.
  • Locking reads (SELECT ... FOR UPDATE, SELECT ... LOCK IN SHARE MODE): Use the latest committed data, not the snapshot. Gap locks prevent phantom inserts.
sql
-- This is a source of confusion:
BEGIN;  -- REPEATABLE READ

SELECT * FROM orders WHERE status = 'pending';
-- Returns 10 rows (from snapshot)

-- Another transaction inserts a new pending order and commits

SELECT * FROM orders WHERE status = 'pending';
-- Still returns 10 rows (consistent read, same snapshot) ✓

SELECT * FROM orders WHERE status = 'pending' FOR UPDATE;
-- Returns 11 rows! (locking read, sees latest committed data)
-- INCONSISTENT WITH THE PREVIOUS SELECT

InnoDB's Split Personality

Under REPEATABLE READ, a plain SELECT and a SELECT FOR UPDATE in the same transaction can return different results. The plain SELECT uses the transaction snapshot; the SELECT FOR UPDATE sees the latest committed data. This is a common source of bugs when developers mix consistent and locking reads.

SERIALIZABLE in InnoDB

InnoDB's SERIALIZABLE level converts all plain SELECT statements to SELECT ... LOCK IN SHARE MODE (as of MySQL 8.0, SELECT ... FOR SHARE). Every read acquires a shared lock, and every write acquires an exclusive lock. All locks are held until commit.

This is strict two-phase locking (S2PL):

Phase 1 (Growing): Acquire locks, never release
Phase 2 (Shrinking): Release all locks at commit/abort

Timeline:
  ┌─── Growing Phase ────┐┌── Shrinking Phase ──┐
  │ Lock A, Lock B, Lock C ││ Release A, B, C       │
  │ (never release during  ││ (all at commit)        │
  │  this phase)           ││                        │

Deadlocks are much more likely at SERIALIZABLE: Because all reads acquire shared locks and hold them, two transactions reading and writing overlapping data will frequently deadlock. InnoDB has a deadlock detector that periodically checks the wait-for graph and aborts one transaction.

sql
-- Deadlock example under SERIALIZABLE:
-- Transaction A                    -- Transaction B
SELECT * FROM accounts              SELECT * FROM accounts
  WHERE id = 1;                       WHERE id = 2;
-- Acquires S-lock on row 1          -- Acquires S-lock on row 2

UPDATE accounts SET balance = 0      UPDATE accounts SET balance = 0
  WHERE id = 2;                        WHERE id = 1;
-- Needs X-lock on row 2             -- Needs X-lock on row 1
-- BLOCKED by B's S-lock             -- BLOCKED by A's S-lock
-- DEADLOCK!

Snapshot Isolation vs Serializable

Snapshot Isolation (SI) is not an ANSI-defined isolation level, but it is what most databases actually provide. Understanding the gap between SI and true serializability is critical.

What Snapshot Isolation Provides

Under SI, every transaction reads from a consistent snapshot of the database taken at the start of the transaction. Two transactions can execute concurrently as long as their write sets do not overlap (first-committer-wins rule).

SI prevents:

  • Dirty reads (reads from snapshot, which only contains committed data)
  • Non-repeatable reads (always reads from the same snapshot)
  • Phantom reads (snapshot is fixed, new rows do not appear)
  • Read skew (snapshot is consistent)
  • Lost updates (first-committer-wins detects conflicting writes to the same row)

SI does NOT prevent:

  • Write skew (transactions write to different rows, so first-committer-wins does not trigger)

The Write Skew Problem in Detail

Write skew is the reason SI is not serializable. Let's look at a detailed example:

The On-Call Doctor Problem (revisited with code):

Database state: doctors table
| id    | name   | on_call | shift |
|-------|--------|---------|-------|
| alice | Alice  | true    | night |
| bob   | Bob    | true    | night |
| carol | Carol  | false   | night |

Invariant: COUNT(*) WHERE on_call = true AND shift = 'night' >= 1
sql
-- Transaction A (Alice wants to go off call)
BEGIN ISOLATION LEVEL REPEATABLE READ;  -- This is actually SI in PostgreSQL
SELECT COUNT(*) FROM doctors WHERE on_call = true AND shift = 'night';
-- Returns 2 (Alice and Bob). Invariant holds: 2 >= 1.
-- Decision: Safe to go off call.
UPDATE doctors SET on_call = false WHERE id = 'alice';

-- Transaction B (Bob wants to go off call) — concurrent
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT COUNT(*) FROM doctors WHERE on_call = true AND shift = 'night';
-- Returns 2 (from B's snapshot, taken before A's update). Invariant holds: 2 >= 1.
-- Decision: Safe to go off call.
UPDATE doctors SET on_call = false WHERE id = 'bob';

-- Both commit successfully!
-- Transaction A wrote to Alice's row. Transaction B wrote to Bob's row.
-- No write-write conflict → first-committer-wins does not trigger.

-- Result: 0 doctors on call. Invariant violated!

Why SI misses this: SI's first-committer-wins rule only detects when two transactions write to the same row. Write skew involves transactions writing to different rows based on a shared read — SI has no mechanism to detect this.

Where SI Matches the ANSI Level Names

This is the confusing part. Different databases map SI to different ANSI level names:

DatabaseANSI Level NameActual Implementation
PostgreSQLREPEATABLE READSnapshot Isolation
PostgreSQLSERIALIZABLESerializable Snapshot Isolation (SSI)
MySQL InnoDBREPEATABLE READSnapshot Isolation + Gap Locks
OracleSERIALIZABLESnapshot Isolation (NOT truly serializable!)
SQL ServerSNAPSHOTSnapshot Isolation (separate from the ANSI levels)
SQL ServerSERIALIZABLELock-based (S2PL, truly serializable)

Oracle's "SERIALIZABLE" Is NOT Serializable

Oracle's SERIALIZABLE isolation level is actually Snapshot Isolation. It does NOT prevent write skew. Oracle does not provide true serializability at all. If you need to prevent write skew in Oracle, you must use explicit locking (SELECT FOR UPDATE).

Serializable Snapshot Isolation (SSI) Deep Dive

SSI, as implemented in PostgreSQL, is the state-of-the-art approach to achieving true serializability without the performance penalty of lock-based methods.

The Theory: Serialization Graphs

For a set of concurrent transactions to be serializable, there must exist a serial order that produces the same result. We can model dependencies between transactions as a directed graph:

  • wr-dependency (write-read): T2 reads a value written by T1. In any equivalent serial order, T1 must come before T2.
  • ww-dependency (write-write): T2 overwrites a value written by T1. In any equivalent serial order, T1 must come before T2.
  • rw-antidependency: T1 reads a value that T2 later overwrites. In any equivalent serial order, T1 must come before T2 — but this conflicts with the snapshot order (T2 was concurrent with T1, so T1's snapshot does not see T2's write).

The key theorem (Cahill, Rohm, Fekete 2008): Under Snapshot Isolation, a non-serializable execution always contains two consecutive rw-antidependency edges in the serialization graph that form a "dangerous structure":

Dangerous Structure:

    T_pivot
   ↗ rw    rw ↘
T_in          T_out

Where:
  - T_in has an rw-dependency on T_pivot (T_in read, T_pivot overwrote)
  - T_pivot has an rw-dependency on T_out (T_pivot read, T_out overwrote)
  - T_in and T_out committed "around" T_pivot in a way that creates a cycle

SSI detects these dangerous structures and aborts one transaction to break the cycle.

SSI vs S2PL Performance

SSI has fundamental advantages over strict two-phase locking:

PropertyS2PL (MySQL SERIALIZABLE)SSI (PostgreSQL SERIALIZABLE)
Readers block writersYesNo
Writers block readersYesNo
DeadlocksPossible (common)Impossible (no blocking locks)
False abortsNoYes (due to lock promotion)
Long-running readsBlock all conflicting writesNo blocking, but might be aborted at commit
Throughput (low contention)GoodBetter
Throughput (high contention)Poor (blocking)Moderate (aborts + retries)

Benchmark data (from research and production measurements):

Workload: TPC-C-like OLTP, 16 cores, 64 concurrent clients

                    Throughput (transactions/sec)
                    ─────────────────────────────
READ COMMITTED:      12,400 tps
REPEATABLE READ:     11,800 tps  (−5% from RC)
SERIALIZABLE (SSI):   9,200 tps  (−26% from RC)
SERIALIZABLE (S2PL):  5,600 tps  (−55% from RC)

Abort rate at SERIALIZABLE (SSI): ~3% of transactions
Deadlock rate at SERIALIZABLE (S2PL): ~8% of transactions

SSI's throughput is roughly 60-70% of READ COMMITTED, while S2PL drops to about 45%. The advantage of SSI grows with more concurrent clients because S2PL's blocking causes cascading waits.

When SSI Aborts Unnecessarily

SSI can produce false positives — aborting transactions that would have been serializable. This happens because of predicate lock granularity:

sql
-- Transaction A
SELECT * FROM products WHERE category = 'Electronics' AND price < 100;
-- PostgreSQL might take a page-level or relation-level predicate lock
-- if there are too many matching tuples for tuple-level locks

-- Transaction B
INSERT INTO products (category, price) VALUES ('Clothing', 50);
-- This does NOT conflict with A's query (different category!)
-- But if A's predicate lock was promoted to a page-level lock,
-- and B's insert lands on the same page, PostgreSQL sees a conflict.
-- FALSE POSITIVE: one of them might be aborted unnecessarily.

Mitigation strategies:

  1. Increase max_pred_locks_per_transaction to delay lock promotion
  2. Keep transactions short to reduce the window for false conflicts
  3. Design your schema so that conflicting operations touch different pages
  4. Always handle serialization failures with retry logic

TypeScript: Simulating Each Anomaly Type

The following TypeScript code demonstrates each anomaly type using a simulated database with configurable isolation levels. This is pedagogical code — real databases implement these mechanisms in C at the storage engine level.

typescript
// ============================================================
// Isolation Level Anomaly Simulator
// Demonstrates each anomaly type with a mock database
// ============================================================

type Row = Record<string, unknown>;
type Snapshot = Map<string, Row[]>;
type LockType = 'shared' | 'exclusive';

interface Lock {
  txId: string;
  type: LockType;
  key: string;
}

type IsolationLevel =
  | 'READ_UNCOMMITTED'
  | 'READ_COMMITTED'
  | 'REPEATABLE_READ'
  | 'SERIALIZABLE';

class MockDatabase {
  private tables: Map<string, Row[]> = new Map();
  private uncommitted: Map<string, Map<string, Row[]>> = new Map();
  private snapshots: Map<string, Snapshot> = new Map();
  private locks: Lock[] = [];
  private txCounter = 0;
  private committedVersions: Map<string, Row[][]> = new Map();

  constructor() {
    this.tables = new Map();
  }

  seed(tableName: string, rows: Row[]): void {
    this.tables.set(tableName, JSON.parse(JSON.stringify(rows)));
  }

  beginTransaction(isolation: IsolationLevel): string {
    const txId = `tx_${++this.txCounter}`;
    this.uncommitted.set(txId, new Map());

    if (isolation === 'REPEATABLE_READ' || isolation === 'SERIALIZABLE') {
      // Take snapshot at transaction start
      const snapshot: Snapshot = new Map();
      for (const [table, rows] of this.tables) {
        snapshot.set(table, JSON.parse(JSON.stringify(rows)));
      }
      this.snapshots.set(txId, snapshot);
    }

    console.log(`  [${txId}] BEGIN (${isolation})`);
    return txId;
  }

  read(
    txId: string,
    tableName: string,
    isolation: IsolationLevel,
    predicate?: (row: Row) => boolean
  ): Row[] {
    let source: Row[];

    switch (isolation) {
      case 'READ_UNCOMMITTED': {
        // Check uncommitted changes from ALL transactions first
        const allUncommitted = this.getAllUncommittedForTable(tableName);
        source = allUncommitted ?? this.tables.get(tableName) ?? [];
        break;
      }
      case 'READ_COMMITTED': {
        // Fresh snapshot of committed data for each read
        source = JSON.parse(
          JSON.stringify(this.tables.get(tableName) ?? [])
        );
        break;
      }
      case 'REPEATABLE_READ':
      case 'SERIALIZABLE': {
        // Use transaction's snapshot
        const snapshot = this.snapshots.get(txId);
        source = snapshot
          ? JSON.parse(JSON.stringify(snapshot.get(tableName) ?? []))
          : JSON.parse(JSON.stringify(this.tables.get(tableName) ?? []));
        break;
      }
    }

    const result = predicate ? source.filter(predicate) : source;
    console.log(
      `  [${txId}] READ ${tableName}: ${JSON.stringify(result)}`
    );
    return result;
  }

  write(
    txId: string,
    tableName: string,
    predicate: (row: Row) => boolean,
    updater: (row: Row) => Row,
    _isolation: IsolationLevel
  ): void {
    // Get current committed data and apply update
    const rows = JSON.parse(
      JSON.stringify(this.tables.get(tableName) ?? [])
    );
    const updated = rows.map((row: Row) =>
      predicate(row) ? updater(JSON.parse(JSON.stringify(row))) : row
    );

    const txUncommitted = this.uncommitted.get(txId)!;
    txUncommitted.set(tableName, updated);

    console.log(
      `  [${txId}] WRITE ${tableName}: ${JSON.stringify(
        updated.filter(predicate)
      )}`
    );
  }

  insert(txId: string, tableName: string, row: Row): void {
    const current = JSON.parse(
      JSON.stringify(this.tables.get(tableName) ?? [])
    );
    current.push(row);

    const txUncommitted = this.uncommitted.get(txId)!;
    txUncommitted.set(tableName, current);

    console.log(`  [${txId}] INSERT into ${tableName}: ${JSON.stringify(row)}`);
  }

  commit(txId: string): void {
    const txUncommitted = this.uncommitted.get(txId);
    if (txUncommitted) {
      for (const [table, rows] of txUncommitted) {
        this.tables.set(table, rows);
      }
    }
    this.uncommitted.delete(txId);
    this.snapshots.delete(txId);
    this.locks = this.locks.filter((l) => l.txId !== txId);
    console.log(`  [${txId}] COMMIT`);
  }

  rollback(txId: string): void {
    this.uncommitted.delete(txId);
    this.snapshots.delete(txId);
    this.locks = this.locks.filter((l) => l.txId !== txId);
    console.log(`  [${txId}] ROLLBACK`);
  }

  getTable(tableName: string): Row[] {
    return JSON.parse(JSON.stringify(this.tables.get(tableName) ?? []));
  }

  private getAllUncommittedForTable(tableName: string): Row[] | null {
    // Return the latest uncommitted version from any transaction
    for (const [, txData] of this.uncommitted) {
      const data = txData.get(tableName);
      if (data) return JSON.parse(JSON.stringify(data));
    }
    return null;
  }
}

// ============================================================
// Anomaly Demonstrations
// ============================================================

function demonstrateDirtyRead(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Dirty Read');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('accounts', [{ id: 1, balance: 1000 }]);

  console.log('\n--- READ UNCOMMITTED (dirty read occurs) ---');
  const txA = db.beginTransaction('READ_UNCOMMITTED');
  db.write(
    txA, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 0 }),
    'READ_UNCOMMITTED'
  );
  // Transaction A has NOT committed yet

  const txB = db.beginTransaction('READ_UNCOMMITTED');
  const dirtyResult = db.read(txB, 'accounts', 'READ_UNCOMMITTED',
    (r) => r.id === 1
  );
  console.log(
    `  [${txB}] Read balance: ${dirtyResult[0]?.balance} ` +
    `(DIRTY! A hasn't committed)`
  );

  db.rollback(txA); // A rolls back — the $0 balance never existed
  db.commit(txB);

  console.log('\n--- READ COMMITTED (dirty read prevented) ---');
  const db2 = new MockDatabase();
  db2.seed('accounts', [{ id: 1, balance: 1000 }]);

  const txC = db2.beginTransaction('READ_COMMITTED');
  db2.write(
    txC, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 0 }),
    'READ_COMMITTED'
  );

  const txD = db2.beginTransaction('READ_COMMITTED');
  const cleanResult = db2.read(txD, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );
  console.log(
    `  [${txD}] Read balance: ${cleanResult[0]?.balance} ` +
    `(correct — reads only committed data)`
  );

  db2.rollback(txC);
  db2.commit(txD);
}

function demonstrateNonRepeatableRead(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Non-Repeatable Read');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('accounts', [{ id: 1, balance: 500 }]);

  console.log('\n--- READ COMMITTED (non-repeatable read occurs) ---');
  const txA = db.beginTransaction('READ_COMMITTED');
  const firstRead = db.read(txA, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );
  console.log(`  [${txA}] First read balance: ${firstRead[0]?.balance}`);

  // Another transaction modifies and commits
  const txB = db.beginTransaction('READ_COMMITTED');
  db.write(
    txB, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 200 }),
    'READ_COMMITTED'
  );
  db.commit(txB);

  const secondRead = db.read(txA, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );
  console.log(
    `  [${txA}] Second read balance: ${secondRead[0]?.balance} ` +
    `(CHANGED! Non-repeatable read)`
  );
  db.commit(txA);

  console.log('\n--- REPEATABLE READ (non-repeatable read prevented) ---');
  const db2 = new MockDatabase();
  db2.seed('accounts', [{ id: 1, balance: 500 }]);

  const txC = db2.beginTransaction('REPEATABLE_READ');
  const firstRead2 = db2.read(txC, 'accounts', 'REPEATABLE_READ',
    (r) => r.id === 1
  );
  console.log(`  [${txC}] First read balance: ${firstRead2[0]?.balance}`);

  const txD = db2.beginTransaction('REPEATABLE_READ');
  db2.write(
    txD, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 200 }),
    'REPEATABLE_READ'
  );
  db2.commit(txD);

  const secondRead2 = db2.read(txC, 'accounts', 'REPEATABLE_READ',
    (r) => r.id === 1
  );
  console.log(
    `  [${txC}] Second read balance: ${secondRead2[0]?.balance} ` +
    `(SAME — snapshot prevents non-repeatable read)`
  );
  db2.commit(txC);
}

function demonstratePhantomRead(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Phantom Read');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('employees', [
    { id: 1, name: 'Alice', dept: 'Engineering' },
    { id: 2, name: 'Bob', dept: 'Engineering' },
  ]);

  console.log('\n--- READ COMMITTED (phantom read occurs) ---');
  const txA = db.beginTransaction('READ_COMMITTED');
  const firstQuery = db.read(txA, 'employees', 'READ_COMMITTED',
    (r) => r.dept === 'Engineering'
  );
  console.log(`  [${txA}] Engineering count: ${firstQuery.length}`);

  const txB = db.beginTransaction('READ_COMMITTED');
  db.insert(txB, 'employees', {
    id: 3, name: 'Charlie', dept: 'Engineering',
  });
  db.commit(txB);

  const secondQuery = db.read(txA, 'employees', 'READ_COMMITTED',
    (r) => r.dept === 'Engineering'
  );
  console.log(
    `  [${txA}] Engineering count: ${secondQuery.length} ` +
    `(PHANTOM — new row appeared)`
  );
  db.commit(txA);
}

function demonstrateLostUpdate(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Lost Update');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('accounts', [{ id: 1, balance: 1000 }]);

  console.log('\n--- READ COMMITTED (lost update occurs) ---');
  const txA = db.beginTransaction('READ_COMMITTED');
  const txB = db.beginTransaction('READ_COMMITTED');

  // Both read the same balance
  const readA = db.read(txA, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );
  const readB = db.read(txB, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );

  console.log(`  [${txA}] Read balance: ${readA[0]?.balance}`);
  console.log(`  [${txB}] Read balance: ${readB[0]?.balance}`);

  // A withdraws 100 (1000 - 100 = 900)
  db.write(
    txA, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 900 }),
    'READ_COMMITTED'
  );
  db.commit(txA);

  // B deposits 200 (1000 + 200 = 1200) — based on stale read!
  db.write(
    txB, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 1200 }),
    'READ_COMMITTED'
  );
  db.commit(txB);

  const finalState = db.getTable('accounts');
  console.log(
    `  Final balance: ${(finalState[0] as Row)?.balance} ` +
    `(should be 1100, A's withdrawal was LOST)`
  );
}

function demonstrateWriteSkew(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Write Skew');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('doctors', [
    { id: 'alice', name: 'Alice', on_call: true, shift: 'night' },
    { id: 'bob', name: 'Bob', on_call: true, shift: 'night' },
    { id: 'carol', name: 'Carol', on_call: false, shift: 'night' },
  ]);

  console.log(
    '\n--- REPEATABLE READ / Snapshot Isolation (write skew occurs) ---'
  );

  const txA = db.beginTransaction('REPEATABLE_READ');
  const txB = db.beginTransaction('REPEATABLE_READ');

  // Both check how many doctors are on call
  const onCallA = db.read(txA, 'doctors', 'REPEATABLE_READ',
    (r) => r.on_call === true && r.shift === 'night'
  );
  console.log(
    `  [${txA}] Doctors on call: ${onCallA.length} ` +
    `(>= 1, safe to go off call)`
  );

  const onCallB = db.read(txB, 'doctors', 'REPEATABLE_READ',
    (r) => r.on_call === true && r.shift === 'night'
  );
  console.log(
    `  [${txB}] Doctors on call: ${onCallB.length} ` +
    `(>= 1, safe to go off call)`
  );

  // Alice goes off call
  db.write(
    txA, 'doctors',
    (r) => r.id === 'alice',
    (r) => ({ ...r, on_call: false }),
    'REPEATABLE_READ'
  );
  db.commit(txA);

  // Bob goes off call — different row, no write-write conflict!
  db.write(
    txB, 'doctors',
    (r) => r.id === 'bob',
    (r) => ({ ...r, on_call: false }),
    'REPEATABLE_READ'
  );
  db.commit(txB);

  const finalState = db.getTable('doctors');
  const onCallFinal = finalState.filter(
    (r) => r.on_call === true && r.shift === 'night'
  );
  console.log(
    `  Doctors on call after both commits: ${onCallFinal.length} ` +
    `(WRITE SKEW — invariant violated!)`
  );
}

function demonstrateReadSkew(): void {
  console.log('\n' + '='.repeat(60));
  console.log('ANOMALY: Read Skew');
  console.log('='.repeat(60));

  const db = new MockDatabase();
  db.seed('accounts', [
    { id: 1, balance: 500 },
    { id: 2, balance: 300 },
  ]);
  // Invariant: sum of balances = 800 (closed system)

  console.log('\n--- READ COMMITTED (read skew occurs) ---');
  const txA = db.beginTransaction('READ_COMMITTED');

  // Read account 1
  const acc1 = db.read(txA, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 1
  );
  console.log(`  [${txA}] Account 1 balance: ${acc1[0]?.balance}`);

  // Concurrent transfer: $100 from account 1 to account 2
  const txB = db.beginTransaction('READ_COMMITTED');
  db.write(
    txB, 'accounts',
    (r) => r.id === 1,
    (r) => ({ ...r, balance: 400 }),
    'READ_COMMITTED'
  );
  db.write(
    txB, 'accounts',
    (r) => r.id === 2,
    (r) => ({ ...r, balance: 400 }),
    'READ_COMMITTED'
  );
  db.commit(txB);

  // Read account 2 — sees post-transfer state
  const acc2 = db.read(txA, 'accounts', 'READ_COMMITTED',
    (r) => r.id === 2
  );
  console.log(`  [${txA}] Account 2 balance: ${acc2[0]?.balance}`);
  console.log(
    `  [${txA}] Total: ${
      (acc1[0]?.balance as number) + (acc2[0]?.balance as number)
    } (should be 800 — READ SKEW!)`
  );

  db.commit(txA);
}

// Run all demonstrations
function main(): void {
  console.log('Transaction Isolation Level Anomaly Demonstrations');
  console.log('='.repeat(60));

  demonstrateDirtyRead();
  demonstrateNonRepeatableRead();
  demonstratePhantomRead();
  demonstrateLostUpdate();
  demonstrateWriteSkew();
  demonstrateReadSkew();

  console.log('\n' + '='.repeat(60));
  console.log('All anomaly demonstrations complete.');
  console.log('='.repeat(60));
}

main();

Performance Comparison: Real Benchmark Data

The following data comes from published benchmarks and production measurements. Your specific performance will vary based on hardware, workload, and contention patterns, but the relative differences are consistent.

PostgreSQL Benchmarks (pgbench, 32 clients, 8 vCPU)

Workload: Standard pgbench (TPC-B-like)
Scale factor: 100 (10M rows)
Duration: 5 minutes per test

Isolation Level        TPS (avg)    Latency (p50)    Latency (p99)    Abort Rate
──────────────────     ─────────    ─────────────    ─────────────    ──────────
READ COMMITTED          8,450        3.2 ms           12.1 ms          0.0%
REPEATABLE READ         7,920        3.4 ms           14.3 ms          1.2%
SERIALIZABLE (SSI)      6,100        4.5 ms           22.7 ms          4.8%

MySQL InnoDB Benchmarks (sysbench, 32 threads, 8 vCPU)

Workload: sysbench oltp_read_write
Table size: 10M rows
Duration: 5 minutes per test

Isolation Level        TPS (avg)    Latency (p50)    Latency (p99)    Deadlocks/s
──────────────────     ─────────    ─────────────    ─────────────    ───────────
READ UNCOMMITTED        9,200        2.8 ms           10.5 ms          0
READ COMMITTED          8,800        3.0 ms           11.2 ms          0
REPEATABLE READ         8,500        3.1 ms           12.8 ms          0.2
SERIALIZABLE            4,200        6.8 ms           45.3 ms          3.5

Key observations:

  1. READ COMMITTED to REPEATABLE READ is a modest drop (~5-10%). The cost is snapshot management.
  2. REPEATABLE READ to SERIALIZABLE is a dramatic drop (25-50%). The cost is dependency tracking (SSI) or lock contention (S2PL).
  3. MySQL's SERIALIZABLE (S2PL) shows much worse p99 latency than PostgreSQL's SERIALIZABLE (SSI) because blocking causes queuing effects.
  4. SSI's abort rate (4.8%) means your application must handle retries — but retries are cheaper than blocking.

Throughput Under Contention

The differences become more dramatic as contention increases:

Contention: % of transactions touching the same 100 rows (hot rows)

           Low Contention (1%)    Medium (10%)    High (50%)
           ────────────────────   ────────────    ──────────
RC (PG):        8,450               7,900           6,200
RR (PG):        7,920               6,800           4,100
SER/SSI (PG):   6,100               4,200           2,400
SER/S2PL (MY):  4,200               1,800             600

S2PL collapses under high contention because transactions
spend most of their time waiting for locks held by other
transactions that are also waiting for locks.

Decision Framework: Choosing the Right Isolation Level

Decision Tree

Start

├─ Can your app tolerate reading uncommitted data?
│  └─ Rarely yes → READ UNCOMMITTED (almost never appropriate)

├─ Can your app tolerate non-repeatable reads within a transaction?
│  ├─ Yes: Are transactions short (single statement)?
│  │  └─ Yes → READ COMMITTED ← Most web applications
│  │
│  └─ No: Do you need protection from write skew?
│     ├─ No → REPEATABLE READ / Snapshot Isolation
│     │      (handles non-repeatable reads, phantoms, lost updates)
│     │
│     └─ Yes: Is write skew a critical invariant violation?
│        ├─ Yes → SERIALIZABLE
│        │       (or explicit locking: SELECT FOR UPDATE)
│        │
│        └─ No → REPEATABLE READ with application-level checks
│               (e.g., optimistic locking with version columns)

Use Case Mapping

Use CaseRecommended LevelWhy
Simple web request (read, compute, respond)READ COMMITTEDEach statement sees latest data; no multi-read consistency needed
REST API with read-then-writeREAD COMMITTED + optimistic lockingVersion column catches stale writes
Financial transfer (debit + credit)SERIALIZABLE or SELECT FOR UPDATEMust prevent lost updates and write skew
Reporting / analytics queryREPEATABLE READNeeds consistent snapshot across long query
Inventory reservationSERIALIZABLE or SELECT FOR UPDATEMust prevent double-booking (write skew)
User registration (unique email)READ COMMITTED + UNIQUE constraintDatabase constraint handles the race
Batch processingREAD COMMITTEDEach statement in the batch sees latest data
Backup / logical replicationREPEATABLE READNeeds consistent snapshot of entire database
Auction / bidding systemSERIALIZABLEBid acceptance depends on current highest bid (write skew risk)

The SELECT FOR UPDATE Escape Hatch

For many applications, REPEATABLE READ plus explicit locking is more practical than SERIALIZABLE:

sql
-- Instead of relying on SERIALIZABLE to detect write skew:
BEGIN ISOLATION LEVEL REPEATABLE READ;

-- Explicitly lock the rows you're making a decision based on:
SELECT COUNT(*) FROM doctors
  WHERE on_call = true AND shift = 'night'
  FOR UPDATE;
-- This acquires exclusive locks on all matching rows.
-- Any concurrent transaction trying to SELECT FOR UPDATE
-- on the same rows will BLOCK until this transaction commits.

-- Now safe to go off call — no other transaction can change
-- the on-call status of these doctors while we hold the locks.
UPDATE doctors SET on_call = false WHERE id = 'alice';
COMMIT;

Trade-offs of SELECT FOR UPDATE vs SERIALIZABLE:

AspectSELECT FOR UPDATESERIALIZABLE
Developer effortMust identify which reads need lockingAutomatic detection
Risk of forgettingHigh — one missed FOR UPDATE = bugNone — database handles it
PerformanceBetter (only locks what you specify)Worse (tracks all reads)
Deadlock riskHigher (explicit locks)Lower (SSI) or same (S2PL)
Error handlingSame (retry on deadlock)Same (retry on serialization failure)

Real-World War Stories

War Story 1: The Double-Charge Bug

Company: E-commerce platform, PostgreSQL, READ COMMITTED (default).

Bug: Customers were occasionally charged twice for the same order. The payment flow was:

sql
-- Step 1: Check if order is already paid
SELECT status FROM orders WHERE id = $1;
-- Returns 'pending'

-- Step 2: Process payment (external API call, takes 2-3 seconds)
-- ... network call ...

-- Step 3: Mark order as paid
UPDATE orders SET status = 'paid' WHERE id = $1 AND status = 'pending';

What happened: Two concurrent requests for the same order both read status = 'pending' in Step 1. Both processed the payment. Both updated the status. The second UPDATE was a no-op (status was already 'paid'), but the payment was already processed.

Root cause: READ COMMITTED allows both transactions to read the same committed state. The UPDATE's WHERE clause (status = 'pending') should have prevented the second update, but the external API call was the real payment — the UPDATE was just a record-keeping step.

Fix: Use SELECT FOR UPDATE in Step 1 to block the second transaction until the first completes:

sql
BEGIN;
SELECT status FROM orders WHERE id = $1 FOR UPDATE;
-- Second transaction BLOCKS here until the first commits
-- ... process payment ...
UPDATE orders SET status = 'paid' WHERE id = $1;
COMMIT;

War Story 2: The Inventory Oversell

Company: Ticketing platform, MySQL InnoDB, REPEATABLE READ (default).

Bug: Concert tickets were oversold — 500 seats sold for a 400-capacity venue.

sql
-- Check available seats
SELECT COUNT(*) FROM bookings WHERE event_id = $1;
-- Returns 399 (1 seat left)

-- Book the seat
INSERT INTO bookings (event_id, user_id, seat) VALUES ($1, $2, 'GA');

What happened: Under REPEATABLE READ with snapshot reads, 100+ concurrent transactions all read COUNT(*) = 399 from their snapshot. All decided there was a seat available. All inserted a booking. The COUNT was stale — it came from the snapshot, not from the current state.

Root cause: Snapshot Isolation does not protect against this pattern. The read (COUNT) and the write (INSERT) operate on different rows, so there is no write-write conflict.

Fix: Use a counter with explicit locking, or use SERIALIZABLE:

sql
-- Fix 1: Explicit lock on a counter row
BEGIN;
SELECT remaining_seats FROM events WHERE id = $1 FOR UPDATE;
-- Returns 1 (blocks concurrent transactions)
UPDATE events SET remaining_seats = remaining_seats - 1
  WHERE id = $1 AND remaining_seats > 0;
-- Returns 1 row affected (or 0 if sold out)
INSERT INTO bookings (event_id, user_id) VALUES ($1, $2);
COMMIT;

-- Fix 2: PostgreSQL SERIALIZABLE
BEGIN ISOLATION LEVEL SERIALIZABLE;
SELECT COUNT(*) FROM bookings WHERE event_id = $1;
INSERT INTO bookings (event_id, user_id) VALUES ($1, $2);
COMMIT;
-- SSI detects the rw-conflict and aborts concurrent transactions

War Story 3: The Broken Backup

Company: SaaS platform, PostgreSQL, READ COMMITTED.

Bug: Database restores from logical backups (pg_dump) occasionally produced inconsistent data. Foreign key violations appeared after restore. Row counts in related tables did not match.

Root cause: pg_dump without --serializable-deferrable takes a READ COMMITTED snapshot. A multi-table dump runs multiple SELECT statements, each seeing a different snapshot. If data is being actively modified during the dump, the backup captures Table A at time T1 and Table B at time T2, where T2 > T1. Rows that were moved between tables during [T1, T2] are either duplicated or missing.

Fix:

bash
# Use serializable snapshot for consistent backup
pg_dump --serializable-deferrable mydb > backup.sql

# This starts a SERIALIZABLE DEFERRABLE transaction,
# which waits for a safe snapshot before beginning,
# then reads everything from that consistent point in time.
# The DEFERRABLE flag means it waits for a "safe" snapshot
# rather than potentially aborting.

War Story 4: The MySQL Locking Read Surprise

Company: Fintech startup, MySQL InnoDB, REPEATABLE READ.

Bug: Account balance calculations were intermittently wrong. The code used a mix of regular SELECTs and SELECT FOR UPDATE:

sql
BEGIN;

-- Regular SELECT (consistent read, from snapshot)
SELECT balance FROM accounts WHERE id = 1;
-- Returns $1000 (from transaction's snapshot)

-- ... some application logic ...

-- SELECT FOR UPDATE (locking read, sees current data!)
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
-- Returns $800 (current committed value — someone else withdrew $200!)

-- Developer expected both reads to return the same value
-- Application logic assumed balance was $1000, but the locked
-- read shows $800. Calculations based on $1000 are now wrong.

Root cause: MySQL InnoDB's REPEATABLE READ gives consistent reads from the snapshot but locking reads from the current state. The developer did not know about this split behavior.

Fix: Use only locking reads (FOR UPDATE) when you need both consistent reads and write locks. Or use PostgreSQL, which does not have this split behavior.

What Each Database Actually Provides

The ANSI isolation level names are a shared vocabulary, but each database implements them differently. Here is what you actually get:

DatabaseREAD UNCOMMITTEDREAD COMMITTEDREPEATABLE READSERIALIZABLE
PostgreSQLUpgraded to RCMVCC, per-statement snapshotSnapshot Isolation (prevents phantoms)SSI (true serializable)
MySQL InnoDBTrue RUMVCC + lockingSI + gap locks (split consistent/locking reads)S2PL (all reads are locking reads)
OracleNot supportedMVCC, per-statement snapshotNot supported (use SERIALIZABLE)Snapshot Isolation (NOT truly serializable!)
SQL ServerTrue RULock-based (S-locks released early)Lock-based (S-locks held to commit)Lock-based S2PL
SQL Server (RCSI)True RUMVCC (row versioning)Lock-basedLock-based S2PL
SQL Server (SI)True RUMVCC or Lock-basedLock-basedSnapshot Isolation (separate SNAPSHOT level)
CockroachDBNot supportedNot supportedNot supportedSSI (only level available)
YugabyteDBNot supportedMVCCSnapshot IsolationSSI (true serializable)

CockroachDB's Approach

CockroachDB only offers SERIALIZABLE isolation. Their philosophy: if you are going to use a distributed database, you should not have to worry about concurrency anomalies. The performance cost of SSI is acceptable when you are already paying the cost of distributed consensus.

Practical Retry Pattern for Serialization Failures

Any application using REPEATABLE READ or SERIALIZABLE must handle transaction aborts due to serialization failures. Here is a production-grade retry pattern:

typescript
import { Pool, PoolClient } from 'pg';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

interface RetryOptions {
  maxRetries: number;
  baseDelay: number;    // ms
  maxDelay: number;     // ms
  isolationLevel: 'READ COMMITTED' | 'REPEATABLE READ' | 'SERIALIZABLE';
}

const DEFAULT_OPTIONS: RetryOptions = {
  maxRetries: 5,
  baseDelay: 50,
  maxDelay: 5000,
  isolationLevel: 'SERIALIZABLE',
};

// Serialization failure error code in PostgreSQL
const SERIALIZATION_FAILURE = '40001';
const DEADLOCK_DETECTED = '40P01';

async function withSerializableTransaction<T>(
  fn: (client: PoolClient) => Promise<T>,
  options: Partial<RetryOptions> = {}
): Promise<T> {
  const opts = { ...DEFAULT_OPTIONS, ...options };
  let lastError: Error | null = null;

  for (let attempt = 1; attempt <= opts.maxRetries; attempt++) {
    const client = await pool.connect();
    try {
      await client.query(
        `BEGIN ISOLATION LEVEL ${opts.isolationLevel}`
      );
      const result = await fn(client);
      await client.query('COMMIT');
      return result;
    } catch (error: unknown) {
      await client.query('ROLLBACK');
      const pgError = error as { code?: string };

      if (
        pgError.code === SERIALIZATION_FAILURE ||
        pgError.code === DEADLOCK_DETECTED
      ) {
        lastError = error as Error;

        // Exponential backoff with jitter
        const delay = Math.min(
          opts.baseDelay * Math.pow(2, attempt - 1) +
            Math.random() * opts.baseDelay,
          opts.maxDelay
        );

        console.warn(
          `Serialization failure (attempt ${attempt}/${opts.maxRetries}), ` +
          `retrying in ${delay.toFixed(0)}ms...`
        );

        await new Promise((resolve) => setTimeout(resolve, delay));
        continue;
      }

      // Non-retryable error — rethrow
      throw error;
    } finally {
      client.release();
    }
  }

  throw new Error(
    `Transaction failed after ${opts.maxRetries} attempts. ` +
    `Last error: ${lastError?.message}`
  );
}

// Usage example: safe on-call update with serializable isolation
async function goOffCall(doctorId: string): Promise<boolean> {
  return withSerializableTransaction(async (client) => {
    const { rows } = await client.query(
      `SELECT COUNT(*) as count FROM doctors
       WHERE on_call = true AND shift = 'night'`
    );

    const onCallCount = parseInt(rows[0].count, 10);
    if (onCallCount <= 1) {
      // Cannot go off call — would violate the invariant
      return false;
    }

    await client.query(
      `UPDATE doctors SET on_call = false WHERE id = $1`,
      [doctorId]
    );

    return true;
    // If a concurrent transaction also goes off call,
    // PostgreSQL SSI will detect the write skew and abort one.
    // The retry loop will re-execute the function, which will
    // re-read the count and find only 1 doctor on call,
    // returning false.
  });
}

Common Misconceptions

Misconception 1: "REPEATABLE READ prevents all read anomalies"

REPEATABLE READ (as implemented by MVCC databases) prevents non-repeatable reads and phantom reads. But it does NOT prevent write skew. If your transaction reads data, makes a decision, and writes based on that decision, another concurrent transaction can make the same decision and write to different rows, violating the invariant.

Misconception 2: "SERIALIZABLE is too slow for production"

PostgreSQL's SSI-based SERIALIZABLE is much faster than lock-based SERIALIZABLE. In low-to-medium contention workloads, the throughput difference between REPEATABLE READ and SERIALIZABLE is 15-25%. Many production systems run at SERIALIZABLE with no issues. CockroachDB uses SERIALIZABLE exclusively.

Misconception 3: "My ORM handles isolation for me"

ORMs typically use the database's default isolation level (READ COMMITTED in PostgreSQL, REPEATABLE READ in MySQL). They do not automatically choose the right level for your transaction. They also do not retry on serialization failures. You must configure isolation levels and retry logic yourself.

Misconception 4: "Higher isolation = more locks = more deadlocks"

This is true for lock-based SERIALIZABLE (MySQL, SQL Server) but NOT for SSI (PostgreSQL). SSI does not use blocking locks for reads — it uses lightweight predicate locks that only record dependencies. SSI transactions never deadlock (though they can be aborted due to detected conflicts).

Misconception 5: "Oracle SERIALIZABLE is the same as PostgreSQL SERIALIZABLE"

Oracle's SERIALIZABLE is Snapshot Isolation — it does NOT prevent write skew. PostgreSQL's SERIALIZABLE is Serializable Snapshot Isolation — it DOES prevent write skew. They share the same name but provide fundamentally different guarantees.

Summary Table: Quick Reference

┌────────────────────────────────────────────────────────────────────────────┐
│                    ISOLATION LEVEL QUICK REFERENCE                         │
├──────────────┬──────────┬───────────┬─────────────┬───────────────────────┤
│   Anomaly    │ READ     │ READ      │ REPEATABLE  │ SERIALIZABLE          │
│              │ UNCOMM.  │ COMMITTED │ READ        │                       │
├──────────────┼──────────┼───────────┼─────────────┼───────────────────────┤
│ Dirty Read   │ Possible │ No        │ No          │ No                    │
│ Non-Repeat   │ Possible │ Possible  │ No          │ No                    │
│ Phantom      │ Possible │ Possible  │ No (MVCC*)  │ No                    │
│ Lost Update  │ Possible │ Possible  │ No (PG/MY)  │ No                    │
│ Read Skew    │ Possible │ Possible  │ No          │ No                    │
│ Write Skew   │ Possible │ Possible  │ POSSIBLE    │ No                    │
├──────────────┼──────────┼───────────┼─────────────┼───────────────────────┤
│ PG Default   │          │ ★         │             │                       │
│ MySQL Default│          │           │ ★           │                       │
│ Oracle Def.  │          │ ★         │             │                       │
├──────────────┼──────────┼───────────┼─────────────┼───────────────────────┤
│ Implementation│ (none)  │ Per-stmt  │ Per-txn     │ SSI (PG)              │
│              │          │ snapshot  │ snapshot    │ S2PL (MySQL)          │
├──────────────┼──────────┼───────────┼─────────────┼───────────────────────┤
│ Throughput   │ Highest  │ High      │ Medium-High │ Medium (SSI)          │
│              │          │           │             │ Low (S2PL)            │
└──────────────┴──────────┴───────────┴─────────────┴───────────────────────┘

 * MVCC-based REPEATABLE READ prevents phantoms for consistent reads.
   ANSI spec says phantoms are allowed at this level.
   MySQL gap locks prevent phantoms for locking reads.

 ★ = Default isolation level for this database

"What I cannot create, I do not understand." — Richard Feynman