Database Migrations
Every production database needs versioned, repeatable schema changes. Hibernate's ddl-auto is for prototyping — it cannot rename columns, migrate data, or handle the dozens of edge cases that real schema evolution requires. Database migration tools bring version control discipline to your schema: every change is a numbered file, applied once, tracked in a metadata table, and reversible.
This page covers Flyway (the most popular choice in Spring Boot) and Liquibase, along with zero-downtime migration patterns that matter when your database serves millions of requests.
Flyway Setup
Dependencies
<!-- pom.xml -->
<dependency>
<groupId>org.flywaydb</groupId>
<artifactId>flyway-core</artifactId>
</dependency>
<!-- For PostgreSQL-specific features -->
<dependency>
<groupId>org.flywaydb</groupId>
<artifactId>flyway-database-postgresql</artifactId>
</dependency>Configuration
# application.yml
spring:
flyway:
enabled: true
locations: classpath:db/migration
baseline-on-migrate: true # For existing databases without Flyway history
baseline-version: '0'
validate-on-migrate: true # Verify checksums of applied migrations
out-of-order: false # Don't allow out-of-order migrations in prod
clean-disabled: true # NEVER allow clean in production!
table: flyway_schema_history
schemas:
- public
jpa:
hibernate:
ddl-auto: validate # Validate schema matches entities AFTER migration
datasource:
url: jdbc:postgresql://localhost:5432/myapp
username: ${DB_USERNAME:postgres}
password: ${DB_PASSWORD:postgres}Never set clean-disabled to false in production
flyway.clean-disabled: false allows flyway:clean to drop ALL objects in the schema. A single accidental command or misconfigured CI pipeline can destroy your entire database. Always keep this true in production.
Migration File Naming
src/main/resources/db/migration/
├── V1__create_users_table.sql
├── V2__create_products_table.sql
├── V3__create_orders_table.sql
├── V4__add_email_index_to_users.sql
├── V5__add_category_to_products.sql
├── V6__create_reviews_table.sql
├── V7__add_full_text_search_index.sql
├── R__seed_categories.sql # Repeatable migration
└── R__create_views.sql # Repeatable migrationNaming convention: V{version}__{description}.sql
| Prefix | Meaning | Behavior |
|---|---|---|
V | Versioned | Runs once, in order. Checksum is tracked. |
R | Repeatable | Runs every time the checksum changes. Good for views, functions, seed data. |
U | Undo (paid) | Reverses a versioned migration. Flyway Teams only. |
Migration Examples
-- V1__create_users_table.sql
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email VARCHAR(255) NOT NULL,
first_name VARCHAR(100) NOT NULL,
last_name VARCHAR(100) NOT NULL,
password_hash VARCHAR(255) NOT NULL,
role VARCHAR(50) NOT NULL DEFAULT 'USER',
active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
CONSTRAINT uk_users_email UNIQUE (email)
);
CREATE INDEX idx_users_email ON users (email);
CREATE INDEX idx_users_role ON users (role) WHERE active = TRUE;-- V2__create_products_table.sql
CREATE TABLE products (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name VARCHAR(200) NOT NULL,
description TEXT,
price NUMERIC(12, 2) NOT NULL CHECK (price >= 0),
sku VARCHAR(20) NOT NULL,
category VARCHAR(50) NOT NULL,
stock_quantity INTEGER NOT NULL DEFAULT 0 CHECK (stock_quantity >= 0),
active BOOLEAN NOT NULL DEFAULT TRUE,
version BIGINT NOT NULL DEFAULT 0,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
created_by VARCHAR(255),
updated_by VARCHAR(255),
CONSTRAINT uk_products_sku UNIQUE (sku)
);
CREATE INDEX idx_products_category ON products (category);
CREATE INDEX idx_products_price ON products (price);
CREATE INDEX idx_products_active ON products (active) WHERE active = TRUE;-- V3__create_orders_table.sql
CREATE TABLE orders (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
customer_id UUID NOT NULL REFERENCES users(id),
status VARCHAR(50) NOT NULL DEFAULT 'PENDING',
total NUMERIC(12, 2) NOT NULL,
notes TEXT,
created_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW()
);
CREATE TABLE order_items (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
order_id UUID NOT NULL REFERENCES orders(id) ON DELETE CASCADE,
product_id UUID NOT NULL REFERENCES products(id),
quantity INTEGER NOT NULL CHECK (quantity > 0),
unit_price NUMERIC(12, 2) NOT NULL,
subtotal NUMERIC(12, 2) NOT NULL
);
CREATE INDEX idx_orders_customer ON orders (customer_id);
CREATE INDEX idx_orders_status ON orders (status);
CREATE INDEX idx_orders_created ON orders (created_at);
CREATE INDEX idx_order_items_order ON order_items (order_id);
CREATE INDEX idx_order_items_product ON order_items (product_id);-- V4__add_full_text_search.sql
-- PostgreSQL full-text search index
ALTER TABLE products ADD COLUMN search_vector TSVECTOR;
CREATE INDEX idx_products_search ON products USING GIN (search_vector);
-- Trigger to keep search_vector updated
CREATE OR REPLACE FUNCTION products_search_trigger()
RETURNS TRIGGER AS $$
BEGIN
NEW.search_vector := to_tsvector('english',
COALESCE(NEW.name, '') || ' ' || COALESCE(NEW.description, ''));
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER trg_products_search
BEFORE INSERT OR UPDATE OF name, description ON products
FOR EACH ROW
EXECUTE FUNCTION products_search_trigger();
-- Backfill existing rows
UPDATE products SET search_vector = to_tsvector('english',
COALESCE(name, '') || ' ' || COALESCE(description, ''));-- R__seed_categories.sql (Repeatable)
-- Runs every time this file changes
INSERT INTO categories (name, slug, sort_order)
VALUES
('Electronics', 'electronics', 1),
('Clothing', 'clothing', 2),
('Books', 'books', 3),
('Home & Garden', 'home-garden', 4),
('Sports', 'sports', 5)
ON CONFLICT (slug) DO UPDATE
SET name = EXCLUDED.name,
sort_order = EXCLUDED.sort_order;Java-Based Migrations
For complex data migrations that need application logic:
package db.migration;
import org.flywaydb.core.api.migration.BaseJavaMigration;
import org.flywaydb.core.api.migration.Context;
import java.sql.*;
/**
* V5: Migrate user full names from single column to first/last name.
* File: src/main/java/db/migration/V5__split_user_names.java
*/
public class V5__split_user_names extends BaseJavaMigration {
@Override
public void migrate(Context context) throws Exception {
Connection conn = context.getConnection();
// Add new columns
try (Statement stmt = conn.createStatement()) {
stmt.execute("ALTER TABLE users ADD COLUMN IF NOT EXISTS first_name VARCHAR(100)");
stmt.execute("ALTER TABLE users ADD COLUMN IF NOT EXISTS last_name VARCHAR(100)");
}
// Migrate data
try (Statement select = conn.createStatement();
PreparedStatement update = conn.prepareStatement(
"UPDATE users SET first_name = ?, last_name = ? WHERE id = ?")) {
ResultSet rs = select.executeQuery("SELECT id, full_name FROM users");
int batch = 0;
while (rs.next()) {
UUID id = (UUID) rs.getObject("id");
String fullName = rs.getString("full_name");
String[] parts = splitName(fullName);
update.setString(1, parts[0]);
update.setString(2, parts[1]);
update.setObject(3, id);
update.addBatch();
if (++batch % 1000 == 0) {
update.executeBatch();
}
}
update.executeBatch();
}
// Drop old column
try (Statement stmt = conn.createStatement()) {
stmt.execute("ALTER TABLE users DROP COLUMN full_name");
stmt.execute("ALTER TABLE users ALTER COLUMN first_name SET NOT NULL");
stmt.execute("ALTER TABLE users ALTER COLUMN last_name SET NOT NULL");
}
}
private String[] splitName(String fullName) {
if (fullName == null || fullName.isBlank()) {
return new String[]{"Unknown", "Unknown"};
}
int lastSpace = fullName.lastIndexOf(' ');
if (lastSpace < 0) {
return new String[]{fullName, ""};
}
return new String[]{
fullName.substring(0, lastSpace),
fullName.substring(lastSpace + 1)
};
}
}Zero-Downtime Migrations
When your database serves live traffic, you cannot lock tables or make breaking changes. Every migration must be backward-compatible.
Pattern: Expand-Contract Migration
Example: Renaming a column from name to title
-- Migration 1: Add new column (backward compatible)
-- V10__add_title_column.sql
ALTER TABLE products ADD COLUMN title VARCHAR(200);
-- Migration 2: Backfill (can run in background)
-- V11__backfill_title.sql
UPDATE products SET title = name WHERE title IS NULL;
-- After deploying code that writes to both columns and reads from 'title':
-- Migration 3: Make title NOT NULL, drop name (breaking for old code)
-- V12__drop_name_column.sql
ALTER TABLE products ALTER COLUMN title SET NOT NULL;
ALTER TABLE products DROP COLUMN name;Safe vs. Unsafe Operations
| Operation | Safe? | Notes |
|---|---|---|
| ADD COLUMN (nullable) | Safe | No table lock |
| ADD COLUMN (with DEFAULT) | Safe (PG 11+) | PG 11+ does not rewrite table |
| DROP COLUMN | Unsafe | Old code may reference it |
| RENAME COLUMN | Unsafe | Old code breaks immediately |
| ADD INDEX CONCURRENTLY | Safe | Use CONCURRENTLY keyword |
| ADD NOT NULL constraint | Risky | Requires full table scan |
| ALTER COLUMN TYPE | Unsafe | May lock table, rewrite data |
Always use CREATE INDEX CONCURRENTLY
Standard CREATE INDEX locks the table for writes. On a 100M row table, this can take minutes. Use CREATE INDEX CONCURRENTLY in PostgreSQL — it builds the index without blocking writes. Note: this cannot run inside a transaction, so use Flyway's non-transactional mode.
-- V15__add_index_concurrently.sql
-- Flyway: disable transaction for CONCURRENTLY
-- flyway:executeInTransaction=false
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_orders_created_status
ON orders (created_at, status);Liquibase Comparison
Liquibase is the main alternative to Flyway. It uses XML/YAML/JSON changelogs instead of raw SQL:
<!-- db/changelog/db.changelog-master.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog
xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-latest.xsd">
<changeSet id="1" author="dev">
<createTable tableName="users">
<column name="id" type="UUID" defaultValueComputed="gen_random_uuid()">
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="email" type="VARCHAR(255)">
<constraints nullable="false" unique="true"/>
</column>
<column name="first_name" type="VARCHAR(100)">
<constraints nullable="false"/>
</column>
<column name="created_at" type="TIMESTAMP WITH TIME ZONE"
defaultValueComputed="NOW()">
<constraints nullable="false"/>
</column>
</createTable>
</changeSet>
<changeSet id="2" author="dev">
<addColumn tableName="users">
<column name="phone" type="VARCHAR(20)"/>
</addColumn>
<rollback>
<dropColumn tableName="users" columnName="phone"/>
</rollback>
</changeSet>
</databaseChangeLog>Flyway vs. Liquibase
| Feature | Flyway | Liquibase |
|---|---|---|
| Migration format | Raw SQL (native) | XML/YAML/JSON/SQL |
| Database portability | Write SQL per DB | Abstraction layer |
| Rollback | Manual (free), auto (paid) | Built-in rollback |
| Learning curve | Minimal | Moderate |
| Diff generation | No | Yes |
| Conditional execution | Limited | Preconditions |
| Spring Boot support | Auto-configured | Auto-configured |
Choose Flyway for Spring Boot
Unless you need database portability or built-in rollback, Flyway is the better choice for Spring Boot projects. Raw SQL migrations are easier to write, review, and debug. You already know SQL — you do not need an abstraction on top of it.
Multi-Tenant Migrations
For SaaS applications with per-tenant schemas:
@Configuration
public class FlywayMultiTenantConfig {
@Bean
public Flyway flyway(DataSource dataSource, TenantRegistry tenantRegistry) {
// Migrate the shared schema first
Flyway sharedFlyway = Flyway.configure()
.dataSource(dataSource)
.locations("classpath:db/migration/shared")
.schemas("shared")
.load();
sharedFlyway.migrate();
// Migrate each tenant schema
for (String tenantId : tenantRegistry.getAllTenantIds()) {
Flyway tenantFlyway = Flyway.configure()
.dataSource(dataSource)
.locations("classpath:db/migration/tenant")
.schemas("tenant_" + tenantId)
.load();
tenantFlyway.migrate();
}
return sharedFlyway;
}
}Testing Migrations
@SpringBootTest
@Testcontainers
class FlywayMigrationTest {
@Container
static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>("postgres:16")
.withDatabaseName("testdb");
@DynamicPropertySource
static void configureProperties(DynamicPropertyRegistry registry) {
registry.add("spring.datasource.url", postgres::getJdbcUrl);
registry.add("spring.datasource.username", postgres::getUsername);
registry.add("spring.datasource.password", postgres::getPassword);
}
@Autowired
private Flyway flyway;
@Test
void allMigrationsApplyCleanly() {
// Flyway runs automatically on startup
// If we get here without exception, all migrations applied successfully
MigrationInfoService info = flyway.info();
assertThat(info.applied()).isNotEmpty();
assertThat(info.pending()).isEmpty();
// Verify no failed migrations
for (MigrationInfo migration : info.all()) {
assertThat(migration.getState())
.isNotIn(MigrationState.FAILED, MigrationState.UNDONE);
}
}
@Test
void entityMappingsMatchSchema() {
// JPA validate mode checks this at startup
// If the test context loads, entities match the schema
}
}Further Reading
- Spring Data JPA — Entity mapping that works with your migrations
- Hibernate Performance Tuning — Indexes and query optimization
- Docker & Deployment — Running migrations in containers
- Testing — Testing with Testcontainers
Common Pitfalls
Pitfall 1: Using Hibernate ddl-auto in production
Setting ddl-auto to create, create-drop, or update in production can drop tables, create incorrect schemas, or apply irreversible changes without tracking. Fix: Always use ddl-auto: validate in production (verifies schema matches entities) and manage all schema changes through Flyway or Liquibase migrations.
Pitfall 2: Editing already-applied migration files
Modifying a migration that has been applied to any environment causes checksum validation failures and blocks application startup. Fix: Never edit applied migrations. Create a new migration for any changes. If you must fix a checksum mismatch in development, use flyway:repair but never in production.
Pitfall 3: Running CREATE INDEX without CONCURRENTLY
Standard CREATE INDEX locks the table for writes. On large tables with production traffic, this can block writes for minutes. Fix: Use CREATE INDEX CONCURRENTLY in PostgreSQL. Note this cannot run inside a transaction, so add -- flyway:executeInTransaction=false to the migration file.
Pitfall 4: Making breaking schema changes in a single migration
Renaming a column or dropping a column while old application versions are still running causes immediate failures. Fix: Use the expand-contract pattern: (1) Add new column. (2) Deploy code that writes to both. (3) Backfill data. (4) Deploy code that reads from new column. (5) Drop old column in a later migration.
Pitfall 5: Not testing migrations against a real database
Running migrations only against H2 in tests misses PostgreSQL-specific syntax, data type differences, and constraint behaviors. Fix: Use Testcontainers with a real PostgreSQL (or your production database) image in integration tests to verify all migrations apply cleanly.
Pitfall 6: Setting clean-disabled to false in production
flyway.clean-disabled: false allows flyway:clean to drop ALL objects in the schema. One accidental command or CI misconfiguration destroys the entire database. Fix: Always set flyway.clean-disabled: true in production environments. Use environment-specific configuration to ensure this is never overridden.
Interview Questions
Q1: What is the difference between Flyway and Hibernate ddl-auto?
Answer
Hibernate ddl-auto automatically generates DDL from entity mappings at startup. It is useful for prototyping but dangerous for production because it cannot rename columns, migrate data, or handle complex schema evolution. Flyway (or Liquibase) applies versioned, hand-written SQL migration scripts tracked in a metadata table. Each migration runs exactly once, checksums are verified, and the migration history provides a complete audit trail. Use ddl-auto: validate with Flyway to ensure entities match the migrated schema.
Q2: How do you handle zero-downtime database migrations?
Answer
Use the expand-contract pattern. In the expand phase, add new columns or tables (backward-compatible changes). Deploy the new application version that writes to both old and new structures. Backfill existing data. In the contract phase, after all instances use the new structure, drop the old columns. Key rules: never rename columns directly, never drop columns that running code depends on, use CREATE INDEX CONCURRENTLY to avoid table locks, and add NOT NULL constraints only after backfilling data.
Q3: What is the difference between versioned (V) and repeatable (R) migrations in Flyway?
Answer
Versioned migrations (prefix V) run exactly once, in version order, and their checksums are tracked. They are used for schema changes like creating tables, adding columns, and data migrations. Repeatable migrations (prefix R) run every time their checksum changes. They are used for objects that can be recreated: views, functions, stored procedures, and seed data (using ON CONFLICT DO UPDATE). Repeatable migrations always run after all versioned migrations.
Q4: How do you handle database migrations in a multi-tenant SaaS application?
Answer
The approach depends on the tenancy strategy. For discriminator-column tenancy, a single set of migrations runs on the shared schema. For schema-per-tenant, iterate over all tenant schemas and apply tenant-specific migrations to each using Flyway's schema configuration. For database-per-tenant, each tenant database runs its own Flyway instance. In all cases, shared infrastructure migrations (the tenant registry, billing tables) run on a shared schema before tenant-specific migrations.
Q5: What are Java-based migrations in Flyway and when should you use them?
Answer
Java-based migrations extend BaseJavaMigration and implement the migrate(Context) method with raw JDBC. They are named like V5__split_user_names.java and placed in the db.migration package. Use them when a migration requires application logic (parsing, transforming data, calling external APIs), conditional execution based on existing data, complex data transformations that are painful in raw SQL, or batch processing with memory management (flush/clear patterns). For simple schema changes, SQL migrations are simpler and preferred.