Migrating from MySQL to PostgreSQL is one of the most common database transitions in modern software development — driven by PostgreSQL's superior handling of complex queries, better JSON support, more advanced indexing, and stronger ACID compliance. This guide covers the complete migration process from initial planning to production cutover, with real-world considerations and zero-downtime strategies.

Case Study: E-commerce Database Migration

MySQL 5.7 → PostgreSQL 15 · 500GB of data · Zero downtime · 2.1M customer records

A Texas-based retail company with 2.1M+ customer records needed to migrate from MySQL 5.7 to PostgreSQL 15 with zero downtime. Their legacy MySQL setup struggled with complex reporting queries that took 45+ seconds; after migration to PostgreSQL with proper indexing, the same queries ran in under 2 seconds. The project took 3 weeks of planning, and the final cutover fit inside a 4-hour maintenance window. Here's how they did it.

Phase 1: Pre-Migration Assessment

Before writing a single migration command, conduct a thorough assessment of your existing MySQL database:

Step 1.1 — Audit Your Schema

Identify MySQL-specific features that don't have direct PostgreSQL equivalents: AUTO_INCREMENT (PostgreSQL uses SERIAL or IDENTITY), ENUM types (PostgreSQL has native ENUM but with different syntax), unsigned integers (PostgreSQL has no unsigned integers — use CHECK constraints instead), and MySQL-specific functions.

-- Audit MySQL tables and their sizes
SELECT table_name,
       table_rows,
       data_length,
       (data_length + index_length) / 1024 / 1024 AS size_mb
FROM information_schema.tables
WHERE table_schema = 'your_database'
ORDER BY size_mb DESC;

Step 1.2 — Identify Incompatible Data Types

Common MySQL → PostgreSQL data type conversions:

  • TINYINT(1) → BOOLEAN
  • DATETIME → TIMESTAMP
  • TEXT → TEXT (same)
  • JSON → JSONB (better in Postgres)
  • MEDIUMTEXT/LONGTEXT → TEXT
  • DOUBLE → DOUBLE PRECISION
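For scripted schema checks, these mappings are easy to encode in a small lookup helper (a minimal sketch; `TYPE_MAP` and `pg_type` are hypothetical names, not part of pgloader or any other tool):

```python
# Hypothetical lookup table for the MySQL -> PostgreSQL type mappings above.
TYPE_MAP = {
    "TINYINT(1)": "BOOLEAN",
    "DATETIME": "TIMESTAMP",
    "TEXT": "TEXT",
    "JSON": "JSONB",
    "MEDIUMTEXT": "TEXT",
    "LONGTEXT": "TEXT",
    "DOUBLE": "DOUBLE PRECISION",
}

def pg_type(mysql_type: str) -> str:
    """Return the PostgreSQL equivalent, or the input type unchanged
    when no remapping is needed (e.g. VARCHAR(255))."""
    return TYPE_MAP.get(mysql_type.upper(), mysql_type)
```

A helper like this is mostly useful for generating a review checklist from `information_schema.columns`; pgloader applies its own casting rules during the actual migration.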

Step 1.3 — Test Application Query Compatibility

Run pgloader or a custom script against a sample of your application queries to identify incompatible SQL syntax. MySQL-specific syntax that needs updating: backtick identifiers (PostgreSQL uses double quotes), LIMIT N,M (PostgreSQL uses LIMIT M OFFSET N), and GROUP BY rules (PostgreSQL is stricter about non-aggregated columns).
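As a first pass, a rough regex-based rewriter can flag the two most mechanical differences (an illustrative sketch only, not a real SQL parser; it will happily mangle backticks or LIMIT clauses that appear inside string literals):

```python
import re

def mysqlish_to_postgres(sql: str) -> str:
    """Rewrite two common MySQL-isms into PostgreSQL syntax.
    Use the output to flag queries for manual review, not as-is."""
    # Backtick identifiers -> double-quoted identifiers
    sql = re.sub(r"`([^`]+)`", r'"\1"', sql)
    # LIMIT offset, count  ->  LIMIT count OFFSET offset
    sql = re.sub(
        r"\bLIMIT\s+(\d+)\s*,\s*(\d+)",
        r"LIMIT \2 OFFSET \1",
        sql,
        flags=re.IGNORECASE,
    )
    return sql
```

Running a script like this over your query log gives you a worklist; each flagged query still needs a human look, especially for GROUP BY strictness, which regexes cannot judge.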

Phase 2: Setting Up the Target Database

# Create the PostgreSQL database (shell)
createdb --encoding=UTF8 --locale=en_US.UTF-8 your_database_pg

# Install pgloader (Debian/Ubuntu)
apt-get install pgloader

-- Basic pgloader configuration file:
LOAD DATABASE
     FROM mysql://user:pass@localhost/source_db
     INTO postgresql://user:pass@localhost/target_db
WITH include drop, create tables, create indexes,
     reset sequences, foreign keys
SET work_mem to '128MB',
    maintenance_work_mem to '512MB'
CAST type tinyint to boolean drop typemod;

Phase 3: Schema Conversion

Manual schema review is essential even with automated tools. pgloader handles most conversions automatically, but you need to review:

-- MySQL AUTO_INCREMENT equivalent in PostgreSQL
-- MySQL:
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL
);

-- PostgreSQL (modern syntax):
CREATE TABLE users (
    id INTEGER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
    email VARCHAR(255) NOT NULL
);

-- PostgreSQL JSONB instead of MySQL JSON (better performance):
ALTER TABLE orders
    ALTER COLUMN metadata TYPE JSONB
    USING metadata::jsonb;

Phase 4: Data Migration

For large databases (100GB+), a direct migration approach risks extended downtime. Use this dual-write strategy for zero-downtime migration:

  1. Phase 4A: Run pgloader to copy all historical data to PostgreSQL (this can run for hours; the application stays on MySQL).
  2. Phase 4B: Implement dual-write — your application writes to both MySQL and PostgreSQL simultaneously. Reads still come from MySQL.
  3. Phase 4C: Verify data consistency — compare row counts, checksums, and run your test suite against PostgreSQL.
  4. Phase 4D: Switch reads to PostgreSQL while dual-write continues.
  5. Phase 4E: Monitor for issues for 24–72 hours.
  6. Phase 4F: Stop MySQL writes, PostgreSQL is now primary.
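At the application layer, Phase 4B can be sketched as a thin wrapper around two connections (a minimal sketch; `DualWriter`, its connection objects, and the failure-handling policy are illustrative assumptions, not the case-study company's actual implementation):

```python
class DualWriter:
    """Phase 4B sketch: write to both databases, read from the primary.
    Assumes connection-like objects exposing execute() and query()."""

    def __init__(self, mysql_conn, pg_conn, read_from="mysql"):
        self.mysql = mysql_conn
        self.pg = pg_conn
        self.read_from = read_from  # flip to "pg" in Phase 4D

    def write(self, mysql_sql, pg_sql, params=()):
        # MySQL remains the source of truth: its failure aborts the request.
        self.mysql.execute(mysql_sql, params)
        try:
            self.pg.execute(pg_sql, params)
        except Exception as exc:
            # A PostgreSQL failure is recorded, not raised, so the
            # application keeps serving traffic while you investigate drift.
            print(f"pg dual-write failed: {exc}")

    def read(self, sql, params=()):
        conn = self.pg if self.read_from == "pg" else self.mysql
        return conn.query(sql, params)
```

The key design choice is asymmetry: an error on the new database must never break user-facing writes, but every such error must be surfaced so you can reconcile before Phase 4D.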

Dual-Write Implementation Tip

Implement dual-write at the data layer (not application layer) using a proxy like PgBouncer or a custom database adapter. This keeps application code clean. Alternatively, use triggers in MySQL to replicate changes via Debezium CDC (Change Data Capture) → Kafka → PostgreSQL.

Phase 5: Application Query Updates

-- MySQL-specific syntax to fix for PostgreSQL:

-- 1. Backtick identifiers → double quotes
-- MySQL:    SELECT `user_id` FROM `users`
-- Postgres: SELECT user_id FROM users (or "user_id" if reserved keyword)

-- 2. LIMIT with offset
-- MySQL:    LIMIT 20, 10  (skip 20, take 10)
-- Postgres: LIMIT 10 OFFSET 20

-- 3. GROUP BY strictness
-- MySQL allows: SELECT name, COUNT(*) FROM t GROUP BY id
-- Postgres requires all non-aggregated columns in GROUP BY:
--     SELECT name, COUNT(*) FROM t GROUP BY id, name

-- 4. String concatenation
-- MySQL:    CONCAT(first, ' ', last)
-- Postgres: first || ' ' || last  (CONCAT works too)

-- 5. Date functions
-- MySQL:    NOW(), DATE_FORMAT(col, '%Y-%m')
-- Postgres: NOW(), TO_CHAR(col, 'YYYY-MM')

Phase 6: Testing Strategy

Never skip comprehensive testing before cutover:

  • Data integrity checks: Row counts on every table, checksum validation on key columns, foreign key validation
  • Performance benchmarks: Run your 20 most-executed queries on both databases and compare execution times
  • Application integration tests: Run your full test suite against the PostgreSQL instance
  • Load testing: Simulate peak traffic against PostgreSQL before cutover
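The row-count portion of the data integrity checks can be sketched as a small comparison helper (`compare_counts` is a hypothetical name; it assumes you have already run SELECT COUNT(*) per table on each side and collected the results into dicts):

```python
def compare_counts(mysql_counts: dict, pg_counts: dict) -> list:
    """Return (table, mysql_rows, pg_rows) tuples for every table whose
    row counts differ, including tables present on only one side
    (reported with None for the missing side)."""
    mismatches = []
    for table in sorted(set(mysql_counts) | set(pg_counts)):
        m, p = mysql_counts.get(table), pg_counts.get(table)
        if m != p:
            mismatches.append((table, m, p))
    return mismatches
```

An empty result is necessary but not sufficient: follow it with checksum validation on key columns, since equal counts can still hide divergent row contents.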

Common Migration Pitfalls

  • MySQL's default collations compare strings case-insensitively, while PostgreSQL comparisons are case-sensitive; check all LIKE queries and WHERE clauses (ILIKE or the citext extension can help)
  • MySQL NULL handling in certain aggregate functions differs from PostgreSQL
  • MySQL's GROUP BY with ONLY_FULL_GROUP_BY disabled allows queries that PostgreSQL will reject
  • Timezone handling: MySQL often stores timestamps in server timezone; PostgreSQL is stricter about timezone-aware timestamps

Phase 7: Post-Migration Optimization

After migration, take advantage of PostgreSQL features you couldn't use in MySQL:

-- Partial indexes (not available in MySQL)
CREATE INDEX idx_active_users ON users (email)
WHERE status = 'active';  -- only indexes active users

-- JSONB queries with a GIN index (far better than MySQL JSON)
CREATE INDEX idx_metadata_gin ON orders USING GIN (metadata);
SELECT * FROM orders WHERE metadata @> '{"status": "shipped"}';

-- CTEs (Common Table Expressions), far more powerful in Postgres
WITH monthly_revenue AS (
    SELECT DATE_TRUNC('month', created_at) AS month,
           SUM(amount) AS total
    FROM orders
    GROUP BY 1
)
SELECT month,
       total,
       total - LAG(total) OVER (ORDER BY month) AS growth
FROM monthly_revenue;