Database Indexing Guide: How to Properly Index Your Database for Performance

Database indexes are the single most impactful tool for query performance optimization. A poorly indexed database can turn a query that should complete in milliseconds into one that takes minutes. But over-indexing creates write performance penalties and storage overhead. This guide teaches you when, how, and what to index — with practical examples in PostgreSQL and MySQL.

How Indexes Work (Conceptually)

Without an index, a database must perform a full table scan — examining every row to find matches. With a B-tree index (the most common type), the database maintains a sorted tree structure that allows it to find matching rows in O(log n) time instead of O(n). For a table with 10 million rows, that's the difference between examining 10 million rows and examining ~23.

The trade-off: indexes must be updated every time data changes (INSERT, UPDATE, DELETE). This adds write overhead and storage cost. The goal is to index columns that are frequently read and infrequently written.

Types of Database Indexes

B-Tree Index

Use for: equality, range, ORDER BY, LIKE 'prefix%'

Default type

The most common and versatile index type. Works for =, <, >, BETWEEN, IN, and prefix string matching. PostgreSQL and MySQL both use B-tree by default when you CREATE INDEX.

Hash Index

Use for: equality only (=)

Faster for =

Faster than B-tree for exact equality lookups but useless for range queries or ordering. Rarely used because B-tree is nearly as fast for equality and handles range queries too.

GIN Index (PostgreSQL)

Use for: JSONB, full-text search, arrays

Best for JSON/text

Generalized Inverted Index — essential for JSONB queries, full-text search (tsvector), and array containment queries. Significantly faster than B-tree for these use cases.

GiST Index (PostgreSQL)

Use for: geometric data, full-text, ranges

Specialized

Generalized Search Tree — used for PostGIS geographic queries, range types, and full-text search. More versatile than GIN but slower for pure text search.

Partial Index

Use for: frequently queried subsets

Huge space savings

Index only the rows matching a WHERE condition. Example: index only active users, not deleted ones. Dramatically reduces index size and improves read/write performance for the filtered set.

Composite Index

Use for: multi-column WHERE clauses

Order matters

Index on multiple columns. A composite index (a, b) can satisfy queries on (a), (a, b), but NOT on (b) alone. Place the highest-cardinality column or most-filtered column first.

Covering Index

Use for: queries reading only indexed columns

Eliminates table lookup

An index that includes all columns needed by a query — the database never needs to hit the main table. PostgreSQL supports INCLUDE columns in indexes (pg 11+). Extremely fast for read-heavy workloads.

Full-Text Index

Use for: text search queries

Search-specific

MySQL FULLTEXT or PostgreSQL tsvector+tsquery. Far better than LIKE '%term%' for text search. For serious full-text search, consider Elasticsearch or PostgreSQL's built-in FTS capabilities.

Practical Indexing Examples

Basic Index Creation

-- Standard B-tree index
CREATE INDEX idx_users_email ON users (email);

-- Unique index (also enforces uniqueness constraint)
CREATE UNIQUE INDEX idx_users_email_unique ON users (email);

-- Composite index (user_id + created_at for time-range queries per user)
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at DESC);

-- Partial index (only active sessions)
CREATE INDEX idx_sessions_active ON sessions (user_id, last_seen)
WHERE expired_at IS NULL;

-- Covering index with INCLUDE (pg 11+)
CREATE INDEX idx_orders_covering ON orders (user_id)
INCLUDE (total_amount, status, created_at);

-- GIN index for JSONB
CREATE INDEX idx_products_meta ON products USING GIN (metadata);

Finding Missing Indexes with EXPLAIN ANALYZE

-- Always use EXPLAIN ANALYZE to see actual execution plans
EXPLAIN ANALYZE
SELECT * FROM orders
WHERE user_id = 12345
  AND created_at > '2026-01-01'
ORDER BY created_at DESC
LIMIT 20;

-- Look for "Seq Scan" (bad) vs "Index Scan" (good) vs "Index Only Scan" (best)
-- High rows/loops ratio indicates a missing index
-- "cost=0.00..XXX" — the second number is the estimated cost; higher = slower

Finding Unused Indexes (PostgreSQL)

-- Find unused indexes (zero or very few scans)
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan < 10
ORDER BY pg_relation_size(indexrelid) DESC;

-- Unused indexes waste storage and slow down writes — DROP them!

The Index Cardinality Rule

High-cardinality columns (many unique values) benefit most from indexes. A boolean column (true/false) has low cardinality — an index on it is nearly useless since the database will need to read ~50% of rows anyway. A UUID or email column has very high cardinality — indexes shine here.

Over-Indexing Warning

Every index adds overhead to INSERT, UPDATE, and DELETE operations. A table with 15 indexes on a write-heavy workload can perform 3–5x slower on writes than a properly indexed table with 3–5 indexes. Regularly audit your indexes and drop any that aren't being used (see the unused indexes query above). In PostgreSQL, you can use pg_stat_user_indexes to track index usage since the last stats reset.

Database Indexing: How to Properly Index for Maximum Performance