PostgreSQL EXPLAIN ANALYZE: Read Plans & Fix Slow Queries

TL;DR

Find the worst queries with pg_stat_statements, run them under EXPLAIN (ANALYZE, BUFFERS), and find the plan node that owns the wall-clock time. Ninety percent of slow queries are fixed by one index, one query rewrite, or a fresh ANALYZE.

What you’ll learn

The difference between EXPLAIN and EXPLAIN ANALYZE
How to find the queries that actually matter with pg_stat_statements
How to read a plan without memorizing every node type
The three fixes: indexes, query shape, and statistics
How to read a real plan line by line

Slow database queries rarely announce themselves. The app just feels sluggish, one endpoint times out under load, and everyone suspects the framework, the network, or the phase of the moon. PostgreSQL will tell you precisely what's slow and why - if you ask it correctly. This is the workflow we use: find the offenders with pg_stat_statements, understand them with EXPLAIN ANALYZE, and fix them with one of three well-aimed changes.

In a real plan, one node usually owns nearly all the time - find it, then fix that node.

EXPLAIN ANALYZE is the single most useful diagnostic command in Postgres. Plain EXPLAIN prints the plan the query planner intends to run, with cost estimates. Add ANALYZE and Postgres actually executes the query and reports the real timing and row count at every step of the plan - so you can see exactly where the wall-clock time goes instead of guessing. (Because it really runs the query, use it on a replica or inside a rolled-back transaction when the statement writes.) The rest of this post is how to read that output and act on it.

Step 1: Find the queries that actually matter

Don't guess from application logs. The pg_stat_statements extension aggregates every query's execution count and total time. Enable it once, then ask for the top offenders by total time - a metric that automatically balances 'very slow' against 'very frequent':

sql

CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

SELECT round(total_exec_time) AS total_ms,
       calls,
       round(mean_exec_time, 1) AS mean_ms,
       left(query, 90) AS query
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

The output almost always surprises. The 'slow report query' everyone blames is frequently nowhere near the top; the real cost is a mid-speed query on a hot path, or an N+1 pattern where the ORM fires the same 5 ms query five hundred times per request. Fix in order of total_ms, top to bottom.

Step 2: Read the plan, not the tea leaves

Take the worst query and run it under EXPLAIN ANALYZE with buffers enabled:

sql

EXPLAIN (ANALYZE, BUFFERS)
SELECT o.id, o.total, c.email
FROM orders o
JOIN customers c ON c.id = o.customer_id
WHERE o.created_at > now() - interval '7 days'
  AND o.status = 'paid';

You don't need to understand every node type. Scan for three signals, in priority order:

Seq Scan on a large table. PostgreSQL read every row because no usable index existed. If the filter is selective (returns a small fraction of rows), this is your fix - add the index.
Estimated vs actual rows wildly different. 'rows=100' estimated but 'actual rows=2,000,000' means the planner is flying blind on stale statistics. Run ANALYZE tablename; and re-plan before touching anything else.
Nested Loop with a huge inner count. A join executed row-by-row thousands of times, usually downstream of a bad estimate or a missing index on the join key.

Step 3: Apply the right fix

Fix A: the missing or wrong index

Most slow queries are one index away from fast. Match the index to the query's shape: equality columns first, then range columns. For the query above:

sql

-- status is an equality filter, created_at a range filter:
CREATE INDEX CONCURRENTLY idx_orders_status_created
ON orders (status, created_at);

CONCURRENTLY builds the index without locking writes - essential on a production table. Re-run the EXPLAIN afterwards and confirm the Seq Scan became an Index Scan and the runtime dropped. If it didn't, the planner had a reason; check the estimate mismatch signal again.

Fix B: reshape the query

When an index exists but can't be used, the query shape is usually at fault:

Functions on indexed columns - WHERE lower(email) = $1 needs an expression index on lower(email), or store the value normalised.
N+1 loops - replace per-row lookups with one query using WHERE id = ANY($1) or a join. This is an application change, but it's the single biggest win in ORM-heavy codebases.
Overfetching - SELECT * on wide tables drags TOASTed columns and blocks index-only scans. Name the columns.
OFFSET pagination on deep pages - OFFSET 100000 reads and discards 100k rows. Use keyset pagination (WHERE id > $last LIMIT 50) instead.

Fix C: give the planner fresh statistics

PostgreSQL plans queries using sampled statistics that autovacuum refreshes opportunistically. After bulk loads, mass deletes, or on very fast-growing tables, those samples lag reality and the planner picks catastrophic plans with full confidence. ANALYZE is instant and safe to run any time; if a table's stats lag chronically, lower its autovacuum_analyze_scale_factor.

Make it a habit, not a heroic

The teams with fast databases aren't smarter - they just look at the top-10 query list monthly and after every feature launch, when a regression is one commit old instead of one quarter old. A managed platform should make this loop effortless: Swyftstack's managed PostgreSQL surfaces query rate, active connections, and a slow query log on the dashboard, so step 1 is a glance instead of an extension install. And because it's standard PostgreSQL, every EXPLAIN technique in this post works unchanged.

Reading a real plan, line by line

Here is the kind of output that intimidates people, trimmed to the parts that matter. Suppose the orders query above produces this plan fragment:

sql

Nested Loop  (actual time=0.4..5205.1 rows=1842 loops=1)
  -> Seq Scan on orders o
       (actual time=0.2..5100.7 rows=1842 loops=1)
       Filter: (status = 'paid' AND created_at > ...)
       Rows Removed by Filter: 4,318,240
  -> Index Scan using customers_pkey on customers c
       (actual time=0.05..0.05 rows=1 loops=1842)

Three readings jump out once you know where to look. The Seq Scan on orders consumed 5.1 of the 5.2 total seconds - that line IS the problem; everything else is noise. 'Rows Removed by Filter: 4.3 million' tells you the scan read the whole table to keep 1,842 rows - a selectivity of 0.04%, which is exactly the profile an index exists for. And the customers side is already healthy: an index scan executed 1,842 times at 0.05 ms each is the nested loop working as designed. The fix writes itself - the composite index on (status, created_at) from Fix A - and the re-run plan should show an Index Scan on orders with the total time dropping from seconds to single-digit milliseconds.

This is the general skill: do not read plans top to bottom like prose. Find the node that owns the wall-clock time, check whether its actual row count embarrasses its estimate, and ask what access path would have served its filter. Ninety percent of plan-reading is those three questions applied to one node.

One index, one N+1 fix, and one ANALYZE resolve the overwhelming majority of slow-query incidents. The skill is knowing which of the three you're looking at - and EXPLAIN ANALYZE tells you.

Summary

EXPLAIN ANALYZE runs the query and reports real timings per plan node. Find the node that owns the time, check whether its actual rows embarrass the estimate, and ask what access path its filter needed. One index, one N+1 fix, or one ANALYZE resolves most incidents.

Next steps

Managed Postgres with a slow-query log

Frequently asked questions

What is the difference between EXPLAIN and EXPLAIN ANALYZE?

EXPLAIN shows the plan PostgreSQL intends to use, with cost estimates. EXPLAIN ANALYZE actually executes the query and reports real row counts and timings per plan node. Always use ANALYZE when diagnosing (on a replica or with care for writes - it really runs the query).

Why does PostgreSQL ignore my index?

Common reasons: the query wraps the indexed column in a function or cast (lower(email) cannot use a plain index on email), the planner estimates the query will return a large fraction of the table (sequential scan is genuinely cheaper), or table statistics are stale - run ANALYZE and re-check.

Is SELECT * actually bad for performance?

It can be. Fetching columns you don't need inflates I/O and memory, and it prevents index-only scans - where PostgreSQL answers the query entirely from the index without touching the table. Select the columns you use, especially on wide tables.

How slow is slow enough to care about?

Judge by total impact, not per-query time. A 40 ms query executed 50 times per page load matters more than a 2-second nightly report. pg_stat_statements' total_exec_time column ranks queries by exactly this.