Your Postgres Autovacuum Is Probably Misconfigured
Here’s a pattern I’ve seen dozens of times. A startup’s Postgres database has been running fine for months — queries are fast, the team is shipping features, nobody thinks about the database. Then one morning, queries that used to take 20ms are taking 200ms. The engineering team checks CPU and memory utilization, reviews the slow query log, looks at connection counts. Nothing obvious has changed. But when someone finally looks at pg_stat_user_tables, the story becomes clear: dead tuple counts in the millions, last_autovacuum timestamps from hours ago, and table sizes that have doubled despite no meaningful data growth.
The other version of this story: an alert fires because the database is approaching transaction ID wraparound. This one usually comes with a side of panic, because if you hit the wraparound limit, Postgres will refuse to process new write transactions to protect data integrity. I’ve gotten more than a few emergency calls that started with this alert.
If either scenario sounds familiar, your autovacuum is almost certainly misconfigured. The good news: the fixes are straightforward, and you can diagnose the problem in about five minutes.
Why Autovacuum Exists
Postgres uses MVCC to handle concurrent access. When you UPDATE a row, Postgres doesn’t modify it in place — it creates a new version and marks the old one as dead. DELETE works the same way: the row is marked dead but not removed. These dead tuples accumulate over time, wasting disk space and forcing scans to wade through rows that no transaction will ever need again.
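A minimal sketch to make this concrete, using a throwaway table name of my own (run it against a scratch database; the statistics views are updated asynchronously, so the counts may take a moment to appear):
-- Hypothetical scratch table to illustrate dead tuple creation.
CREATE TABLE mvcc_demo (id int PRIMARY KEY, val int);
INSERT INTO mvcc_demo SELECT g, 0 FROM generate_series(1, 100000) g;
-- Every updated row leaves its old version behind as a dead tuple.
UPDATE mvcc_demo SET val = 1;
-- Expect n_dead_tup to be roughly 100,000 once the stats have flushed.
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'mvcc_demo';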
Autovacuum is the background process that cleans up dead tuples, updates the visibility map (which enables index-only scans), and freezes old transaction IDs to prevent wraparound. This isn’t optional maintenance. When autovacuum falls behind, table bloat grows, query performance degrades, and in the worst case, you’re facing a wraparound emergency.
The core problem is that autovacuum’s default configuration is conservative. It was tuned to be unobtrusive on small databases, not to keep up with the write throughput of a production SaaS application handling thousands of transactions per second. And on AWS RDS, these defaults sit unchanged unless you’ve explicitly modified your parameter group.
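A quick way to check whether you are still on the defaults is to read them straight from pg_settings; the source column shows whether a value is still the compiled-in default or has been overridden:
SELECT name, setting, unit, source
FROM pg_settings
WHERE name LIKE 'autovacuum%'
    OR name IN ('maintenance_work_mem', 'vacuum_cost_limit', 'vacuum_cost_delay')
ORDER BY name;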
Is Autovacuum Your Problem?
Run these queries. They’ll tell you in under a minute whether autovacuum is keeping up.
Dead tuple accumulation. This is the first thing to check: if dead tuples are piling up faster than vacuum can clean them, everything downstream suffers.
SELECT
    schemaname || '.' || relname AS table,
    n_live_tup,
    n_dead_tup,
    round(n_dead_tup::numeric / nullif(n_live_tup, 0) * 100, 1) AS dead_pct,
    last_autovacuum,
    last_autoanalyze
FROM pg_stat_user_tables
WHERE n_dead_tup > 1000
ORDER BY n_dead_tup DESC
LIMIT 20;
If dead_pct is above 10% on any actively queried table, autovacuum is falling behind. If last_autovacuum is NULL or several hours old on a heavily written table, that's a clear red flag. On a high-throughput table, you should see autovacuum completing every few minutes.
Autovacuum in flight. Check what vacuum is doing right now and whether it’s struggling.
SELECT
    pid,
    datname,
    relid::regclass AS table_name,
    phase,
    heap_blks_total,
    heap_blks_scanned,
    heap_blks_vacuumed,
    index_vacuum_count,
    max_dead_tuples,
    num_dead_tuples
FROM pg_stat_progress_vacuum;
Two things matter here. First, index_vacuum_count — if it’s greater than 1, vacuum has already done multiple passes over the table’s indexes because it ran out of memory for dead tuple IDs. This means maintenance_work_mem is too small, and vacuum is doing dramatically more work than necessary. Second, check if num_dead_tuples equals max_dead_tuples. If the buffer is full and vacuum is cycling back to “scanning heap” after “vacuuming indexes,” you’re watching multi-pass vacuum in action.
Transaction ID wraparound risk. This is the one that can take your database offline.
SELECT
    c.oid::regclass AS table_name,
    age(c.relfrozenxid) AS xid_age,
    pg_size_pretty(pg_table_size(c.oid)) AS table_size
FROM pg_class c
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.relkind = 'r'
    AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY age(c.relfrozenxid) DESC
LIMIT 20;
The xid_age column shows how close each table is to needing a transaction ID freeze. The default autovacuum_freeze_max_age is 200 million; once a table passes that age, autovacuum forces an anti-wraparound vacuum on it, so in a healthy system tables stay well below that mark. If you see values far above 200 million, autovacuum can't keep up with its freeze obligations. If anything is approaching 1 billion, treat it as an emergency: as the oldest XID age closes in on roughly 2 billion, Postgres stops accepting new write transactions to protect data integrity.
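For a database-wide view of the same risk, check the age of each database's oldest unfrozen transaction ID:
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;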
The Usual Culprits
Once you’ve confirmed autovacuum isn’t keeping up, here’s where to look. I’ve ordered these by how frequently they’re the root cause in environments I’ve worked on.
autovacuum_vacuum_cost_delay and autovacuum_vacuum_cost_limit
This is the single most impactful tuning knob for autovacuum throughput, and it’s the first thing I check on every engagement.
Autovacuum uses a cost-based throttling mechanism to avoid saturating your disk I/O. As it processes pages, it accumulates cost points. When the accumulated cost reaches autovacuum_vacuum_cost_limit (default -1, which falls back to vacuum_cost_limit's default of 200), autovacuum sleeps for autovacuum_vacuum_cost_delay milliseconds before continuing.
The default cost delay is 2ms (reduced from 20ms in PostgreSQL 12, which gives you a sense of how conservative the old value was). Combined with a cost limit of 200, the result is autovacuum that’s dramatically throttled on any modern SSD-backed system — which includes every RDS instance you’re likely running.
Set autovacuum_vacuum_cost_delay to 0 and increase autovacuum_vacuum_cost_limit to at least 1000, preferably 2000. A cost delay of 0 means vacuum runs without artificial pauses, limited only by actual I/O throughput. On RDS, both are parameter group changes that take effect dynamically — no reboot required.
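On a self-managed instance, a minimal sketch of the same change:
-- Remove the artificial sleep and raise the cost budget.
ALTER SYSTEM SET autovacuum_vacuum_cost_delay = 0;
ALTER SYSTEM SET autovacuum_vacuum_cost_limit = 2000;
-- Both parameters apply on reload; no restart needed.
SELECT pg_reload_conf();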
I’ve seen this single change clear dead tuple backlogs from millions to near-zero within hours on databases where autovacuum had been struggling for weeks.
maintenance_work_mem
When vacuum runs, it collects dead tuple IDs in memory, then uses that list to clean up index entries and reclaim heap space. The maintenance_work_mem parameter controls how much memory is available for this collection. The default is 64MB.
Each dead tuple TID takes six bytes, so 64MB holds roughly 11 million of them; a table with tens of millions of dead tuples can't fit them all at once. When vacuum fills its memory allocation, it stops scanning the heap, vacuums all indexes for the TIDs collected so far, cleans up the corresponding heap tuples, and then resumes scanning the heap from where it left off. Each of these passes requires a full scan of every index on the table. On a table with several large indexes, this turns a 10-minute vacuum into a multi-hour ordeal.
You already know how to spot this: index_vacuum_count greater than 1 in pg_stat_progress_vacuum, or num_dead_tuples hitting max_dead_tuples. The max_dead_tuples value directly reflects how much of maintenance_work_mem is available.
Set maintenance_work_mem to 1GB for most production workloads, or 2GB if you have tables with hundreds of millions of rows. On RDS, this is a parameter group change. Note that this memory is allocated per vacuum worker, so factor in autovacuum_max_workers when sizing — 3 workers at 1GB each means up to 3GB committed to autovacuum at peak.
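On a self-managed instance, a minimal sketch. Two caveats worth knowing: if autovacuum_work_mem is set, autovacuum workers use it instead of maintenance_work_mem, and releases before PostgreSQL 17 cap each worker's dead-tuple ID storage at 1GB regardless of this setting.
-- Applies on reload; each vacuum worker gets its own allocation.
ALTER SYSTEM SET maintenance_work_mem = '1GB';
SELECT pg_reload_conf();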
autovacuum_max_workers
The default of 3 means only 3 tables can be vacuumed concurrently. For a database with dozens or hundreds of actively-written tables, this creates a queue where tables wait their turn and dead tuples accumulate in the meantime.
Here’s the gotcha that catches almost everyone: autovacuum_vacuum_cost_limit is shared across all workers. If you increase autovacuum_max_workers to 6 but leave the cost limit at 200, each worker gets a budget of roughly 33 cost points before sleeping. You’ve doubled the parallelism but halved each worker’s throughput, resulting in approximately the same total vacuum throughput — just spread across more tables that are all progressing slowly.
If you increase workers, increase the cost limit proportionally. Going from 3 to 6 workers? Double the cost limit. My typical recommendation for a busy production database: 5-6 workers with a cost limit of 1500-2000 and a cost delay of 0.
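One way to tell whether the worker pool itself is the bottleneck is to count how many autovacuum workers are running at any given moment; if the count consistently sits at autovacuum_max_workers, tables are queuing behind each other:
SELECT count(*) AS active_autovacuum_workers
FROM pg_stat_activity
WHERE backend_type = 'autovacuum worker';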
Per-Table Autovacuum Settings
Autovacuum triggers when a table’s dead tuple count exceeds autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup. With the defaults of threshold = 50 and scale_factor = 0.2, vacuum triggers after roughly 20% of a table’s live rows are dead. For a table with 50 million rows, that’s 10 million dead tuples before autovacuum even starts — an enormous backlog that will take a long time to work through.
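To see where those defaults put the trigger point for your own tables, a quick sketch (it reads the global settings and ignores any per-table overrides you may already have in place):
SELECT
    relname,
    n_live_tup,
    n_dead_tup,
    round(current_setting('autovacuum_vacuum_threshold')::numeric
          + current_setting('autovacuum_vacuum_scale_factor')::numeric * n_live_tup) AS trigger_point
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;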
For your highest-write tables, override these globally derived thresholds:
ALTER TABLE hot_table SET (
    autovacuum_vacuum_scale_factor = 0.01,
    autovacuum_vacuum_threshold = 1000
);
This triggers vacuum after roughly 1% dead tuples instead of 20%. On that 50-million-row table, vacuum now kicks in at 500,000 dead tuples instead of 10 million. Identify your top 5-10 busiest tables and apply this treatment to each one individually.
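One reasonable way to pick those candidates is to rank tables by write volume since the last statistics reset; a sketch:
SELECT
    relname,
    n_tup_ins + n_tup_upd + n_tup_del AS total_writes,
    n_dead_tup
FROM pg_stat_user_tables
ORDER BY total_writes DESC
LIMIT 10;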
Long-Running Transactions
This is the subtle one, and it trips up teams that have already done everything else right. A single long-running transaction can prevent autovacuum from reclaiming dead tuples across the entire database, no matter how well-tuned the configuration is.
Postgres must retain all tuple versions that might be visible to any active transaction. If a transaction opened 3 hours ago is still open — even if it’s sitting completely idle — Postgres cannot remove any dead tuples created after that transaction’s snapshot was taken. Autovacuum will dutifully run, scan the heap, identify dead tuples, and then discover it can’t actually free them. You’ll see last_autovacuum timestamps updating regularly, but n_dead_tup not dropping.
Common culprits: ORMs that open transactions implicitly and don’t close them, poorly configured connection pools holding idle sessions with open transactions, long-running analytics queries on the primary instead of a replica, and forgotten BEGIN statements in someone’s psql session.
Find them:
SELECT
    pid,
    usename,
    state,
    now() - xact_start AS xact_duration,
    left(query, 80) AS last_query
FROM pg_stat_activity
WHERE state = 'idle in transaction'
    AND xact_start < now() - interval '5 minutes'
ORDER BY xact_start;
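Idle-in-transaction sessions are the most common offenders, but any backend holding an old snapshot has the same effect. A complementary check that also catches long-running active queries is to see which backends are pinning the oldest visibility horizon:
SELECT
    pid,
    usename,
    state,
    age(backend_xmin) AS snapshot_age_xids,
    now() - xact_start AS xact_duration
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 10;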
As a safety net, set idle_in_transaction_session_timeout to a reasonable value. I start with 30 minutes and tighten it once the application-level causes have been addressed.
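A minimal sketch of that safety net, plus the immediate escape hatch for a session that needs to go right now (the pid below is a placeholder taken from the query above):
-- Terminate any session whose transaction has been idle for more than 30 minutes.
ALTER SYSTEM SET idle_in_transaction_session_timeout = '30min';
SELECT pg_reload_conf();
-- For an immediate fix, terminate the offending backend by pid.
SELECT pg_terminate_backend(12345);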
Going Deeper
The parameter changes above will resolve most autovacuum backlogs within hours. But if you find yourself constantly re-tuning these knobs, or if dead tuples drop but table bloat remains high (reclaiming bloated space requires VACUUM FULL or pg_repack — a different conversation), the problem is usually one layer deeper: transaction lifecycle management in your application, write amplification from schema design decisions, connection pooling architecture, or simply outgrowing the instance. Autovacuum tuning is the first layer of investigation. The second layer is understanding why your workload is generating dead tuples at a rate that strains even a well-configured vacuum.
If you’ve applied these changes and your dead tuple counts are dropping, you’re in good shape. If you’re still seeing problems after tuning, or if you’re dealing with wraparound pressure on large tables, the issue is likely structural. Feel free to reach out — [email protected] — and we can dig into it together.