Postgres Internals Hiding in Plain Sight
Postgres Internals Hiding in Plain Sight is more than just a title- it reflects the hidden mechanisms within PostgreSQL that silently drive performance, reliability, and scalability behind every query you run.
1. The Architecture: Processes and Memory (The Obvious Invisible)
When you run ps aux | grep postgres, you see processes. But what are they actually doing?
The Postmaster (The Silent Guardian)
-
What it is: The first process. It listens on the network port (5432) and forks child processes.
-
Hidden in plain sight: It never touches data. If it crashes, the entire instance crashes. Most people don’t realize that
pg_ctl restartkills the postmaster, which then reaps all backends. -
The secret: The postmaster also manages the
pg_ctlstatus check via a.pidfile. Deleting that file while the postmaster is running tricks the system into thinking it’s offline.
The Background Writer (The Invisible Janitor)
-
What it does: Periodically writes dirty shared buffers to disk.
-
Hidden in plain sight: You never call it. But if it falls behind, your
SELECTqueries slow down because the foreground process (backend) becomes the writer. -
The knob:
bgwriter_delay(default 200ms). Most people ignore it until a checkpoint storm happens.
The WAL Writer (The Unsung Hero)
-
What it does: Flushes Write-Ahead Log (WAL) buffers to disk.
-
Hidden in plain sight: Your
COMMITis not truly durable until the WAL writer (or a backend) flushes to the OS. Thesynchronous_commit = offtrick works because you’re allowing a delay in this writer.
2. The System Catalogs: Your Database’s Autobiography
Every piece of schema, table, column, and index is just a row in a table. Those tables are in the pg_catalog schema.
pg_class – The Table of All Tables
-
Hidden fact: Your table
usershas an entry inpg_classwithrelkind = 'r'. But so does every index, sequence, view, and materialized view. -
The trick:
SELECT reltuples, relpages FROM pg_class WHERE relname = 'users';– This gives you an estimate of row count and disk blocks.reltuplesis only updated byANALYZEor auto-vacuum. Many rely onCOUNT(*)(slow) when this is right there (fast, but approximate).
pg_stat_all_tables – The Activity Dashboard
-
Hidden in plain sight:
seq_scanandseq_tup_readtell you when a full table scan happened. Ifseq_scanis high on a large table with an index, you have a problem. -
The gem:
n_live_tupvsn_dead_tup. If dead tuples exceed live ones, you have a bloat disaster waiting to happen. Most monitoring tools show this, but few developers look directly.
pg_locks – The Silent Contention Map
-
Hidden in plain sight: Run
SELECT * FROM pg_locks WHERE granted = false;– This shows every query currently waiting on a lock. Most people only see the “waiting” inpg_stat_activity. Butpg_locksshows the type of lock (row share, exclusive, advisory, etc.). -
The secret: Advisory locks (
SELECT pg_advisory_lock(123)) are stored here. They are not tied to any row, but they block exactly like row locks.
3. MVCC: The Time Machine Hiding in Every Row
Multi-Version Concurrency Control (MVCC) is the crown jewel. It allows readers to never block writers. But the implementation is hiding in every row header.
The Hidden Columns (xmin, xmax, cmin, cmax)
Every row has invisible system columns:
-
xmin– The transaction ID that inserted this row. -
xmax– The transaction ID that deleted or updated this row (0 if active). -
cmin/cmax– Command IDs within the transaction.
Example:
SELECT xmin, xmax, * FROM users;
-
Hidden in plain sight: You see two rows after an update, but you don’t see that the old row has
xmax= current transaction, and the new row hasxmin= current transaction. -
The implication: A
DELETEdoesn’t remove data. It just setsxmax. The row is “dead” and vacuum will clean it later.
The Free Space Map (FSM) and Visibility Map (VM)
-
FSM (
_fsmfile): Hides right next to your table file. It tracks empty space in pages.INSERTuses it to find where to put new rows without scanning the whole table. -
VM (
_vmfile): Hides which pages contain only tuples visible to all transactions. AnINDEX ONLY SCANuses this to avoid checking the heap. If the VM is out of date, index-only scans become a lie (they hit the heap anyway).
4. TOAST: The Giant That Lives in the Attic
The Oversized-Attribute Storage Technique (TOAST) handles large fields (text, jsonb, bytea).
The Breaking Point: 2kB
-
Hidden in plain sight: When a row exceeds ~2kB (eight 8kB pages), Postgres silently moves large columns to a separate
pg_toasttable. -
The surprise:
SELECT * FROM huge_text_tablewhere each row has a 10MB JSON blob – you think you’re fetching one row, but Postgres is actually doing a hidden join to the TOAST table. -
The diagnostic:
SELECT reltoastrelid FROM pg_class WHERE relname = 'your_table';– This shows you the OID of the hidden TOAST table. Query it directly:SELECT * FROM pg_toast.pg_toast_12345;
TOAST Compression (pglz or lz4)
-
Hidden in plain sight: By default, large text is compressed before being stored. You pay CPU on insert and select to compress/decompress. Most developers never know until they benchmark.
-
The check:
SELECT attstorage FROM pg_attribute WHERE attname = 'big_column';– ‘x’ means extended (compress + TOAST), ‘e’ means external (no compress, just TOAST).
5. Index Internals: The Map That Hides Its Legend
B-Tree: Not Just a Tree
-
Hidden in plain sight: B-Tree indexes in Postgres include duplicate keys using a posting list or posting tree. The classic “index scan” you learned in CS class is a lie; Postgres has optimized for high duplicates.
-
The meta page: Every index has a meta page (block 0).
SELECT * FROM bt_metap('users_pkey');– This shows the root block, version, and level count. Most people don’t know this function exists.
GIN (Generalized Inverted Index)
-
Used for:
tsvector,jsonb, arrays. -
Hidden in plain sight: A GIN index doesn’t store keys directly to heap tuples. It stores them to a posting tree that points to heap. Updating one element in a
jsonbcolumn marks the entire index entry for the whole row as dead. That’s why write-heavy JSON columns bloat GIN indexes so fast. -
The fix hidden in plain sight:
REINDEXorVACUUMhas a special GIN cleanup mode.gin_pending_list_limitcontrols the trade-off.
BRIN (Block Range Index)
-
For massive tables (e.g., logs).
-
Hidden in plain sight: It stores only the min/max value per block range (default 128 pages). It’s tiny and fast, but it returns false positives.
-
The secret:
SELECT * FROM brin_summarize_range('idx', 0);– You can manually summarize ranges. Most people think it’s automatic, but it happens only during vacuum.
6. Hiding in the Logs: The WAL Internals
Write-Ahead Logging is not just crash recovery. It’s a full audit trail.
LSN (Log Sequence Number)
-
What it is: A pointer to a byte offset in the WAL.
-
Hidden in plain sight:
SELECT pg_current_wal_lsn();– You can get the current LSN. Every page in shared buffers has an LSN. If the page’s LSN > WAL’s LSN, the page is dirty. -
The replication trick:
SELECT pg_wal_replay_pause();– You can pause replication at an exact LSN, examine the replica, then resume. Most people don’t know they can do this without breaking replication.
WAL Files: The Hidden Time Series
-
Location:
pg_waldirectory (formerlypg_xlog). -
Hidden in plain sight: Each file is exactly 16MB (by default). They are recycled, not deleted. After a checkpoint, old WALs are renamed, not removed. You can run out of disk space even if you think you have no logs.
-
The diagnostic:
SELECT * FROM pg_ls_waldir();– Shows you every WAL file with size and modification time.
7. The Query Planner’s Hidden Assumptions
The planner uses statistics, but those statistics are often stale or wrong.
n_distinct
-
Hidden in plain sight:
pg_stats.n_distincttells the planner how many unique values in a column. If negative (e.g., -0.1), it means “10% unique” – an estimate. If it’s wrong, the planner chooses a sequential scan when an index would win. -
The fix:
ALTER TABLE ... SET (n_distinct = 500);– You can manually override the statistic.
Correlation (pg_stats.correlation)
-
What it is: How well the physical row order matches the logical index order (1 = perfect, -1 = reverse, 0 = random).
-
Hidden in plain sight: If
correlationis low (near 0), an index scan will bounce all over the heap and be slower than a sequential scan. The planner knows this and will avoid the index. Developers see a missing index, but the index exists – the correlation is just terrible.
The Cached Plan Trap (Prepared Statements)
-
Hidden in plain sight: Postgres caches query plans for prepared statements. On the 5th execution, it stops re-planning. If your data distribution changes dramatically after the 5th execution, you get a suboptimal plan that is cached.
-
The escape hatch:
DEALLOCATE prep_stmt;or setplan_cache_mode = force_custom_plan;
Practical Takeaways: Seeing What’s Hidden
| You see | What’s hidden | How to see it |
|---|---|---|
| A slow query | Waiting on a lock | pg_locks where granted=false |
| Large table | Bloat from dead tuples | pg_stat_user_tables.n_dead_tup |
| Index scan | Visibility map check | EXPLAIN (ANALYZE, BUFFERS) – look for “Heap Fetches” |
| Row updated | Old row still on disk | pageinspect extension |
| High disk I/O | Checkpoint writes | pg_stat_bgwriter – checkpoints_req |
| Replica lag | Replay LSN | pg_stat_replication |
Conclusion
PostgreSQL’s internals are not secret they are documented in system catalogs, background processes, and subtle behaviors. The “hiding in plain sight” phenomenon occurs because developers learn the SQL interface but never query the database’s self-describing metadata (pg_catalog), ignore the process model, or misunderstand MVCC’s physical reality. Mastering Postgres means learning to ask the database itself: “What are you really doing?” The answer is always there, in plain sight.
If you’re looking to truly understand database performance beyond just writing queries, mastering PostgreSQL internals is a game-changer and that’s exactly where Learnomate Technologies comes in. At Learnomate, we don’t just teach SQL; we help you dive deep into real-world concepts like MVCC, WAL, query planning, and performance tuning, ensuring you gain practical, job-ready skills. Whether you’re aiming for a career as a Database Administrator or Data Engineer, our hands-on training, expert mentorship, and interview-focused approach will give you the edge needed to stand out in today’s competitive tech landscape.





