1. PostgreSQL Internal Architecture
The Internal Mechanics of Postgres
PostgreSQL is not a simple 'data store'. It is a sophisticated client-server database system built around strong durability and consistency guarantees. To master it, you must first understand how it handles memory, processes, and the 'contracts' it makes with your data.
🐘 The Multi-Process Model
Unlike MySQL or SQL Server, which are multi-threaded, PostgreSQL uses a process-per-connection model. When a client connects to the database, the 'Postmaster' process forks a new dedicated worker process (called a backend) that serves that one connection until it closes.
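Each process is isolated by the OS, so a bug or memory leak in one worker cannot corrupt another worker's private memory. If a worker does crash, the Postmaster itself survives, but because the crashed process may have left shared memory in an inconsistent state, it terminates the remaining sessions and runs crash recovery before accepting new connections.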
Postgres uses shared memory to communicate between these processes safely, protected by semaphores and spinlocks.
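This model works well on Linux systems, where forking a process is cheap, but every connection still costs a full OS process and its memory. With hundreds or thousands of clients, a connection pooler such as PgBouncer becomes practically mandatory.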
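To make the process-per-connection idea concrete, here is a minimal sketch in Python. It is not how Postgres is implemented (Postgres is written in C). The names backend and serve are hypothetical, and the sketch assumes a Unix-like system where os.fork is available. The parent plays the role of the Postmaster: it forks one child per connection and stays alive when one of them exits abnormally.

```python
import os


def backend(crash: bool) -> None:
    """Hypothetical stand-in for a worker serving one connection."""
    if crash:
        os._exit(1)  # simulate an abnormal exit of this one process
    # A real worker would run the client's queries here.
    os._exit(0)      # normal end of the session


def serve(connections):
    """Fork one child per (conn_id, crash) pair and collect exit codes."""
    pids = {}
    for conn_id, crash in connections:
        pid = os.fork()
        if pid == 0:
            backend(crash)  # child process: never returns
        pids[pid] = conn_id  # parent process: remember the child

    exit_codes = {}
    for pid, conn_id in pids.items():
        _, status = os.waitpid(pid, 0)  # wait for this child to finish
        exit_codes[conn_id] = os.WEXITSTATUS(status)
    return exit_codes


if __name__ == "__main__":
    # Connection 2 "crashes"; the parent still reports on all three.
    print(serve([(1, False), (2, True), (3, False)]))
```

The parent only observes an exit status here. Nothing in this sketch models shared memory, so it says nothing about what the real server must clean up after a crash.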
🛡️ Shared Memory: The 'Shared Buffers'
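Postgres does not rely on the Operating System alone for caching. It reserves its own block of RAM called Shared Buffers, sized by the shared_buffers setting and visible to every worker process. When you query a table, Postgres loads the data 'Pages' (8KB blocks by default) from disk into this RAM, and later reads of the same page are served from memory.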
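The page-caching idea can be sketched as a toy buffer pool. This is an illustration under stated assumptions, not Postgres code: the BufferPool class and the demo function are hypothetical, the pool lives in a single Python process rather than in shared memory, and it evicts the least recently used page for simplicity, whereas Postgres uses a clock-sweep algorithm.

```python
import os
import tempfile
from collections import OrderedDict

PAGE_SIZE = 8192  # 8KB, the default Postgres page size


class BufferPool:
    """Toy page cache with LRU eviction (a simplification)."""

    def __init__(self, path: str, capacity: int) -> None:
        self.path = path
        self.capacity = capacity    # how many pages fit in RAM
        self.pages = OrderedDict()  # page number -> page bytes
        self.hits = 0
        self.misses = 0

    def read_page(self, page_no: int) -> bytes:
        if page_no in self.pages:
            self.hits += 1
            self.pages.move_to_end(page_no)  # mark as most recently used
            return self.pages[page_no]
        self.misses += 1
        with open(self.path, "rb") as f:     # cache miss: go to disk
            f.seek(page_no * PAGE_SIZE)
            data = f.read(PAGE_SIZE)
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # evict least recently used
        self.pages[page_no] = data
        return data


def demo() -> tuple[int, int]:
    """Read pages 0, 0, 1, 2, 0 through a 2-page pool."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(bytes(3 * PAGE_SIZE))  # a 3-page "table" of zero bytes
        path = f.name
    try:
        pool = BufferPool(path, capacity=2)
        for page_no in (0, 0, 1, 2, 0):
            pool.read_page(page_no)
        return pool.hits, pool.misses
    finally:
        os.remove(path)
```

In the demo only the second read of page 0 is a hit. By the time page 0 is requested again it has been evicted to make room for page 2, which is exactly what happens when the working set is larger than the cache.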
Why RAM matters
A random read from a spinning disk takes ~10 milliseconds. Reading from RAM takes ~100 nanoseconds. That is a 100,000x speed difference. SSDs narrow the gap considerably, but RAM still wins by orders of magnitude. This is why a well-tuned Postgres server with enough RAM feels instant.
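The arithmetic is worth working through once. The sketch below uses the two round figures from the text, and the function names speedup and average_read_ns are hypothetical. The second function estimates the average read time when only a fraction of reads (the cache hit ratio) is served from RAM.

```python
DISK_READ_NS = 10_000_000  # ~10 milliseconds, expressed in nanoseconds
RAM_READ_NS = 100          # ~100 nanoseconds


def speedup() -> int:
    """How many times faster a RAM read is than a disk read."""
    return DISK_READ_NS // RAM_READ_NS


def average_read_ns(hit_ratio: float) -> float:
    """Expected read time when hit_ratio of the reads come from RAM."""
    return hit_ratio * RAM_READ_NS + (1.0 - hit_ratio) * DISK_READ_NS


if __name__ == "__main__":
    print(speedup())
    for ratio in (0.90, 0.99, 1.0):
        print(ratio, average_read_ns(ratio))
```

Even at a 99% hit ratio the average read comes out at roughly 0.1 milliseconds, because the 1% of reads that miss dominate the total. Under these assumed figures, the last few percent of cache hits matter far more than the first ninety.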
📜 The Write-Ahead Log (WAL)
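How does Postgres guarantee that your data is safe even if the power goes out? The answer is the WAL. Instead of writing your changes directly to the main data files, which would mean slow random writes scattered across the disk, Postgres first appends a record of each change to a sequential 'Journal' called the WAL and flushes it to disk before reporting the commit as successful. The main data files are updated later, in the background.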
If the server crashes, Postgres reads the WAL on restart, starting from the last checkpoint, and 'replays' any changes that didn't make it to the main files yet. This is the foundation of Durability, the 'D' in ACID.
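The log-first, replay-on-restart pattern can be sketched with a toy key-value store. Everything here is a simplified assumption: TinyStore and crash_and_recover are hypothetical names, the log is a plain text file of JSON lines rather than Postgres's binary WAL format, and the sketch leaves out checkpoints, checksums, and handling of a half-written final record.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self, wal_path: str) -> None:
        self.wal_path = wal_path
        self.data = {}  # in-memory state, lost on a crash
        self._replay()  # rebuild state from the log on startup

    def _replay(self) -> None:
        if not os.path.exists(self.wal_path):
            return
        with open(self.wal_path, "r", encoding="utf-8") as f:
            for line in f:
                record = json.loads(line)
                self.data[record["key"]] = record["value"]

    def put(self, key: str, value: str) -> None:
        record = json.dumps({"key": key, "value": value})
        with open(self.wal_path, "a", encoding="utf-8") as f:
            f.write(record + "\n")  # 1. append to the log
            f.flush()
            os.fsync(f.fileno())    # 2. force it to stable storage
        self.data[key] = value      # 3. only then apply the change


def crash_and_recover() -> dict:
    """Write two changes, drop the in-memory state, reopen from the log."""
    with tempfile.TemporaryDirectory() as d:
        wal = os.path.join(d, "wal.log")
        store = TinyStore(wal)
        store.put("balance", "100")
        store.put("balance", "75")
        del store                   # "crash": in-memory state is gone
        return TinyStore(wal).data  # "restart": replay the log
```

The ordering inside put is the whole point: the change counts as done only after the log record has been flushed, so anything acknowledged to the caller can be reconstructed from the log alone.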