When you work with a database, it all seems pretty simple: you add a row, update a record, delete a customer. But behind this simplicity is a complex and well-thought-out mechanism. Where are these data actually stored? How does PostgreSQL manage to not lose anything, even if the server suddenly shuts down?
To figure this out, you need to understand two key things: where the data physically lives (tables, indexes, service info) and how the data protection mechanism works — the transaction log, aka WAL (Write-Ahead Logging).
The entire PostgreSQL database is stored as a set of files in a special directory — data_directory. Usually, it’s located here:
/var/lib/postgresql/17/main
This folder is the heart of the database: it has the tables, indexes, metadata, and configs. The WAL log is here too — the mechanism that takes all changes first. Before any data hits the table on disk, it’s written to WAL. Think of it as a draft where the database records every step, so if something crashes, you can recover everything up to the last operation.
Thanks to this approach, PostgreSQL is super reliable and resilient, even in the most unstable conditions.
Tables
Each table is physically a separate file or a set of files. These files live in the base/ subdirectory. The structure looks like this:
$PGDATA/base/
├── 16384/
│ ├── 12345 ← table
│ ├── 12346 ← index
│ └── ...
16384— this is the internal database identifier (OID).12345— the identifier of a specific table.
If a table is big, PostgreSQL splits it into segments of 1 GB each:
12345
12345.1
12345.2
...
The files don’t store "rows" like in a text CSV — it’s an "8 KB binary page" format.
WAL: Write-Ahead Logging — Not Just a "Log"
Now let’s talk about one of the most important and most misunderstood parts of PostgreSQL — WAL, or Write-Ahead Logging. Even though it’s called a log, WAL is not your typical text log file like error or query logs. It’s a crucial mechanism for data consistency and recovery, working at the low-level file system changes.
WAL isn’t a report of events, it’s a pre-record of all changes PostgreSQL is about to make to your data. This record happens before the actual tables are changed on disk. That’s why it’s called write-ahead — "write in advance".
For example, when you insert a new row into a table, PostgreSQL:
- Does NOT update the table on disk right away — that would be slow and unsafe.
- First writes to WAL that this row will be added.
- Only later, when it’s convenient (like in a background process), the data actually gets written to the table.
It works like a bank check: first you sign it (WAL), and only then the bank updates your account (table). If something goes wrong — you still have the check and can repeat the operation.
WAL Format and Structure
- WAL files are stored in binary format.
- Each file is a strictly ordered stream of operations describing internal changes to data pages, index structures, commits, etc.
- One WAL file has a fixed size — by default, 16 MB.
Important: WAL doesn’t contain "SQL commands" or "table rows" as you know them. It has instructions for the PostgreSQL engine on how to replay changes, page by page.
What Happens If There’s a Crash?
If the PostgreSQL server suddenly shuts down — say, because of a power outage — not all is lost. On the next startup, the database doesn’t freak out, it just loads the last saved "stable" version of the data from disk. Then it grabs the transaction log (WAL), where all the latest changes are, and carefully "rolls them forward" — applies what didn’t make it to the main files yet. In the end, the database is restored to a fully consistent state, like nothing ever happened.
Extra WAL Features
Point-In-Time Recovery (PITR). Keeping WAL files lets you restore the database to any point in time between two full backups.
Streaming Replication. PostgreSQL can send WAL records to another server in real time. This lets you have a hot replica — a database copy that’s synced with the main one.
Incremental Recovery. Together with a full backup, WAL lets you restore just the changes, not copy the whole database again.
Making Binary Backups: pg_basebackup
If you’ve already played with pg_dump, you know it’s great for making logical backups (copying the database structure and data as SQL queries). But what if you need a physical backup? Like, a full mirror copy of all the database files? That’s where pg_basebackup comes in.
pg_basebackup is a tool for making physical copies of PostgreSQL data. It’s especially handy for big databases where you need to manage recovery efficiently. The main advantage of pg_basebackup is that it’s super fast.
Basic pg_basebackup Command Syntax
Using pg_basebackup starts with understanding its command. You run it in your terminal, and the basic form looks like this:
pg_basebackup -D /backup_directory -F tar -z -P
Let’s break down what’s going on here:
-D /backup_directory— tells it where to save your backup files.-F tar— data format. Thetaroption makes an archive file in.tarformat. You can also useplainto create a directory structure of the database files.-z— compresses the backup, which saves disk space. Always nice when your backup takes up less room!-P— shows real-time progress. It’s like an extra shot of confidence: you see the process is running and the server isn’t "frozen".
Example usage:
pg_basebackup -D /backups/university_backup -F tar -z -P
After running the command, you’ll have a backup in .tar format in the /backups/university_backup directory.
Why Use pg_basebackup?
Efficiency: incremental backups mean you don’t duplicate unchanged data, saving both time and space.
Easy to use: the pg_basebackup tool handles all the details for you, including WAL files.
Reliability: thanks to PostgreSQL’s mechanics, pg_basebackup makes exact copies of the whole database, which you can easily restore.
Usage Examples
Now — let’s get practical. Here are real examples of how you can use pg_basebackup to make PostgreSQL database backups. We’ll show how to make a basic backup, add compression, and enable WAL archiving for point-in-time recovery. These commands are great for both beginners and advanced scenarios.
Making a Basic Backup
pg_basebackup -D /backups/full_backup -F tar -z -P
Result: a full database backup in a .tar archive.
Setting Compression and Format
Let’s make a backup with high compression:
pg_basebackup -D /backups/full_backup -F tar -z -Z 9 -P
Here, -Z 9 sets the compression level (max is 9).
WAL Archiving
If you set up WAL archiving, the database can be restored to any point in time. The command to enable WAL backup:
pg_basebackup -D /backups/incremental_backup -F tar -z -P --wal-method=archive
GO TO FULL VERSION