Picking a data type is like picking the right tool for the job. You wouldn’t use a screwdriver to hammer a nail (I hope). Same thing in a database: the right data type seriously affects performance, memory usage, and how easy it is to manage your data. For example, if you store money as REAL, you might run into accuracy issues, and using TEXT instead of VARCHAR for short strings just wastes memory.
When choosing a data type, keep a few things in mind:
Nature of the data
Figure out what kind of data you’re dealing with: numbers, strings, booleans, dates, or something more complex like JSON structures.
Amount of data
How much data are you planning to store? For example, for texts up to 50 characters, it’s better to use VARCHAR(50) instead of TEXT.
Precision and range
Do you need high precision (like for financial calculations)? Do you want to limit the range of values?
Query frequency and pattern
How often will you access the data? Keep in mind that complex types like JSONB need more resources to process.
Examples of Choosing Data Types
Different data, different needs. Sometimes you need accuracy down to the penny, sometimes you just want everything to fit. Here are some examples of how to pick the right data type for a specific situation.
Financial Data
When it comes to storing money, just like in accounting, mistakes are a no-go. So the best choice is the NUMERIC type, which gives you high precision. For example, let’s say we want a table with these columns:
| Column Name | Data Type | Comment |
|---|---|---|
| id | SERIAL | Primary key |
| amount | NUMERIC(10, 2) | Ten digits, two after the decimal point |
| currency_code | CHAR(3) | ISO currency code, like "USD", "EUR" |
| transaction_date | TIMESTAMP | Transaction time, defaults to current time |
Why not REAL? Because floating point numbers can lose precision, and that’s a big deal for money.
Text Data
If you’re storing usernames, addresses, or any other strings, always set a max length when you can. For example, VARCHAR(50) instead of TEXT. This helps avoid possible errors and saves memory.
| Column Name | Data Type | Comment |
|---|---|---|
| id | SERIAL | Primary key |
| username | VARCHAR(50) | User login, unique and required |
| VARCHAR(255) | Email address | |
| bio | TEXT | Bio, can be a long text |
If the string length is strictly fixed (like two-letter country codes), use CHAR:
| Column Name | Data Type | Comment |
|---|---|---|
| code | CHAR(2) | ISO country code (like "US"), primary key |
| name | VARCHAR(100) | Country name, required |
Timestamp Data
To store event times or schedules, TIMESTAMP is usually the way to go since it holds both date and time.
| Column Name | Data Type | Comment |
|---|---|---|
| id | SERIAL | Primary key |
| event_name | VARCHAR(100) | Event name |
| start_time | TIMESTAMP | Event start time, required |
| end_time | TIMESTAMP | Event end time, required |
If you only need the time without the date, use TIME, and if you only need the date without the time — DATE.
Unique Identifiers
When you need to create IDs that are unique globally, use UUID. For example, to generate a unique transaction ID:
| Column Name | Data Type | Comment |
|---|---|---|
| request_id | UUID | Unique request ID, generated by default with gen_random_uuid() |
| endpoint | VARCHAR(255) | API endpoint address |
| timestamp | TIMESTAMP | Request time, defaults to current time |
JSONB: Complex Structures
When you need to store data with a structure that can change or is complex, like user settings or metadata, use JSONB.
| Column Name | Data Type | Comment |
|---|---|---|
| user_id | SERIAL | Primary key (user ID) |
| preferences | JSONB | User settings in JSON format |
JSONB is super handy, but keep in mind it can slow things down if you’re doing a ton of inserts/updates.
Arrays
Arrays are great if you need to store lists of similar data. For example, a list of tags:
| Column Name | Data Type | Comment |
|---|---|---|
| id | SERIAL | Primary key |
| title | VARCHAR(255) | Article title |
| tags | TEXT[] | Array of tags |
Data Type Overview by Task
| Task Type | Recommended Data Type | Example |
|---|---|---|
| Record identifier | SERIAL, BIGSERIAL, UUID |
id SERIAL PRIMARY KEY |
| Quantity, integers | INTEGER, BIGINT |
quantity INTEGER |
| Financial calculations | NUMERIC |
price NUMERIC(10, 2) |
| Storing strings | VARCHAR(n), TEXT |
username VARCHAR(50) |
| Short fixed strings | CHAR(n) |
status CHAR(1) |
| Dates and time | DATE, TIME, TIMESTAMP |
created_at TIMESTAMP DEFAULT NOW() |
| Unique identifiers | UUID |
id UUID PRIMARY KEY DEFAULT gen_random_uuid() |
| Complex structures (JSON) | JSONB |
metadata JSONB |
| Lists of values | ARRAY |
tags TEXT[] |
| Boolean value | BOOLEAN |
is_active BOOLEAN |
You’ve probably already seen all these types except SERIAL. The thing is, SERIAL is basically just an INTEGER that’s used as a row identifier in tables.
It automatically increases by 1 when you add a new row to the table. So if you add 10 rows, the first row’s id will be 1, the second will be 2, and so on. You’ll learn more about all this in the next lecture :)
GO TO FULL VERSION