First question: why are we even using JSONB? JSONB lets you store data in JSON format, giving you a flexible structure. This is super handy when your data has complex and nested relationships (like user profiles with a list of addresses or settings). Unlike plain JSON, JSONB stores data in a binary format, which makes search and filter operations way faster.
But without indexes, searching through JSONB can be pretty slow, especially if your table has thousands or millions of rows. For example, imagine you have a table with user info, where each user's settings are stored as JSONB. Trying to find all users with a certain value in those settings without indexes is a resource hog. And that's where our indexes come to the rescue!
Indexing JSONB: Key Points
For working with JSONB, PostgreSQL supports two main ways to index:
- GIN (Generalized Inverted Index) — for searching keys and values inside
JSONB. - BTREE — for simpler searching and sorting.
Each has its own quirks. Let’s break them down a bit more.
GIN Index for JSONB
GIN is a powerful index that works with arrays, texts, and also JSONB data. It "breaks down" the JSONB object into separate keys and values, creating a special structure for fast searching through all that goodness.
Advantages of GIN for JSONB:
- Lets you search by both keys and values.
- Works with nested structures.
- Speeds up operations with
@>,?,?|,?&operators (filtering keys and values).
Let’s say we have a users table, where the settings column stores user settings as JSONB. Example data:
CREATE TABLE users (
id SERIAL PRIMARY KEY,
name TEXT,
settings JSONB
);
INSERT INTO users (name, settings) VALUES
('Alice', '{"theme": "dark", "notifications": {"email": true, "sms": false}}'),
('Bob', '{"theme": "light", "notifications": {"email": false, "sms": true}}'),
('Charlie', '{"theme": "dark", "notifications": {"email": true, "sms": true}}');
Now we want to quickly find all users with the dark theme (theme: dark). First, let’s create an index:
CREATE INDEX idx_users_settings_gin ON users USING GIN (settings);
Next, run a query with the @> operator (search by value):
SELECT name
FROM users
WHERE settings @> '{"theme": "dark"}';
Now PostgreSQL uses the GIN index for searching, and the query runs way faster.
How does this work? When you create a GIN index on a JSONB column, PostgreSQL builds an "inverted" index, meaning it creates separate records for all the JSON keys and values. For example, from the object:
{"theme": "dark", "notifications": {"email": true, "sms": false}}
it’ll index the keys theme, notifications.email, notifications.sms and their values. This makes searching for individual elements way faster.
BTREE Index for JSONB
BTREE is your classic index type. It’s used if you need to compare whole JSONB objects or do sorting. But, unlike GIN, BTREE doesn’t break down the JSON object’s contents.
Advantages of BTREE for JSONB:
- Great for sorting and comparing whole objects.
- Faster if you use
JSONBas a "monolith" (like comparing it to another object or searching for rows whereJSONBequals a given value).
Here’s an example of using a BTREE index. Suppose in the users table, you often compare the settings column to a specific object:
{"theme": "dark", "notifications": {"email": true, "sms": false}}
First, create the index:
CREATE INDEX idx_users_settings_btree ON users USING BTREE (settings);
Now you can run object comparison queries:
SELECT name
FROM users
WHERE settings = '{"theme": "dark", "notifications": {"email": true, "sms": false}}';
This query will use the BTREE index to speed things up.
Comparing GIN and BTREE
| Feature | GIN |
BTREE |
|---|---|---|
Breaks down JSONB object |
Yep, splits into keys and values | Nope, compares as a whole |
| Search in nested structures | Yep | Nope |
| Sorting | Nope | Yep |
| Index size | Bigger | Smaller |
| Supported operators | @>, ?, ?|, ?& |
=, sorting |
So, GIN is great for more complex queries, while BTREE is handy when you need to compare whole objects or sort stuff.
Which Index Should You Pick?
- If you want to search by individual keys and values inside
JSONB, go withGIN. - If you need to compare or sort whole
JSONBobjects,BTREEis your friend.
But remember, nobody says you can’t combine these indexes! For example, you can create both GIN and BTREE indexes on the same field if your table needs both types of queries.
Common Mistakes When Indexing JSONB
Creating unnecessary indexes: It’s not always worth it to index every JSONB field. Indexes take up space and can slow down insert and update operations.
Indexing rarely-used operators: Don’t index a field just because it "feels right." Analyze your queries and only use indexing where it actually speeds things up.
Ignoring GIN quirks: GIN can take longer to build than BTREE. Keep this in mind when indexing big tables.
Practical Use
Working with JSONB is super useful in real-world projects where data is flexible and dynamic. For example:
- Web apps with user settings.
- Storing logs that have different fields for different events.
- Caching data in JSON format.
Indexing this data with GIN and BTREE can seriously boost your query performance. For example, in interviews you can show how you sped up a system by adding indexing for complex data structures.
The official PostgreSQL docs on JSON indexes are available here. Don’t forget to check them out for details and examples.
GO TO FULL VERSION