CodeGym /Courses /SQL SELF /Indexing JSONB Data: Using GIN

Indexing JSONB Data: Using GIN and BTREE Indexes

SQL SELF
Level 38 , Lesson 0
Available

First question: why are we even using JSONB? JSONB lets you store data in JSON format, giving you a flexible structure. This is super handy when your data has complex and nested relationships (like user profiles with a list of addresses or settings). Unlike plain JSON, JSONB stores data in a binary format, which makes search and filter operations way faster.

But without indexes, searching through JSONB can be pretty slow, especially if your table has thousands or millions of rows. For example, imagine you have a table with user info, where each user's settings are stored as JSONB. Trying to find all users with a certain value in those settings without indexes is a resource hog. And that's where our indexes come to the rescue!

Indexing JSONB: Key Points

For working with JSONB, PostgreSQL supports two main ways to index:

  1. GIN (Generalized Inverted Index) — for searching keys and values inside JSONB.
  2. BTREE — for simpler searching and sorting.

Each has its own quirks. Let’s break them down a bit more.

GIN Index for JSONB

GIN is a powerful index that works with arrays, texts, and also JSONB data. It "breaks down" the JSONB object into separate keys and values, creating a special structure for fast searching through all that goodness.

Advantages of GIN for JSONB:

  • Lets you search by both keys and values.
  • Works with nested structures.
  • Speeds up operations with @>, ?, ?|, ?& operators (filtering keys and values).

Let’s say we have a users table, where the settings column stores user settings as JSONB. Example data:

CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name TEXT,
    settings JSONB
);

INSERT INTO users (name, settings) VALUES
('Alice', '{"theme": "dark", "notifications": {"email": true, "sms": false}}'),
('Bob', '{"theme": "light", "notifications": {"email": false, "sms": true}}'),
('Charlie', '{"theme": "dark", "notifications": {"email": true, "sms": true}}');

Now we want to quickly find all users with the dark theme (theme: dark). First, let’s create an index:

CREATE INDEX idx_users_settings_gin ON users USING GIN (settings);

Next, run a query with the @> operator (search by value):

SELECT name
FROM users
WHERE settings @> '{"theme": "dark"}';

Now PostgreSQL uses the GIN index for searching, and the query runs way faster.

How does this work? When you create a GIN index on a JSONB column, PostgreSQL builds an "inverted" index, meaning it creates separate records for all the JSON keys and values. For example, from the object:

{"theme": "dark", "notifications": {"email": true, "sms": false}}

it’ll index the keys theme, notifications.email, notifications.sms and their values. This makes searching for individual elements way faster.

BTREE Index for JSONB

BTREE is your classic index type. It’s used if you need to compare whole JSONB objects or do sorting. But, unlike GIN, BTREE doesn’t break down the JSON object’s contents.

Advantages of BTREE for JSONB:

  • Great for sorting and comparing whole objects.
  • Faster if you use JSONB as a "monolith" (like comparing it to another object or searching for rows where JSONB equals a given value).

Here’s an example of using a BTREE index. Suppose in the users table, you often compare the settings column to a specific object:

{"theme": "dark", "notifications": {"email": true, "sms": false}}

First, create the index:

CREATE INDEX idx_users_settings_btree ON users USING BTREE (settings);

Now you can run object comparison queries:

SELECT name
FROM users
WHERE settings = '{"theme": "dark", "notifications": {"email": true, "sms": false}}';

This query will use the BTREE index to speed things up.

Comparing GIN and BTREE

Feature GIN BTREE
Breaks down JSONB object Yep, splits into keys and values Nope, compares as a whole
Search in nested structures Yep Nope
Sorting Nope Yep
Index size Bigger Smaller
Supported operators @>, ?, ?|, ?& =, sorting

So, GIN is great for more complex queries, while BTREE is handy when you need to compare whole objects or sort stuff.

Which Index Should You Pick?

  • If you want to search by individual keys and values inside JSONB, go with GIN.
  • If you need to compare or sort whole JSONB objects, BTREE is your friend.

But remember, nobody says you can’t combine these indexes! For example, you can create both GIN and BTREE indexes on the same field if your table needs both types of queries.

Common Mistakes When Indexing JSONB

Creating unnecessary indexes: It’s not always worth it to index every JSONB field. Indexes take up space and can slow down insert and update operations.

Indexing rarely-used operators: Don’t index a field just because it "feels right." Analyze your queries and only use indexing where it actually speeds things up.

Ignoring GIN quirks: GIN can take longer to build than BTREE. Keep this in mind when indexing big tables.

Practical Use

Working with JSONB is super useful in real-world projects where data is flexible and dynamic. For example:

  • Web apps with user settings.
  • Storing logs that have different fields for different events.
  • Caching data in JSON format.

Indexing this data with GIN and BTREE can seriously boost your query performance. For example, in interviews you can show how you sped up a system by adding indexing for complex data structures.

The official PostgreSQL docs on JSON indexes are available here. Don’t forget to check them out for details and examples.

2
Task
SQL SELF, level 38, lesson 0
Locked
Creating a GIN Index for JSONB Search
Creating a GIN Index for JSONB Search
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION