CodeGym /Courses /SQL SELF /What is EXPLAIN and How to Use It to Analyz...

What is EXPLAIN and How to Use It to Analyze Queries

SQL SELF
Level 41 , Lesson 2
Available

Imagine you’re building an app, and suddenly one of your queries becomes the most expensive one in your database. You start noticing crashes and slowdowns in your app. That’s when EXPLAIN comes to the rescue, helping you figure out where things went sideways. Optimizing queries based on EXPLAIN analysis lets you save resources, win back time, and make your users a lot happier with your app’s performance.

EXPLAIN is your way to peek inside PostgreSQL and see exactly how the database plans to run your query. It shows whether an index will be used or if it’s just going to do a full table scan, what steps the optimizer will take, in what order, and how big the intermediate results will be.

In other words, EXPLAIN lets you know what to expect from running your query: how “heavy” it is, how many rows it’s expected to process, and what resources will be involved. It’s an essential tool when a query suddenly starts lagging and you need to figure out why.

EXPLAIN is like a flashlight in the dark: with it, you can see what’s happening under the hood and exactly where things are going wrong.

Syntax of the EXPLAIN Command

Let’s check out the basic syntax for the EXPLAIN command:

EXPLAIN your_SQL_query;

Example query:

EXPLAIN SELECT * FROM students WHERE age > 20;

When you run this command, PostgreSQL won’t actually execute the query. Instead, it’ll show you the execution plan. Think of it like a blueprint before construction — it’s super helpful to see what’s going to happen before you break anything.

Here’s a sample output:

Seq Scan on students  (cost=0.00..35.00 rows=7 width=37)
  Filter: (age > 20)

This output might look scary, but don’t worry — we’ll break down the main parts right now.

Breaking Down a Basic Execution Plan

Let’s go through that result:

  1. Seq Scan on students — this means PostgreSQL will scan the students table completely (sequential scan). That’s not always bad, but for big tables, Seq Scan can be slow.

  2. cost=0.00..35.00 — this is the estimated cost of the operation:

    • Startup Cost: the initial cost of the operation (here it’s 0.00).
    • Total Cost: the total cost to finish the operation (here it’s 35.00).
  3. rows=7 — PostgreSQL thinks the condition age > 20 will return 7 rows. This is called “cardinality.” If you see weird estimates here, it might mean your table stats are outdated.

  4. width=37 — this is the average size of one row in bytes.

  5. Filter: (age > 20) — this means PostgreSQL will apply a filter, checking each row.

So, the EXPLAIN output gives you insight into PostgreSQL’s strategies and assumptions. You can use this info to optimize things.

Options for the EXPLAIN Command

While the basic EXPLAIN output is already useful, you can tweak it with these options:

ANALYZE

With this option, PostgreSQL not only shows the execution plan, but actually runs the query and gives you real data. Example:

EXPLAIN ANALYZE SELECT * FROM students WHERE age > 20;

This lets you compare PostgreSQL’s guesses with what really happens, so you can see how close they are.

VERBOSE

Shows extra details, which are handy for deep dives. Example:

EXPLAIN VERBOSE SELECT * FROM students WHERE age > 20;

BUFFERS

Shows memory buffer usage during query execution. Use it together with ANALYZE:

EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM students WHERE age > 20;

COSTS

If you want to hide or show cost info (cost), use this option:

EXPLAIN (COSTS OFF) SELECT * FROM students WHERE age > 20;

FORMAT

The plan output can be shown in other formats, like JSON or XML. Example:

EXPLAIN (FORMAT JSON) SELECT * FROM students WHERE age > 20;

Example of Using EXPLAIN

Let’s look at a university database with a students table. Say you want to find all students older than 20:

EXPLAIN SELECT * FROM students WHERE age > 20;

The output might look like this:

Seq Scan on students  (cost=0.00..35.00 rows=7 width=37)
  Filter: (age > 20)

Like we mentioned, this is a sequential scan (Seq Scan), which can be inefficient for big tables.

Now let’s create an index on the age column and see if the plan changes:

CREATE INDEX age_index ON students(age);

EXPLAIN SELECT * FROM students WHERE age > 20;

Output:

Index Scan using age_index on students  (cost=0.15..4.23 rows=7 width=37)
  Index Cond: (age > 20)

Now PostgreSQL uses an index scan (Index Scan), which is usually faster, especially for big tables.

Common Questions and Mistakes

Why is my query slow even though I have an index?

Maybe your query returns too many rows, making the index less useful. The index might also be low quality or outdated.

What if the EXPLAIN output is hard to understand?

Start with simple queries and study the execution plan nodes one operation at a time.

How do I know if my table stats are outdated?

Run the command ANALYZE students

When should I use EXPLAIN without ANALYZE?

If you want to see the plan without actually running the query (for example, for queries that modify data).

2
Task
SQL SELF, level 41, lesson 2
Locked
Using `EXPLAIN` with Indexing
Using `EXPLAIN` with Indexing
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION