CodeGym /Courses /SQL SELF /Self-joining Data with SELF JOIN

Self-joining Data with SELF JOIN

SQL SELF
Level 12 , Lesson 3
Available

SELF JOIN is a way to join a table to itself. Sounds kinda weird at first: why would you want to join a table to itself? But honestly, this comes up a lot in real life. For example, imagine you have an employees table, and every employee has a manager. The manager is also an employee, so their info is in the same table. SELF JOIN helps us match up employees with their managers.

Formally speaking, SELF JOIN is just a regular JOIN, but you use the same table twice, giving it different aliases so you can tell the "versions" apart.

You can use SELF JOIN for:

  • Hierarchical relationships: setting up "parent-child" relationships, like employee and their manager.
  • Analyzing data inside a table: comparing records in the table, like finding similar products or events.
  • Complex queries on structures with repeating logic.

Syntax of SELF JOIN

Let’s make it clear with a simple SELF JOIN syntax example:

SELECT
    A.column_name,
    B.column_name
FROM 
    table_name A
JOIN 
    table_name B
ON 
    A.common_column = B.common_column;

Here:

  • table_name A and table_name B are the same table, just with different aliases.
  • A.common_column and B.common_column are the columns you use to join the records.

Aliases (A and B) are needed so the DBMS "knows" which "copy" of the table you’re working with.

Examples of Using SELF JOIN

Example 1: List of Employees and Their Managers

So, let’s say we have an employees table that looks like this:

employee_id name manager_id
1 Alex Lin NULL
2 Maria Chi 1
3 Otto Song 1
4 Nina Zhao 2

In this table:

  • employee_id — the employee’s ID.
  • name — the employee’s name.
  • manager_id — the manager’s ID, which is also the employee_id of another employee.

Goal: get a list of employees and their managers.

Here’s how you do it with SELF JOIN:

SELECT
    e.name AS employee_name,
    m.name AS manager_name
FROM 
    employees e
LEFT JOIN 
    employees m
ON 
    e.manager_id = m.employee_id;

Result:

employee_name manager_name
Alex Lin NULL
Maria Chi Alex Lin
Otto Song Alex Lin
Nina Zhao Maria Chi

Here:

  • Table e is "employees".
  • Table m is also "employees", but in the role of "managers".

Alex Lin doesn’t have a manager (manager_id = NULL), so the manager_name field is empty in the result.

Example 2: Finding Similar Products

Let’s say we have a products table:

product_id product_name category
1 Refrigerator Tech
2 Washing Machine Tech
3 Smartphone Gadgets
4 Tablet Gadgets

Goal: find pairs of products that belong to the same category.

Solution using SELF JOIN:

SELECT
    p1.product_name AS product_1,
    p2.product_name AS product_2
FROM 
    products p1
JOIN 
    products p2
ON 
    p1.category = p2.category
AND 
    p1.product_id < p2.product_id;

Result:

product_1 product_2
Refrigerator Washing Machine
Smartphone Tablet

Notice the condition p1.product_id < p2.product_id. This helps avoid duplicate pairs, so you don’t get both Refrigerator — Washing Machine and Washing Machine — Refrigerator in the results.

Example 3: Analyzing Hierarchies (Parents and Children)

Let’s look at another example. Suppose we have a categories table:

category_id category_name parent_id
1 Tech NULL
2 Gadgets 1
3 Computers 1
4 Smartphones 2

Here:

  • category_id — the category’s ID.
  • category_name — the category’s name.
  • parent_id — the parent category’s ID.

Goal: match up categories and their parent categories.

Query:

SELECT
    c1.category_name AS child_category,
    c2.category_name AS parent_category
FROM 
    categories c1
LEFT JOIN 
    categories c2
ON 
    c1.parent_id = c2.category_id;

Result:

child_category parent_category
Tech NULL
Gadgets Tech
Computers Tech
Smartphones Gadgets

Common Mistakes When Using SELF JOIN

Forgot to use aliases: if you don’t use aliases, it’s impossible to tell one "copy" of the table from the other.

Cyclic references: if your data has cyclic references (like an employee is their own manager), this can lead to weird results.

Duplicate results: don’t forget to filter results (for example, with p1.product_id < p2.product_id) to avoid duplicates.

SELF JOIN is a powerful tool for working with data, and using it right lets you solve tricky problems with hierarchies, relationships inside tables, and data analysis. Hope you see now how useful this "selfie" is in the SQL universe!

2
Task
SQL SELF, level 12, lesson 3
Locked
Self-join to find employees' managers
Self-join to find employees' managers
2
Task
SQL SELF, level 12, lesson 3
Locked
Self-join to find related products
Self-join to find related products
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION