CodeGym /Courses /SQL SELF /Data Integrity Check

Data Integrity Check

SQL SELF
Level 20 , Lesson 2
Available

Today we’re gonna dive into how foreign keys help us keep our data solid and prevent those classic issues with inconsistent or just plain wrong data.

First off, let’s break down what we actually mean by "data integrity." Imagine you’ve got an orders table (orders) and a customers table (customers). If there’s an order with a customer that doesn’t exist in the customers table, that’s a data integrity fail. It’s super important that all the data in related tables makes logical sense together.

Data integrity means:

  • No "empty" references: if we’re pointing to something in another table, that "something" always exists.
  • Resistant to modification mistakes: if we delete a value from a table that’s referenced elsewhere, the database should warn us or handle it properly.

This is exactly why PostgreSQL uses foreign keys.

How do foreign keys keep data integrity?

When you set up a foreign key in a table, PostgreSQL automatically checks:

  1. Presence of data in the parent table. Before inserting or updating a record, PostgreSQL checks if the referenced foreign key actually exists in the related table.
  2. Deleting or changing data. Before deleting or updating a record in the parent table, PostgreSQL checks if there are any records in the child table pointing to it.

Foreign keys are like a "bouncer" for your data. They won’t let bad data through and make sure your tables play by the rules you set up.

Example: Data integrity in student and course tables

Let’s say we’ve got two tables — students and courses. Each student can sign up for several courses. To show this relationship, we use an enrollments table. Let’s see what happens if someone tries to enroll a student in a course that doesn’t exist.

Step 1. Let’s create three related tables:

CREATE TABLE students (
    student_id SERIAL PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE courses (
    course_id SERIAL PRIMARY KEY,
    title TEXT NOT NULL
);

CREATE TABLE enrollments (
    enrollment_id SERIAL PRIMARY KEY,
    student_id INT REFERENCES students(student_id),
    course_id INT REFERENCES courses(course_id)
);

Here’s what’s up:

  • In the enrollments table, we clearly set foreign keys student_id and course_id that point to the primary keys in students and courses.

Common data integrity checks

  1. Check when inserting data

If you try to insert a record into enrollments with a student_id or course_id that doesn’t exist, you’ll get an error.

Example:

INSERT INTO enrollments (student_id, course_id)
VALUES (999, 1); -- Error! Student with ID 999 doesn’t exist.

Error message:

ERROR:  insert or update on table "enrollments" violates foreign key constraint "enrollments_student_id_fkey"
DETAIL:  Key (student_id)=(999) is not present in table "students".
  1. Check when deleting data

Let’s try deleting a record from the parent table that’s still referenced.

Example:

INSERT INTO students (name) VALUES ('Alice');
INSERT INTO courses (title) VALUES ('Mathematics');

INSERT INTO enrollments (student_id, course_id)
VALUES (1, 1); -- Successful insert

DELETE FROM students WHERE student_id = 1; -- Error, because the student is still enrolled in a course!

Error message:

ERROR:  update or delete on table "students" violates foreign key constraint "enrollments_student_id_fkey" on table "enrollments"
DETAIL:  Key (student_id)=(1) is still referenced from table "enrollments".

To delete records properly in these cases, we use strategies like CASCADE, SET NULL, or RESTRICT, which we talked about earlier.

Examples of using foreign keys for integrity checks

Example 1: Automatic protection from bad data

With foreign keys, PostgreSQL automatically blocks you from inserting "nonexistent" data:

-- Let’s try to add students that don’t exist to a course:
INSERT INTO enrollments (student_id, course_id)
VALUES (42, 1); -- Error! Student with ID 42 doesn’t exist.

This makes sure a student can’t enroll in a course if they’re not in the students table.

Example 2: Deleting data with ON DELETE CASCADE

If your foreign key is set up with ON DELETE CASCADE, deleting a record in the parent table will also delete the related data in the child table.

ALTER TABLE enrollments DROP CONSTRAINT enrollments_student_id_fkey; -- Drop the old foreign key

ALTER TABLE enrollments
ADD CONSTRAINT enrollments_student_id_fkey FOREIGN KEY (student_id)
REFERENCES students(student_id) ON DELETE CASCADE;

DELETE FROM students WHERE student_id = 1; -- Now records in enrollments will be deleted too

Example 3: Handling changes with ON UPDATE

If your foreign key is set up with ON UPDATE CASCADE, when you change a value in the parent table, PostgreSQL will automatically update the child table too.

-- Set up the foreign key so parent key changes automatically update the child table:
ALTER TABLE enrollments DROP CONSTRAINT enrollments_student_id_fkey;

ALTER TABLE enrollments
ADD CONSTRAINT enrollments_student_id_fkey FOREIGN KEY (student_id)
REFERENCES students(student_id) ON UPDATE CASCADE;

-- Change the student’s ID:
UPDATE students SET student_id = 10 WHERE student_id = 1;

-- Now student_id in enrollments will also update to 10.

Testing data integrity

It’s always a good idea to test how your foreign key settings behave in different scenarios:

  1. Try inserting data with a wrong student_id or course_id.
  2. Delete data from students and see what happens in enrollments.
  3. Update data in students and make sure related records are updated too.

Things to watch out for with foreign keys

Sometimes you’ll run into situations that can trip you up:

  • Missing index. If the parent table (like students) doesn’t have an indexed column being referenced, PostgreSQL might "try" to work slower. So, always make sure the primary key in the parent table is indexed.
  • Cyclic references. If two tables reference each other, it can make inserting data tricky. You’ll need to plan your design more carefully in those cases.
  • Deleting all records. If you need to delete everything with cascading deletes, pay attention to what’s in the child table so you don’t get any surprises.

To dodge these issues, make sure you design your tables smartly and test your relationship rules before rolling them out in a real database.

2
Task
SQL SELF, level 20, lesson 2
Locked
Checking for forbidden references when inserting into a table
Checking for forbidden references when inserting into a table
2
Task
SQL SELF, level 20, lesson 2
Locked
Checking the restriction on deleting referenced data
Checking the restriction on deleting referenced data
Comments
TO VIEW ALL COMMENTS OR TO MAKE A COMMENT,
GO TO FULL VERSION