Course Module 4. Working with databases - Lecture: CONSTRAINT: database integrity

Database integrity control

Another important thing to know about databases is CONSTRAINS. With the help of constraints, you can control data changes in your tables and maintain their integrity and consistency.

What is data consistency when we talk about a database?

Let's take our online store with employee, product and task tables . We already know that there can be tasks in the task table that are not assigned to anyone: the employee_id of such rows is NULL.

But what happens if there is an entry in the task table with an employee_id equal to, say, 115? After all, we do not have such an employee. We don't have an employee with id = 115 in the employee table. At the same time, a link to an employee with this id is in the task table. This is an example of data inconsistency .

So how do we reconcile these data? Ideally, it would be so that the SQL server, with any data change, controlled all these nuances. And there is such an opportunity, it is called FOREIGN_KEY.

If some column in your table contains not just numbers, but id rows from another table, then this can be specified explicitly.

Adding a FOREIGN KEY

Such a key can be added to the table both at the stage of its creation, and after, using ALTER TABLE. The format is not fundamentally different. We will present both options.

The general form of such a key/rule is:

FOREIGN KEY (column)
  	REFERENCES table(column)

Let's add this key/rule to the task table to ensure that all employee_ids from the table refer to an existing entry in the employee table. This script will look like this:

ALTER TABLE task
      ADD FOREIGN KEY (employee_id)
  	REFERENCES employee(id)

And if we decided to add this rule at the time of creating the task table, then the code would look like this:

CREATE TABLE task (
      id INT,
      name VARCHAR(100),
      employee_id INT,
      deadline DATE,
 
      PRIMARY KEY (id),
  	  FOREIGN KEY (employee_id)  
	      REFERENCES employee(id)
);

By the way, there are situations when the string we refer to has a unique composite key: for example, “Name and year of birth” or “productCatogoryId and productId”. Then FOREIGN KEY can be written like this:

FOREIGN KEY (our_column1, our_column2)
  	REFERENCES table(their_column1, their_column2)

FOREIGN KEY and changing data

Now imagine a situation where we decided to update some data in the employee table and our employee id has changed. What will happen to the data in the task table? That's right, they will become irrelevant, and the integrity of our database will be violated.

To prevent this from happening, you can tell SQL Server to change the employee_id of all rows in all tables that refer to this particular changed id when the id in the employee table changes.

Such scripts are called OnUpdate and OnDelete . What to do if the record id changes, and what to do if the record is deleted?

With the removal, not everything is so simple. If you have dependent objects represented by strings in the database that refer to each other, then a wide variety of behavior scenarios are possible when deleting one object.

Let's say we delete a site user, which means we must delete all his personal correspondence. But it is unlikely that we should remove all of his public comments.

Or an employee quits. It would be strange if he quit and at the same time all the tasks assigned to him disappeared from the database. But if they had remained appointed not by him, it would also have turned out badly. It is more correct to make it so that the employee can quit after reassigning all his tasks to other people.

Here is how we can describe these scenarios using FOREIGN KEY. The general form of such a key/rule is:

FOREIGN KEY (column)
  	REFERENCES table(column)
 	[ON DELETE reference_option]
 	[ON UPDATE reference_option]

What to do in case of deleting (ON DELETE) or changing (ON UPDATE) records? In total, there can be 5 options for the SQL server to act in each of these situations:

#	reference_option	Explanation
1	RESTRICT	Disable action if string references found
2	CASCADE	Change id in dependent rows
3	SET NULL	Set id in dependent rows to NULL
4	NO ACTION	Nothing to do
5	SET DEFAULT x	Set id in dependent sinks to x

Here's how we could modify our task table:

ALTER TABLE task
  	ADD FOREIGN KEY (employee_id)
  	REFERENCES employee(id)
  	ON UPDATE CASCADE
  	ON DELETE RESTRICT;

What is written here:

ON UPDATE CASCADE : If the id key in the employee table changes, then also change the employee_id in the task table that references it.

ON DELETE RESTRICT : If a row is being deleted from the employee table and it is referenced from the task table, then prevent the row from being deleted from the employee table.