Course Module 4. Working with databases - Lecture: Problems of concurrent transactions

1.1 Introduction

And now the fun begins - the theory of how transactions work. How to keep the system working when you change the same data in different threads? Or do you want to execute one transaction in another? We will begin to look for answers to these questions by studying the isolation of transactions ...

The transaction isolation level is a conditional value that determines the extent to which, as a result of the execution of logically parallel transactions in the DBMS, inconsistent data is allowed. The scale of transaction isolation levels contains a number of values, ranked from lowest to highest; a higher isolation level corresponds to better data consistency, but its use may reduce the number of physically parallel transactions.

Conversely, a lower isolation level allows for more parallel transactions, but reduces data accuracy. Thus, choosing the level of transaction isolation used, the developer of the information system, to a certain extent, provides a choice between the speed of work and ensuring the guaranteed consistency of the data received from the system.

Problems of concurrent access using transactions

When transactions are executed in parallel, the following problems are possible:

lost update - if one data block is changed simultaneously by different transactions, all changes are lost, except for the last one;
"dirty" reading (eng. Dirty read) - reading data added or changed by a transaction, which subsequently will not be confirmed (rolled back);
non-repeatable read (eng. non-repeatable read) - when re-reading within the same transaction, previously read data is changed;
phantom reads - one transaction during its execution several times selects many rows according to the same criteria. Another transaction in between these fetches adds rows or modifies columns of some of the rows used in the first transaction's fetch criteria and ends successfully. As a result, it will turn out that the same selections in the first transaction give different sets of rows.

Consider situations in which these problems may occur.

1.2 Lost update

The situation when, when one data block is changed simultaneously by different transactions, one of the changes is lost.

Suppose there are two transactions running at the same time:

Transaction 1	Transaction 2
UPDATE tbl1 SET f2=f2+20 WHERE f1=1;	UPDATE tbl1 SET f2=f2+25 WHERE f1=1;

In both transactions, the value of the f2 field changes; upon completion, the value of the field must be increased by 45. In fact, the following sequence of actions may occur:

Both transactions simultaneously read the current state of the field. Exact physical concurrency is not required here, it is enough that the second read operation in order is completed before another transaction writes its result.
Both transactions calculate the new field value by adding 20 and 25, respectively, to the previously read value.
Transactions try to write the result of the calculation back to field f2. Since it is physically impossible to perform two writes at the same time, in reality one of the write operations will be performed earlier, the other later. The second write operation will overwrite the result of the first one.

As a result, the value of the f2 field, upon completion of both transactions, may increase not by 45, but by 20 or 25, that is, one of the data-changing transactions will “disappear”.

1.3 "Dirty" reading

Reading data added or modified by a transaction that will later fail to commit (rollback).

Suppose we have two transactions opened by different applications that execute the following SQL statements:

Transaction 1	Transaction 2
UPDATE tbl1 SET f2=f2+1 WHERE f1=1;
	SELECT f2 FROM tbl1 WHERE f1=1;
ROLLBACK WORK;

In transaction 1, the value of field f2 is changed, and then in transaction 2, the value of this field is selected. After that, transaction 1 is rolled back. As a result, the value received by the second transaction will differ from the value stored in the database.

1.4 Non-repeatable reading

The situation when, when re-reading within the same transaction, previously read data turns out to be changed.

Suppose we have two transactions opened by different applications that execute the following SQL statements:

Transaction 1	Transaction 2
	SELECT f2 FROM tbl1 WHERE f1=1;
UPDATE tbl1 SET f2=f2+3 WHERE f1=1;
COMMIT;
	SELECT f2 FROM tbl1 WHERE f1=1;

In transaction 2, the value of field f2 is selected, then in transaction 1, the value of field f2 is changed. If you try again to select a value from field f2 in transaction 2, a different result will be obtained. This situation is especially unacceptable when the data is read in order to partially modify it and write it back to the database.

1.5 Reading "phantoms"

The situation when, during repeated reading within the same transaction, the same selection gives different sets of rows.

Suppose there are two transactions opened by different applications that execute the following SQL statements:

Transaction 1	Transaction 2
	SELECT SUM(f2) FROM tbl1;
INSERT INTO tbl1 (f1,f2) VALUES(15,20);
COMMIT;
	SELECT SUM(f2) FROM tbl1;

Transaction 2 executes an SQL statement that uses all the values of field f2. Then a new row is inserted in transaction 1, causing the re-execution of the SQL statement in transaction 2 to produce a different result. This situation is called phantom reading (phantom reading). It differs from non-repeating reading in that the result of repeated access to data has changed not due to the change / deletion of the data itself, but due to the appearance of new (phantom) data.