Introduction to Data Types in SQL

Unlike JavaScript, SQL has strong typing. In every table, every column has its own fixed data type.

There are a lot of data types, but unlike the Java language, there are so many of them not because there are data types for all occasions. Databases are very dependent on the size of the data, so many data types differ from each other only in length.

In total, data types can be divided into 5 groups:

  • Numeric types
  • String types
  • Types for storing dates and times
  • Objects: usually represented as a collection of bytes
  • Transport : JSON and XML

Usually different DBMS still have their own data types. Each DBMS has its own specialization, so adding new data types is a very common thing.

Another thing is that just adding a new data type is not enough, you need to add functions that will work with it, as well as make this work convenient and fast.

If you work on any industrial (enterprise) DBMS, then most likely you will have to deal with its data types and its functions. For what it will be necessary to read 2-5 good books.

As part of our acquaintance with SQL, we will consider three main groups of data types:

  • Numbers
  • Strings
  • Dates

Numeric Types in SQL

In SQL, numeric types are divided into three groups:

  • Integer types
  • Fixed point numbers (fixed number of decimal places)
  • Floating point numbers

Let's start with integers . There are only 5 of them, and they can be described in one table:

# Type name Length in bytes Analogue from Java Minimum value Maximum value
1 TINYINT 1 byte -128 127
2 SMALLINT 2 short -32.768 32.767
3 MEDIUMINT 3 -8,388,608 8,388,607
4 INT 4 int -2,147,483,648 2,147,483,647
5 BIGINT 8 long -2 63 2 63-1

The data types are very similar to the Java data types, but there is also an integer type that is three bytes long. This is done to save on size.

Next come the floating point types , just like in Java, there are only two of them:

# Type name Length in bytes Analogue from Java Minimum value Maximum value
1 FLOAT 4 float -3.40E+38 +1.18E+38
2 DOUBLE 8 double -1.79E+308 +1.79E+308

Again, nothing new. Everything is the same as in Java. However, unlike Java, SQL has another special type, the fixed-point real number. It's called DECIMAL.

Typically this type is used to store amounts of money. When the name of this type is written, then after it it is usually indicated how many decimal places the number has before and after the decimal point. The general format looks like this:

DECIMAL(total_characters, after_comma)

And a small example:

salary DECIMAL(5,2)

This is how we described the fact that the salary column can contain integers (maximum 3 decimal places) and a fractional part - 2 decimal places.

The maximum number of characters that the DECIMAL type supports is 65.

String types in SQL

Rows in a database can be stored in two forms:

  • Fixed length strings
  • Strings with variable length

Fixed-length strings are specified as CHAR :

CHAR(length)

Fixed length means that all values ​​of this column will contain a strictly fixed number of characters.

An example of a fixed length string:

country_code CHAR(2)

Variable length strings are specified by the VARCHAR type :

VARCHAR(max_length)

Variable length means that all values ​​of this column will contain text of any length, but no more than the maximum length.

Variable length string example:

phone VARCHAR(12)

Fixed length strings have a very big advantage. If the client asked the SQL server to return to him the 1,000,000th row from the table, and the rows in the table are of fixed length, then, knowing the length of the row, you can easily calculate the bytes that relate to the desired row.

In the case of a variable length of lines, it will not be possible to quickly find the desired row in the table. Remember the speed of access to ArrayList and LinkedList, the situation is approximately the same here.

Let's compare how strings of different lengths will be stored in a table, depending on the data type.

Line CHAR(4) Byte to store VARCHAR(4) Byte to store
'' ' ' 4 '' 1
'ab' 'ab' 4 'ab' 3
'abcd' 'abcd' 4 'abcd' 5
'abcdefgh' 'abcd' 4 'abcd' 5

Note. The VARCHAR type requires one more byte for the same length, because it has to additionally store the length of the string.

Temporary types in SQL

SQL also has special types for storing dates and times. There are five types in total:

# Type name Analogue from Java DateTime API Example Minimum value Maximum value
1 DATE LocalDate '2022-06-30' '1000-01-01' '9999-12-31'
2 TIME LocalTime 'hh:mm:ss[.fraction]' '-838:59:59.000000' '838:59:59.000000'
3 DATETIME LocalDateTime '1000-01-01 00:00:00.000000' '9999-12-31 23:59:59.999999'
4 TIMESTAMP Date '1970-01-01 00:00:01.000000' '2038-01-19 03:14:07.999999'
5 YEAR 1901 2155

All data in queries is written as a string - in single quotes. The entry format goes from largest to smallest:

  • Year
  • Month
  • Day
  • Hour
  • Minute
  • Second
  • Fractions of a second

The DATE, TIME and DATETIME types can be conventionally considered analogues of the types from the Java DateTme API: LocalDate, LocalTime, LocalDateTime. The logic is about the same.

The TIMESTAMP type stores data in milliseconds since the beginning of 1970 (a UNIX operating system standard). It is in this form that they are stored by the Date type in the Java language.

And finally, there is the YEAR type, which is 1 byte long and stores values ​​from 1 to 255. Therefore, the range of years available to it is 1901-2155. The year 1900 cannot be stored by this type, since the value 0 is used to encode a NULL value.

Storing objects in SQL

There are special types for storing objects or texts of great length. We won’t go into detail on them, but I’ll list them anyway:

# Type name Explanation
1 TEXT Used to store long texts. When comparing and sorting on this field, only the first 100 characters are used.
2 BLOB The name stands for Byte Large Object. Stored as just a collection of bytes. Can be used to, for example, store pictures in a database.
3 CLOB The name stands for Char Large Object. Used to store long texts.
4 ENUM Allows you to set a fixed set of values ​​and store one of them as a value.
5 SET Allows you to set a fixed set of values ​​and store any subset of them as a value. Usually stores them as a binary mask.

Theoretically, you can serialize any Java object as a set of bytes and store it in the database as a BLOB. Saving the object is not a problem. How to work with him further?

Let's say a table stores a million objects in serialized form - how will you search through them? A DBMS only supports a certain type of data when it provides an extensive set of functions for working with it.