Memory hardware architecture
Modern memory hardware architecture differs from Java's internal memory model. Therefore, you need to understand the hardware architecture in order to know how the Java model works with it. This section describes the general memory hardware architecture, and the next section describes how Java works with it.
Here is a simplified diagram of the hardware architecture of a modern computer:
Modern computers commonly have two or more processors (CPUs), and some of these processors have multiple cores. On such computers, several threads can run at the same time, because each processor core can execute one thread at any given moment. This means that if your Java application is multi-threaded, one thread per processor core can be running simultaneously within your program.
Each processor core contains a set of registers, which are memory located inside the core itself. The core can operate on data in these registers much faster than on data in the computer's main memory (RAM), because it can access registers much faster than main memory.
Each CPU can also have its own cache memory layer, and most modern processors do. The processor can access its cache much faster than main memory, but usually not as fast as its internal registers. Cache access speed therefore lies somewhere between that of main memory and that of the internal registers.
Some processors also have multiple cache levels (Level 1, Level 2, and so on). This detail is not essential for understanding how the Java memory model interacts with hardware memory. What matters is that processors may have some level of cache.
A computer also contains a main memory area (RAM). All processors can access main memory, which is usually much larger than the processors' cache memories.
When a processor needs to access main memory, it typically reads part of main memory into its cache. It may then read part of the cache into its internal registers and perform operations on the data there. When the CPU needs to write a result back to main memory, it flushes the value from its internal register to the cache, and at some point from the cache to main memory.
Data stored in the cache is normally flushed back to main memory when the processor needs to store something else in the cache. The cache can have data written to just part of its memory, and can flush just part of its memory, so it does not need to be read or written in full on every update. Typically the cache is updated in small blocks of memory called "cache lines". One or more cache lines may be read into the cache, and one or more cache lines may be flushed back to main memory.
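As a small illustration of the point about cores, a Java program can ask the JVM how many logical processors it may use. This is only a sketch: the class name CoreCount and the helper availableCores are made up for this example, and the reported number depends on the machine and on any limits imposed on the JVM.

```java
class CoreCount {
    /** Returns the number of logical processors available to this JVM. */
    static int availableCores() {
        // Standard JDK call; the value can differ from the physical core count.
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("Logical processors available to the JVM: " + availableCores());
    }
}
```

The result is an upper bound on how many threads of the program can truly run at the same instant.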
Combining Java memory model and memory hardware architecture
As already mentioned, the Java memory model and memory hardware architecture are different. The hardware architecture does not distinguish between thread stacks and the heap. On hardware, both the thread stacks and the heap reside in main memory.
Parts of the thread stacks and the heap may sometimes be present in CPU caches and internal CPU registers. This is shown in the diagram:
When objects and variables can be stored in different areas of the computer's memory, certain problems can arise. Here are the two main ones:
- Visibility of the changes that the thread has made to shared variables.
- Race condition when reading, checking and writing shared variables.
Both of these issues will be explained below.
Visibility of Shared Objects
If two or more threads share an object without proper use of volatile declarations or synchronization, then changes to the shared object made by one thread may not be visible to other threads.
Imagine that a shared object is initially stored in main memory. A thread running on a CPU reads the shared object into the cache of that CPU. There it makes changes to the object. Until the CPU's cache has been flushed to main memory, the modified version of the shared object is not visible to threads running on other CPUs. Thus, each thread can end up with its own copy of the shared object, each copy sitting in a separate CPU cache.
The following diagram illustrates this situation. One thread running on the left CPU copies the shared object into its cache and changes the value of count to 2. This change is invisible to other threads running on the right CPU because the update to count has not yet been flushed back to main memory.
To solve this problem, you can declare the variable with the volatile keyword. The volatile keyword ensures that the variable is read directly from main memory and is always written back to main memory when it is updated.
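The following is a minimal sketch of a volatile variable shared between two threads. The class VolatileFlagDemo, the field stopRequested and the helper workerStopsWithin are hypothetical names chosen for this illustration. A worker thread spins until the main thread sets the flag. Because the flag is volatile, the worker is guaranteed to see the update. Without volatile, the worker might keep reading a stale value and never stop.

```java
class VolatileFlagDemo {
    // Hypothetical shared flag. volatile makes a write by one thread visible to reads by others.
    private static volatile boolean stopRequested = false;

    /** Starts a spinning worker, sets the flag, and reports whether the worker stopped in time. */
    static boolean workerStopsWithin(long timeoutMillis) {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // Busy-wait until the other thread's write to the flag becomes visible.
            }
        });
        worker.setDaemon(true); // do not keep the JVM alive if something goes wrong
        worker.start();
        try {
            Thread.sleep(50);            // let the worker start spinning
            stopRequested = true;        // volatile write
            worker.join(timeoutMillis);  // wait for the worker to notice and exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("Worker stopped: " + workerStopsWithin(5000));
    }
}
```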
Race condition
If two or more threads share the same object and more than one thread updates variables in that shared object, then a race condition may occur.
Imagine that thread A reads the shared object's count variable into its processor's cache. Imagine also that thread B does the same thing, but into another processor's cache. Now thread A adds 1 to the value of count, and thread B does the same. The variable has now been incremented twice, once in each processor's cache.
If these increments had been performed sequentially, count would have been incremented twice and the value written back to main memory would be the original value + 2.
However, two increments were performed at the same time without proper synchronization. Regardless of which thread (A or B) writes its updated version of count to main memory, the new value will only be 1 more than the original value, despite the two increments.
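The lost update can be replayed by hand in a single thread. The sketch below is not a real race. It only walks through the interleaving described above, with local variables standing in for the two processor caches. All names (LostUpdateDemo, copyA, copyB, sharedCount) are made up for this illustration.

```java
class LostUpdateDemo {
    /** Both "threads" read count before either one writes it back. */
    static int interleavedIncrements(int initial) {
        int sharedCount = initial; // stands in for count in main memory
        int copyA = sharedCount;   // thread A reads count into its cache
        int copyB = sharedCount;   // thread B reads the same value into its cache
        copyA = copyA + 1;         // A increments its private copy
        copyB = copyB + 1;         // B increments its private copy
        sharedCount = copyA;       // A writes its copy back
        sharedCount = copyB;       // B overwrites it with the same value
        return sharedCount;
    }

    /** Each increment reads the value written by the previous one. */
    static int sequentialIncrements(int initial) {
        int sharedCount = initial;
        sharedCount = sharedCount + 1; // thread A: read, increment, write
        sharedCount = sharedCount + 1; // thread B: read, increment, write
        return sharedCount;
    }

    public static void main(String[] args) {
        System.out.println(interleavedIncrements(1)); // prints 2
        System.out.println(sequentialIncrements(1));  // prints 3
    }
}
```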
This diagram illustrates the occurrence of the race condition problem described above:
To solve this problem, you can use a Java synchronized block. A synchronized block ensures that only one thread can enter a given critical section of code at any given time.
Synchronized blocks also guarantee that all variables accessed inside the synchronized block will be read from main memory, and when the thread exits the synchronized block, all updated variables will be flushed back to main memory, regardless of whether the variable is declared volatile or not.
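Here is a minimal sketch of a counter protected by a synchronized block. The class SynchronizedCounterDemo and the helper runTwoThreads are hypothetical names chosen for this example. Two threads increment the same counter, and because every read and write of count happens inside a synchronized block on the same object, no increment is lost.

```java
class SynchronizedCounterDemo {
    private int count = 0; // shared state, deliberately not volatile

    void increment() {
        synchronized (this) { // only one thread at a time may run this block
            count++;
        }
    }

    int get() {
        synchronized (this) { // reading under the same lock sees the latest value
            return count;
        }
    }

    /** Runs two threads that each increment a shared counter, then returns the total. */
    static int runTwoThreads(int incrementsPerThread) {
        SynchronizedCounterDemo counter = new SynchronizedCounterDemo();
        Runnable task = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.increment();
            }
        };
        Thread a = new Thread(task);
        Thread b = new Thread(task);
        a.start();
        b.start();
        try {
            a.join(); // wait for both threads to finish
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(100_000)); // prints 200000
    }
}
```

If the synchronized blocks were removed, the total could come out lower than expected on some runs, which is the race condition described earlier.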