Many online materials describe the Java memory model by introducing a main memory and a per-thread working memory: a shared variable has one copy in main memory and another in each thread's working memory, and a defined set of atomic operations synchronizes data between the two.
The figure below, taken from another blog, illustrates this model.
However, as the Java platform has evolved, the memory model has changed as well. This article focuses on a few properties of the Java memory model; whether you are looking at the new model or the old one, things become clearer once you understand these properties.
1. Atomicity
Atomicity means that an operation is indivisible: even when multiple threads run concurrently, once the operation starts it cannot be interrupted by other threads.
Individual CPU instructions are generally considered atomic, but a single line of the code we write is not necessarily one atomic operation.
Take i++ as an example. It is not atomic; it breaks down into roughly three operations: read i, add 1, and write the result back to i.
Suppose two threads run i++ when i = 1. Thread 1 reads i = 1, and before it performs the +1, execution switches to thread 2, which also reads i = 1. Both threads then add 1 and write the result back: i ends up as 2, not 3. One update has been lost, and the data is inconsistent.
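To make the lost update concrete, here is a minimal sketch (class and variable names are mine, not from the original) that deterministically replays the unlucky interleaving described above, rather than relying on a real race:

```java
public class LostUpdateDemo {
    public static void main(String[] args) {
        int i = 1;

        // Simulate the unlucky interleaving deterministically:
        // both threads read i before either one writes back.
        int readByThread1 = i;  // thread 1 reads i = 1, then is preempted
        int readByThread2 = i;  // thread 2 also reads i = 1

        i = readByThread1 + 1;  // thread 1 writes back 2
        i = readByThread2 + 1;  // thread 2 also writes back 2, overwriting

        System.out.println(i);  // prints 2, not 3: one increment was lost
    }
}
```

With a real race the same outcome occurs only when the scheduler happens to interleave the three steps this way, which is why the bug is intermittent.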
Similarly, reading or writing a 64-bit long on a 32-bit JVM is not guaranteed to be atomic, although a 32-bit JVM does read 32-bit int values atomically.
2. Ordering
Under concurrency, a program may appear to execute out of order: the computer does not necessarily execute statements in the order they appear in the source code.
```java
class OrderExample {
    int a = 0;
    boolean flag = false;

    public void writer() {
        a = 1;
        flag = true;
    }

    public void reader() {
        if (flag) {
            int i = a + 1;
        }
    }
}
```

In the code above, the two methods are called by two different threads. Intuitively, the writer thread executes a = 1 first and then flag = true, so when the reader thread sees flag == true it should compute i = 2.
But a = 1 and flag = true have no data dependence on each other, so their execution order may be reversed: flag = true may run first. If execution switches to the reader thread at that moment, a = 1 has not yet run, and the reader computes i = 1.
Of course, this is not guaranteed: the reordering may or may not occur on any given run.
So why does out-of-order execution happen? It starts at the CPU instruction level: Java code is ultimately compiled down to machine instructions.
The execution of one instruction can be divided into several steps. Suppose a CPU instruction goes through the following pipeline stages: instruction fetch (IF), instruction decode (ID), execute (EX), memory access (MEM), and write back (WB).
Now suppose there are two instructions to execute.
Intuitively, we think of instructions as executing serially: instruction 1 completes, then instruction 2 runs. If each stage takes one CPU cycle, executing the two instructions serially takes 10 cycles, which is inefficient. In practice, instructions are executed in a pipelined, overlapping fashion. Of course, while instruction 1 is in its IF stage, instruction 2 cannot also be in IF, because hardware such as the instruction-fetch circuitry cannot be shared at the same moment. So, as shown in the figure above, the two instructions execute in a staggered way: when instruction 1 enters ID, instruction 2 enters IF. This way the two instructions finish in only 6 cycles, which is considerably more efficient.
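The staggered execution described above can be reconstructed as a timing diagram (a sketch of the missing figure, assuming the classic five-stage pipeline):

```
Cycle:          1    2    3    4    5    6
Instruction 1:  IF   ID   EX   MEM  WB
Instruction 2:       IF   ID   EX   MEM  WB
```

Serially the two instructions would need 10 cycles; pipelined, they finish in 6.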
With this in mind, let's look at how A = B + C is executed.
As the figure shows, there is an idle (×) slot during the ADD: when the CPU wants to add B and C, C has not yet been read from memory (C only becomes available once the second load's MEM stage completes). A question arises here: at that point the value has not yet been written back (WB) into R2, so how can R1 and R2 be added? The answer is that hardware uses a technique called "bypassing" (forwarding) to feed the value directly from the pipeline, so ADD does not have to wait for WB. Even so, the ADD must stall for one idle (×) cycle. The SW also has an idle (×) cycle, because its EX stage cannot run in the same cycle as the ADD's EX stage.
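A sketch of how A = B + C might map to instructions and flow through the pipeline (reconstructing the missing figure; the instruction mnemonics are illustrative):

```
LW  R1, B        IF  ID  EX  MEM WB
LW  R2, C            IF  ID  EX  MEM WB
ADD R3, R1, R2           IF  ID  ×   EX  MEM WB     ; stall: waits for C, bypassed from MEM
SW  A, R3                    IF  ID  ×   EX  MEM WB ; stall: EX cannot overlap ADD's EX
```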
Next, a slightly more complicated example:

a = b + c
d = e - f
The corresponding instructions are as follows.
The reasoning is the same as above, so we won't walk through it again. Notice how many stall (×) slots there are: many cycles are wasted, and performance suffers. Is there a way to reduce the number of stalls?
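Reconstructing the missing instruction listing (assuming the second statement is d = e - f, with illustrative register names), the straightforward compilation looks like this, with a stall (×) before each arithmetic instruction and its store:

```
LW  Rb, b
LW  Rc, c
ADD Ra, Rb, Rc   ; × stall: waits for c to arrive from memory
SW  a, Ra        ; × stall
LW  Re, e
LW  Rf, f
SUB Rd, Re, Rf   ; × stall: waits for f to arrive from memory
SW  d, Rd        ; × stall
```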
We would like to fill those idle (×) slots with useful work. The stalls come from data dependences (the ADD depends on the loads above it), so the idea is to fill them with instructions that have no such dependence.
So we change the order of the instructions.
After reordering, the stalls are eliminated and the total number of cycles drops.
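The reordered sequence (a sketch of the missing figure, under the same assumptions as above) pulls the independent loads up to fill the stalls:

```
LW  Rb, b
LW  Rc, c
LW  Re, e        ; independent load fills the slot before ADD needs Rc
ADD Ra, Rb, Rc
LW  Rf, f        ; independent load fills the slot before SUB needs Rf
SW  a, Ra
SUB Rd, Re, Rf
SW  d, Rd
```

The program's serial result is unchanged, but the pipeline never idles.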
Instruction reordering makes the pipeline run more smoothly.
Of course, the rule of instruction reordering is that it must not change the semantics of the serial program. For example, a = 1; b = a + 1; will not be reordered, because reordering would change the serial result.
Instruction reordering is just an optimization performed by the compiler or the CPU, and it is exactly this optimization that causes the problem shown at the beginning of this section.
How do we solve it? With the volatile keyword, which will be covered later in this series.
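As a preview, here is a minimal sketch of the fix for OrderExample: marking flag as volatile forbids reordering the write of a past the write of flag. (The main method simply exercises both calls on one thread to show the API; it is not a concurrency test.)

```java
class OrderExampleFixed {
    int a = 0;
    volatile boolean flag = false; // volatile: a = 1 cannot be reordered past flag = true

    public void writer() {
        a = 1;        // ordinary write, made visible by the volatile write below
        flag = true;  // volatile write
    }

    public void reader() {
        if (flag) {            // volatile read
            int i = a + 1;     // guaranteed to see a = 1, so i = 2
            System.out.println(i);
        }
    }

    public static void main(String[] args) {
        OrderExampleFixed ex = new OrderExampleFixed();
        ex.writer();
        ex.reader(); // prints 2
    }
}
```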
3. Visibility
Visibility means: when one thread modifies the value of a shared variable, can other threads immediately see the modification?
Visibility problems can arise at many levels. The instruction reordering just discussed can cause them, and so can compiler optimizations and certain hardware optimizations.
For example, one thread may modify a shared value in main memory while another thread is still working with a stale copy of that value in a cache or register; the cached copy never learns of the modification.
Some hardware optimizations go further: if a program writes to the same address several times in a row, the hardware may decide the earlier writes are unnecessary and keep only the last one, so the intermediate values are never visible to other threads.
In short, most visibility problems stem from optimization.
Next, let's look at a visibility problem at the Java virtual machine level.
The example comes from a blog post.
```java
package edu.hushi.jvm;

/**
 * @author -10
 */
public class VisibilityTest extends Thread {
    private boolean stop;

    public void run() {
        int i = 0;
        while (!stop) {
            i++;
        }
        System.out.println("finish loop,i=" + i);
    }

    public void stopIt() {
        stop = true;
    }

    public boolean getStop() {
        return stop;
    }

    public static void main(String[] args) throws Exception {
        VisibilityTest v = new VisibilityTest();
        v.start();
        Thread.sleep(1000);
        v.stopIt();
        Thread.sleep(2000);
        System.out.println("finish main");
        System.out.println(v.getStop());
    }
}
```

The code is simple: thread v keeps incrementing i in a while loop until the main thread calls stopIt(), which sets stop to true and should end the loop.
Yet this seemingly simple code misbehaves when run. In client mode the thread stops incrementing as expected, but in server mode, where the JVM optimizes more aggressively, the loop never terminates.
Most 64-bit JVMs default to server mode. Running in server mode prints:
finish main
true
Only these two lines are printed; "finish loop" never appears, even though you can see that stop is already true.
The author of that blog post used tools to dump the JIT-compiled assembly of the program.
Only part of the assembly is shown here, with the loop marked in red. You can clearly see that stop is checked only once, at 0x0193bf9d; inside the loop itself the value is never re-read, so the loop runs forever.
This is the result of JVM optimization. How do we avoid it? As with instruction reordering, use the volatile keyword.
With volatile added, dumping the assembly again shows that stop is re-read on every iteration of the loop.
Next, let's look at an example from the "Java Language Specification".
The figure above shows that instruction reordering can lead to different results.
In the figure, r5 = r2 appears because the compiler sees r2 = r1.x and r5 = r1.x and, at compile time, replaces the second read with r5 = r2 (forward substitution). The final results then differ from what sequential reasoning would predict.
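For reference, the specification example being described (from the memory-model chapter of the Java Language Specification, reproduced here from memory, so treat the details as an approximation) looks roughly like this, where p and q may refer to the same object and p.x starts at 0:

```
Thread 1:             Thread 2:
r1 = p;               r6 = p;
r2 = r1.x;            r6.x = 3;
r3 = q;
r4 = r3.x;
r5 = r1.x;
```

The compiler may replace r5 = r1.x with r5 = r2, since it already read r1.x into r2. If thread 2's write lands between the two original reads, the program can observe r2 == r5 == 0 but r4 == 3: p.x appears to change from 0 to 3 and back to 0, which no sequential interleaving could produce if r5 truly re-read r1.x.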
4. Happens-Before
The Java memory model defines a happens-before relation that specifies when the result of one operation is guaranteed to be visible to another. The commonly cited rules include: each action in a thread happens-before every subsequent action in that thread (program order); an unlock of a monitor happens-before every subsequent lock of that same monitor; a write to a volatile field happens-before every subsequent read of that field; a call to Thread.start() happens-before any action in the started thread; all actions in a thread happen-before another thread successfully returns from join() on it; and the relation is transitive.
5. The concept of thread safety
Thread safety means that when a function or library is called from a multi-threaded environment, it correctly handles the variables shared between threads, so that the program completes its work correctly.
The i++ example at the beginning of this article is a case in point: without synchronization, it is not thread-safe.
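One standard way to make that increment thread-safe is java.util.concurrent.atomic.AtomicInteger, whose incrementAndGet() performs the read-add-write as a single atomic operation. A minimal sketch (class and variable names are mine):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class SafeCounter {
    private static final AtomicInteger counter = new AtomicInteger(0);

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int n = 0; n < 10_000; n++) {
                counter.incrementAndGet(); // atomic read-modify-write: no lost updates
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(counter.get()); // always 20000, never less
    }
}
```

Using synchronized around the increment would also work; AtomicInteger simply does it without blocking.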
For more details on thread safety, see a previous blog post of mine, or follow the rest of this series, which will cover related topics.