Concurrent programming is one of the most important skills for a Java programmer, and one of the hardest to master. It demands a solid understanding of how the computer works at a low level, together with clear and rigorous thinking, before one can write efficient, safe, and reliable multi-threaded programs. This series starts from the essentials of inter-thread coordination (wait, notify, notifyAll), Synchronized, and Volatile, and then examines in detail each concurrency tool provided by the JDK along with its underlying implementation. On that basis, later articles analyze the utility classes of the java.util.concurrent package: how they are used, how their source code is implemented, and the principles behind them. This article is the first in the series and covers its core theory; subsequent articles build on it.
1. Sharing
Data sharing is one of the root causes of thread-safety problems. If all data were confined to a single thread, there would be no thread-safety issues, which is why we often do not need to think about thread safety at all when programming. In multi-threaded programming, however, sharing data is unavoidable. The most typical scenario is data in a database: to keep the data consistent, all clients must work against the same data. Even in a master/slave setup the same logical data is being accessed; the slaves merely replicate it for read efficiency and data safety. The following simple example demonstrates the problems that shared data can cause under multiple threads:
Code Snippet 1:
package com.paddx.test.concurrent;

public class ShareData {
    public static int count = 0;

    public static void main(String[] args) {
        final ShareData data = new ShareData();
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // Pause for 1 millisecond on entry to increase the chance of hitting the concurrency problem
                        Thread.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    for (int j = 0; j < 100; j++) {
                        data.addCount();
                    }
                    System.out.print(count + " ");
                }
            }).start();
        }
        try {
            // Pause the main thread for 3 seconds so the threads above can finish
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("count=" + count);
    }

    public void addCount() {
        count++;
    }
}

The intent of this code is to increment count 1,000 times, implemented here as 10 threads each performing 100 increments, so under normal circumstances it should print 1000. If you run the program, however, you will find that this is often not the case. The results differ from run to run, and sometimes the correct value does appear; here is the output of one particular run:
As you can see, operations on shared variables can easily produce all kinds of unexpected results in a multi-threaded environment.
2. Mutual Exclusion
Mutual exclusion means that a resource allows only one accessor at a time: access to it is exclusive. We usually allow multiple threads to read data at the same time but allow only one thread to write it at any moment, which is why locks are commonly divided into shared locks and exclusive locks, also known as read locks and write locks. If a resource does not require mutual exclusion, then even when it is shared there is no thread-safety concern. For example, immutable shared data can only be read by all threads, so no thread-safety issue arises. Write operations on shared data, however, generally require mutual exclusion. In the example above, the result is wrong precisely because mutual exclusion is missing. Java provides several mechanisms to guarantee mutual exclusion; the simplest is Synchronized. Let us add Synchronized to the program above and run it again:
Code Snippet 2:
package com.paddx.test.concurrent;

public class ShareData {
    public static int count = 0;

    public static void main(String[] args) {
        final ShareData data = new ShareData();
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                @Override
                public void run() {
                    try {
                        // Pause for 1 millisecond on entry to increase the chance of hitting the concurrency problem
                        Thread.sleep(1);
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                    for (int j = 0; j < 100; j++) {
                        data.addCount();
                    }
                    System.out.print(count + " ");
                }
            }).start();
        }
        try {
            // Pause the main thread for 3 seconds so the threads above can finish
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println("count=" + count);
    }

    /**
     * Add the synchronized keyword
     */
    public synchronized void addCount() {
        count++;
    }
}

If you run this version, you will find that no matter how many times you execute it, the final result is always 1000.
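Code Snippet 2 uses an exclusive lock for every access. The shared/exclusive (read/write) distinction mentioned above can be expressed with java.util.concurrent.locks.ReentrantReadWriteLock. The following is only a minimal sketch for illustration (the class and field names here are not part of the snippets above):

package com.paddx.test.concurrent;

import java.util.concurrent.locks.ReentrantReadWriteLock;

// Minimal sketch of a shared (read) lock versus an exclusive (write) lock.
public class ReadWriteCounter {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int count = 0;

    // Many threads may hold the read lock at the same time.
    public int getCount() {
        lock.readLock().lock();
        try {
            return count;
        } finally {
            lock.readLock().unlock();
        }
    }

    // Only one thread at a time may hold the write lock,
    // and no readers are admitted while it is held.
    public void addCount() {
        lock.writeLock().lock();
        try {
            count++;
        } finally {
            lock.writeLock().unlock();
        }
    }
}

The read/write split pays off when reads greatly outnumber writes; for a write-heavy counter like the one above, a plain exclusive lock such as Synchronized is just as appropriate.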
3. Atomicity
Atomicity means that an operation on data is an indivisible whole: it runs as one continuous, uninterruptible step, and its intermediate state can never be observed or modified by other threads. The simplest way to obtain atomicity is to rely on machine instructions: if an operation maps to a single instruction, it is naturally atomic. Many operations, however, cannot be completed in one instruction. For example, on some platforms an operation on a long is split into separate instructions for its high and low halves, and even the familiar i++ on an int actually takes three steps: (1) read the current value of i; (2) add one to it; (3) write the result back to memory. Under multiple threads this process can interleave as follows:
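One possible interleaving (the concrete values are purely illustrative) looks like this:

Thread A reads i = 5
Thread B reads i = 5
Thread A computes 5 + 1 and writes i = 6
Thread B computes 5 + 1 and writes i = 6

Two increments were executed, yet i advanced by only one. This lost update is exactly what Code Snippet 1 exhibits.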
This interleaving is why Code Snippet 1 produces an incorrect result. For such compound operations, the most common way to guarantee atomicity is locking, for example with Synchronized or Lock in Java; Code Snippet 2 does this with Synchronized. Besides locks, there is another approach, CAS (Compare And Swap): before writing the new value, compare the current value with the value that was read earlier; if they are still equal, perform the write, otherwise retry the whole operation. This is also how optimistic locking is implemented. CAS does not work in every scenario, though. For example, if another thread first changes a value and then changes it back to the original, CAS cannot detect that anything happened (the so-called ABA problem).
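As a rough sketch of the CAS approach, the JDK's java.util.concurrent.atomic classes expose compareAndSet directly. The counter below is only an illustration, not the implementation used by the snippets above:

import java.util.concurrent.atomic.AtomicInteger;

public class CasCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    // Lock-free increment: read the current value, then try to swap in
    // current + 1; if another thread changed the value in between,
    // compareAndSet fails and the loop retries.
    public void addCount() {
        for (;;) {
            int current = count.get();
            int next = current + 1;
            if (count.compareAndSet(current, next)) {
                return;
            }
        }
    }

    public int getCount() {
        return count.get();
    }
}

For the ABA problem mentioned above, the JDK provides AtomicStampedReference, which pairs the value with a version stamp so that "changed and then changed back" can be detected.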
4. Visibility
To understand visibility, you need some understanding of the JVM's memory model, which is organized much like the CPU cache hierarchy of the underlying hardware, as shown in the figure:
As the figure shows, each thread has its own working memory (analogous to a CPU cache, whose purpose is to bridge the speed gap between the CPU and the storage system and improve performance). For a shared variable, a thread reads a copy of the variable into its working memory; when it writes, it modifies the copy in working memory first and only synchronizes that value back to main memory at some later point. The resulting problem is that after thread 1 modifies a variable, thread 2 may not see thread 1's modification. The following program demonstrates this visibility problem:
package com.paddx.test.concurrent;

public class VisibilityTest {
    private static boolean ready;
    private static int number;

    private static class ReaderThread extends Thread {
        public void run() {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            if (!ready) {
                System.out.println(ready);
            }
            System.out.println(number);
        }
    }

    private static class WriterThread extends Thread {
        public void run() {
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            number = 100;
            ready = true;
        }
    }

    public static void main(String[] args) {
        new WriterThread().start();
        new ReaderThread().start();
    }
}

Intuitively, this program should print only 100 and never print the value of ready. In practice, if you run it several times you may see several different results; here are the outputs of two runs:
Of course, these results are only possibly caused by visibility. In the first case, when the write thread (WriterThread) sets ready = true, the ReaderThread does not see the modification, so it prints false. In the second case, the write has not yet been observed when if (!ready) is evaluated, but it has been observed by the time System.out.println(ready) executes. Either outcome could also simply be the product of thread interleaving, however. In Java, visibility can be guaranteed with Synchronized or Volatile; the details are analyzed in later articles.
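As a sketch of the Volatile-based fix, only the field declarations of the class above need to change. A write to a volatile field is guaranteed to be visible to any subsequent read that observes it, and the Java Memory Model further guarantees that the earlier write to number is visible once ready is seen as true:

// In VisibilityTest, declaring the flag volatile makes the write to ready
// visible to ReaderThread; because number is written before the volatile
// write to ready, it is also guaranteed to be visible once ready reads true.
private static volatile boolean ready;
private static int number;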
5. Ordering
To improve performance, the compiler and processor may reorder instructions. There are three types of reordering:
(1) Compiler reordering. The compiler may rearrange the execution order of statements as long as the semantics of the single-threaded program are unchanged.
(2) Instruction-level parallelism reordering. Modern processors use instruction-level parallelism (ILP) to overlap the execution of multiple instructions; when there are no data dependencies, the processor may change the order in which the machine instructions corresponding to the statements execute.
(3) Memory-system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to execute out of order.
We can directly refer to the description of reordering problems in JSR 133:
(1) The original code (initially A == B == 0):

    Thread 1            Thread 2
    1: r2 = A;          3: r1 = B;
    2: B = 1;           4: A = 2;

    A run may finish with r2 == 2 and r1 == 1.

(2) A legal compiler transformation of the same code:

    Thread 1            Thread 2
    2: B = 1;           3: r1 = B;
    1: r2 = A;          4: A = 2;
Look first at the original code in (1) above. Reading the source, either instruction 1 or instruction 3 executes first. If instruction 1 executes first, r2 should not see the value written by instruction 4; if instruction 3 executes first, r1 should not see the value written by instruction 2. Yet a run may still end with r2 == 2 and r1 == 1, which is the effect of reordering. Listing (2) shows one possible legal compilation result: after compilation, instructions 1 and 2 in Thread 1 may be swapped, and then the outcome r2 == 2 and r1 == 1 follows naturally. In Java, Synchronized or Volatile can likewise be used to guarantee ordering.
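The JSR 133 example can be turned into a small runnable sketch. Whether the surprising outcome actually appears depends on the JIT compiler and the hardware, so the program below (class and variable names chosen here purely for illustration) only demonstrates the shape of the problem:

public class ReorderingDemo {
    private static int a, b;    // play the roles of A and B in the example
    private static int r1, r2;

    public static void main(String[] args) throws InterruptedException {
        // Repeat the experiment many times; the surprising r2 == 2 && r1 == 1
        // outcome, if it ever shows up, only appears occasionally.
        for (int i = 0; i < 100_000; i++) {
            a = 0; b = 0; r1 = 0; r2 = 0;

            Thread t1 = new Thread(() -> { r2 = a; b = 1; }); // 1: r2 = A;  2: B = 1;
            Thread t2 = new Thread(() -> { r1 = b; a = 2; }); // 3: r1 = B;  4: A = 2;

            t1.start(); t2.start();
            t1.join();  t2.join();

            if (r2 == 2 && r1 == 1) {
                System.out.println("Reordering observed in iteration " + i);
                return;
            }
        }
        System.out.println("No reordering observed in this run");
    }
}

Declaring a and b volatile, or guarding the four statements with a common lock, rules this outcome out.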
6. Summary
This article has covered the theoretical foundations of Java concurrent programming. Some topics, such as visibility and ordering, will be examined in more depth in later analyses, and subsequent articles build on the content of this one. If you understand the material above well, it will serve you whether you are reading other articles on concurrent programming or doing day-to-day concurrent programming work.