1. Multi-threading introduction
Multi-threading is hard to avoid in programming, because most business systems need to process work concurrently. It also comes up in interviews, where a typical question is: how do you create a thread? The usual answer names two approaches. The first is to extend the Thread class and override run(); the second is to implement the Runnable interface and override run(). The interviewer will then ask about the advantages and disadvantages of each. The usual conclusion is to prefer the second approach, because object-oriented design favors composition over inheritance.
Next, what if we need a return value from a thread? For that we can implement the Callable interface and override call(). But how are threads actually used in real projects, and how many ways are there to use them?
First, let's take a look at an example:
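A minimal sketch of this thread-per-task style, assuming each piece of business logic is supplied as a Runnable, might look like the following. ThreadPerTaskSketch and runAll are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: one newly created thread per task.
class ThreadPerTaskSketch {

    // Starts a new thread for every task, waits for all of them,
    // and returns how many threads were created.
    static int runAll(List<Runnable> tasks) {
        List<Thread> threads = new ArrayList<>();
        for (Runnable task : tasks) {
            // A different Runnable carries different business logic.
            Thread thread = new Thread(task);
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
        return threads.size();
    }

    public static void main(String[] args) {
        AtomicInteger handled = new AtomicInteger();
        List<Runnable> tasks = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            tasks.add(() -> handled.incrementAndGet());
        }
        int created = runAll(tasks);
        System.out.println(created + " threads created, " + handled.get() + " tasks handled");
    }
}
```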
This is a simple, easy-to-understand way to create threads. Depending on the business scenario, we can pass different arguments to Thread() to run different business logic. The problem with this approach is that a thread is created for every task and destroyed once the task finishes. When concurrency requirements are low this is acceptable, but under high concurrency it does not scale, because creating threads is expensive. In practice, the right approach is to use a thread pool. The JDK provides several thread pool types to choose from; see the JDK documentation for details.
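As a rough illustration of the thread pool approach, the sketch below uses Executors.newFixedThreadPool from the JDK. The class name FixedPoolSketch, the runTasks helper, and the pool size of 4 are assumptions chosen for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: a fixed set of threads is reused for many tasks.
class FixedPoolSketch {

    // Runs taskCount trivial tasks on a pool of poolSize threads
    // and returns how many tasks completed.
    static int runTasks(int poolSize, int taskCount) {
        // The argument is the number of threads kept in the pool.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> completed.incrementAndGet());
        }
        pool.shutdown(); // accept no new tasks, finish the queued ones
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("Tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting", e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("Completed tasks: " + runTasks(4, 100));
    }
}
```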
What we need to note in this code is that the argument passed in is the number of threads we configure. Is more always better? Certainly not. The thread count must take the server's capacity into account; configuring more threads does not necessarily make the server perform better. The useful number of threads is bounded by the hardware: once the thread count passes a certain peak, adding threads no longer increases throughput. For CPU-bound business logic (mostly computation), that peak is reached when the number of threads is about equal to the number of cores. For I/O-bound business logic (database operations, file uploads and downloads, and so on), threads spend much of their time blocked, so configuring more threads than cores helps performance up to a point.
A formula for setting the number of threads:
Y = N * ((a + b) / a), where N is the number of CPU cores, a is the time a thread spends computing during a task, and b is the time it spends blocked during that task. This formula gives a starting point for the thread pool size, which we can then adjust to the actual machine.
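To make the formula concrete, here is a small worked sketch. The class name ThreadCountFormula, the method recommendedThreads, and the sample timings (20 ms of computation, 80 ms of blocking) are illustrative assumptions, not measurements from the project.

```java
// Hypothetical helper that evaluates Y = N * ((a + b) / a).
class ThreadCountFormula {

    // cores     = N, the number of CPU cores
    // computeMs = a, time a task spends computing
    // blockedMs = b, time a task spends blocked (I/O, waiting)
    static int recommendedThreads(int cores, double computeMs, double blockedMs) {
        return (int) Math.round(cores * ((computeMs + blockedMs) / computeMs));
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        // I/O-heavy task: blocked four times as long as it computes.
        System.out.println("I/O-bound:  " + recommendedThreads(cores, 20, 80));
        // Pure computation: b = 0, so the result equals the core count.
        System.out.println("CPU-bound:  " + recommendedThreads(cores, 20, 0));
    }
}
```

With 8 cores, a = 20 ms and b = 80 ms, the formula gives 8 * (100 / 20) = 40 threads; with b = 0 it gives 8, matching the rule of thumb for CPU-bound work.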
2. Multithreaded optimization and performance comparison
I used multi-threading in a recent project and ran into a lot of trouble along the way. While it is still fresh, I want to compare the performance of several multi-threading approaches. The ones I am familiar with fall roughly into three types: first, ThreadPool (thread pool) + CountDownLatch (countdown latch); second, the Fork/Join framework; and third, JDK 8 parallel streams. Below is a comparison of how these approaches perform.
First, assume a business scenario in which many file objects are generated in memory. To simulate the business logic, each of 30,000 file objects is processed with a short thread sleep, and the approaches are compared on the total time taken.
1) Single threaded
This approach is very simple, but it is slow: every item is processed on a single thread, and each item must wait for the previous one to finish. No concurrency is involved, so efficiency is very low.
First, create the file object class:

public class FileInfo {
    private String fileName;      // File name
    private String fileType;      // File type
    private String fileSize;      // File size
    private String fileMD5;       // MD5 code
    private String fileVersionNO; // File version number

    public FileInfo() {
    }

    public FileInfo(String fileName, String fileType, String fileSize, String fileMD5, String fileVersionNO) {
        this.fileName = fileName;
        this.fileType = fileType;
        this.fileSize = fileSize;
        this.fileMD5 = fileMD5;
        this.fileVersionNO = fileVersionNO;
    }

    public String getFileName() {
        return fileName;
    }

    public void setFileName(String fileName) {
        this.fileName = fileName;
    }

    public String getFileType() {
        return fileType;
    }

    public void setFileType(String fileType) {
        this.fileType = fileType;
    }

    public String getFileSize() {
        return fileSize;
    }

    public void setFileSize(String fileSize) {
        this.fileSize = fileSize;
    }

    public String getFileMD5() {
        return fileMD5;
    }

    public void setFileMD5(String fileMD5) {
        this.fileMD5 = fileMD5;
    }

    public String getFileVersionNO() {
        return fileVersionNO;
    }

    public void setFileVersionNO(String fileVersionNO) {
        this.fileVersionNO = fileVersionNO;
    }
}

Next, simulate the business processing: create 30,000 file objects and sleep for 1 ms per object. I initially set the sleep to 1000 ms, but the run took so long that Eclipse appeared to hang, so I reduced it to 1 ms.
import java.util.ArrayList;
import java.util.List;

public class Test {
    private static List<FileInfo> fileList = new ArrayList<FileInfo>();

    public static void main(String[] args) throws InterruptedException {
        createFileInfo();
        long startTime = System.currentTimeMillis();
        // Process every file object one after another on the main thread
        for (FileInfo fi : fileList) {
            Thread.sleep(1);
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Single thread time-consuming: " + (endTime - startTime) + "ms");
    }

    // Build the 30,000 file objects used by the test
    private static void createFileInfo() {
        for (int i = 0; i < 30000; i++) {
            fileList.add(new FileInfo("Front photo of ID card", "jpg", "101522", "md5" + i, "1"));
        }
    }
}

The test results are as follows:
Processing the 30,000 file objects takes a long time, nearly 1 minute, so the efficiency is relatively low.
2) ThreadPool (thread pool) + CountDownLatch (countdown latch)
As the name suggests, CountDownLatch is a countdown counter used to coordinate threads. It works as follows: the latch is created with an initial count and passed to each task as a parameter. The main thread calls await() and blocks. When a task finishes, it calls countDown() to signal completion. Once countDown() has been called as many times as the initial count, await() returns and the main thread continues. The implementation is as follows:
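import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Test2 {
    private static final int THREAD_COUNT = 100;
    private static ExecutorService executor = Executors.newFixedThreadPool(THREAD_COUNT);
    // The latch count equals the number of submitted tasks
    private static CountDownLatch countDownLatch = new CountDownLatch(THREAD_COUNT);
    private static List<FileInfo> fileList = new ArrayList<FileInfo>();
    private static List<List<FileInfo>> list = new ArrayList<>();

    public static void main(String[] args) throws InterruptedException {
        createFileInfo();
        addList();
        long startTime = System.currentTimeMillis();
        int i = 0;
        for (List<FileInfo> fi : list) {
            executor.submit(new FileRunnable<>(countDownLatch, fi, i));
            i++;
        }
        // Block the main thread until the latch reaches zero
        countDownLatch.await();
        long endTime = System.currentTimeMillis();
        executor.shutdown();
        System.out.println(i + " threads take time: " + (endTime - startTime) + "ms");
    }

    private static void createFileInfo() {
        for (int i = 0; i < 30000; i++) {
            fileList.add(new FileInfo("front ID card photo", "jpg", "101522", "md5" + i, "1"));
        }
    }

    // Split the 30,000 files into 100 equal chunks, one per task, so the
    // total work matches the single-threaded test (the original added the
    // whole list 100 times)
    private static void addList() {
        int chunk = fileList.size() / THREAD_COUNT;
        for (int i = 0; i < THREAD_COUNT; i++) {
            list.add(fileList.subList(i * chunk, (i + 1) * chunk));
        }
    }
}

FileRunnable class: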
import java.util.List;
import java.util.concurrent.CountDownLatch;

/**
 * Multithreaded processing
 * @author wangsj
 *
 * @param <T>
 */
public class FileRunnable<T> implements Runnable {
    private CountDownLatch countDownLatch;
    private List<T> list;
    private int i;

    public FileRunnable(CountDownLatch countDownLatch, List<T> list, int i) {
        this.countDownLatch = countDownLatch;
        this.list = list;
        this.i = i;
    }

    @Override
    public void run() {
        try {
            for (T t : list) {
                try {
                    Thread.sleep(1);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        } finally {
            // Count down once per task, after all of its items are processed.
            // Counting down inside the loop would release the latch too early.
            countDownLatch.countDown();
        }
    }
}

The test results are as follows:
3) Fork/Join framework
The Fork/Join framework was introduced in JDK 7. Literally, fork means split and join means merge, and that is the idea of the framework: fork splits a task into subtasks, and join merges the results once the subtasks have been executed. For example, to compute 2+4+5+7 with Fork/Join, we split the sum into two subtasks, one computing 2+4 and the other computing 5+7. This is the fork step. When both subtasks finish, their results are added together to get the total. This is the join step.
How the Fork/Join framework executes: First, split the task. fork() divides a large task into subtasks, and the splitting continues, depending on the actual workload, until each subtask is small enough. Then, execute the tasks and merge the results. The subtasks are placed in different queues, and several worker threads take tasks from those queues and execute them. As subtasks complete, join() obtains their results, which are merged into the final result.
Using the Fork/Join framework involves a few classes; see the JDK API for details. A task must extend ForkJoinTask, but usually you only need to extend one of its subclasses: RecursiveTask for tasks that return a result, or RecursiveAction for tasks that do not. A ForkJoinTask is executed by a ForkJoinPool, which manages the worker threads and the task queues that the split subtasks are added to.
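The 2+4+5+7 example above can be sketched as a RecursiveTask. This is a minimal illustration, not the benchmark code; the class name SumTask, the sum helper, and the threshold of 2 elements are assumptions made for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical illustration: sums an int array by splitting it in halves.
class SumTask extends RecursiveTask<Integer> {
    private static final long serialVersionUID = 1L;
    private static final int THRESHOLD = 2; // ranges this small are summed directly

    private final int[] numbers;
    private final int from; // inclusive
    private final int to;   // exclusive

    SumTask(int[] numbers, int from, int to) {
        this.numbers = numbers;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Integer compute() {
        if (to - from <= THRESHOLD) {
            int sum = 0;
            for (int i = from; i < to; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(numbers, from, mid);
        SumTask right = new SumTask(numbers, mid, to);
        left.fork();                      // fork: run the left half asynchronously
        int rightSum = right.compute();   // compute the right half in this thread
        return left.join() + rightSum;    // join: wait for the left half and merge
    }

    static int sum(int[] numbers) {
        return ForkJoinPool.commonPool().invoke(new SumTask(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        // Splits into 2+4 and 5+7, then merges 6 + 12
        System.out.println(sum(new int[] {2, 4, 5, 7})); // prints 18
    }
}
```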
Here is the implementation code:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

public class Test3 {
    private static List<FileInfo> fileList = new ArrayList<FileInfo>();

    public static void main(String[] args) {
        createFileInfo();
        long startTime = System.currentTimeMillis();
        ForkJoinPool forkJoinPool = new ForkJoinPool(100);
        // Split the task
        Job<FileInfo> job = new Job<>(fileList.size() / 100, fileList);
        // Submit the task and return the result
        ForkJoinTask<Integer> fjtResult = forkJoinPool.submit(job);
        // Block until the task is done; join() returns the number of processed items
        // (the original busy-waited on isDone() and printed on every iteration)
        Integer processed = fjtResult.join();
        System.out.println("Task completed! Processed: " + processed);
        long endTime = System.currentTimeMillis();
        forkJoinPool.shutdown();
        System.out.println("fork/join framework time-consuming: " + (endTime - startTime) + "ms");
    }

    private static void createFileInfo() {
        for (int i = 0; i < 30000; i++) {
            fileList.add(new FileInfo("front ID card photo", "jpg", "101522", "md5" + i, "1"));
        }
    }
}

// Job.java (a separate file, since Job is also a public class)
import java.util.List;
import java.util.concurrent.RecursiveTask;

/**
 * Execute task class
 * @author wangsj
 *
 */
public class Job<T> extends RecursiveTask<Integer> {
    private static final long serialVersionUID = 1L;
    private int count;
    private List<T> jobList;

    public Job(int count, List<T> jobList) {
        this.count = count;
        this.jobList = jobList;
    }

    /**
     * Execute the task, similar to the run method that implements the Runnable interface
     */
    @Override
    protected Integer compute() {
        // Small enough: execute directly
        if (jobList.size() <= count) {
            executeJob();
            return jobList.size();
        } else {
            // Keep splitting until the task is small enough to execute.
            // Split into subtasks, here by halving the list.
            int countJob = jobList.size() / 2;
            List<T> leftList = jobList.subList(0, countJob);
            List<T> rightList = jobList.subList(countJob, jobList.size());
            // Assign tasks
            Job<T> leftJob = new Job<>(count, leftList);
            Job<T> rightJob = new Job<>(count, rightList);
            // Execute the tasks
            leftJob.fork();
            rightJob.fork();
            // Merge the results of both halves
            return leftJob.join() + rightJob.join();
        }
    }

    /**
     * Execute the task method
     */
    private void executeJob() {
        for (T job : jobList) {
            try {
                Thread.sleep(1);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}

The test results are as follows:
4) JDK 8 parallel streams
Parallel streams are one of the new features of JDK 8. The idea is to turn a sequentially executed stream into a concurrently executed one by calling parallel() (or parallelStream() on a collection). A parallel stream splits its data into multiple chunks, processes the chunks on different threads, and finally merges the results from each chunk, much like the Fork/Join framework.
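As a small illustration of the switch from sequential to parallel, the sketch below sums the same list both ways. The class name ParallelStreamSketch and its two helper methods are hypothetical; only stream(), parallelStream(), mapToInt() and sum() come from the JDK.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration: the same reduction run sequentially and in parallel.
class ParallelStreamSketch {

    // Processes the elements one after another on the calling thread.
    static int sumSequential(List<Integer> values) {
        return values.stream().mapToInt(Integer::intValue).sum();
    }

    // Splits the list into chunks, sums the chunks on several threads,
    // then merges the partial sums.
    static int sumParallel(List<Integer> values) {
        return values.parallelStream().mapToInt(Integer::intValue).sum();
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(2, 4, 5, 7);
        System.out.println("sequential: " + sumSequential(values));
        System.out.println("parallel:   " + sumParallel(values));
    }
}
```

Both calls return the same total; only the way the work is scheduled differs.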
By default, a parallel stream runs on the common pool, ForkJoinPool.commonPool(). Its thread count defaults to a value derived from the number of cores on the machine, and we can adjust it as needed. The thread count is set with a system property, which must be set before the common pool is first used:
System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "100");

The implementation code is very simple:
import java.util.ArrayList;
import java.util.List;

public class Test4 {
    private static List<FileInfo> fileList = new ArrayList<FileInfo>();

    public static void main(String[] args) {
        // System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "100");
        createFileInfo();
        long startTime = System.currentTimeMillis();
        // Process every file object on the common pool's worker threads
        fileList.parallelStream().forEach(e -> {
            try {
                Thread.sleep(1);
            } catch (InterruptedException f) {
                f.printStackTrace();
            }
        });
        long endTime = System.currentTimeMillis();
        System.out.println("jdk8 parallel streaming time: " + (endTime - startTime) + "ms");
    }

    private static void createFileInfo() {
        for (int i = 0; i < 30000; i++) {
            fileList.add(new FileInfo("front photo of ID card", "jpg", "101522", "md5" + i, "1"));
        }
    }
}

Now for the tests. In the first run, the thread pool size is not set and the default is used. The test results are as follows:
The result is not ideal; the run takes a long time. Next, set the thread pool size by adding the following line:

System.setProperty("java.util.concurrent.ForkJoinPool.common.parallelism", "100");

Running the test again gives the following results:
This time the run takes much less time, which is the result we want.
3. Summary
To sum up, taking the single-threaded run as the baseline, the slowest of the multi-threaded approaches is the native Fork/Join framework. Even though its thread pool size was configured, it performed worse than the JDK 8 parallel stream with a configured pool size. Parallel stream code is simple and easy to understand: there is no need to write extra for loops, a single parallelStream() call does all the work, and the amount of code is greatly reduced. Underneath, parallel streams still use the Fork/Join framework. The takeaway is that we need to use these techniques flexibly during development and understand the advantages and disadvantages of each, so that they serve us better.