Since I started working, I have written more and more code, my programs have grown more and more bloated, and they have become less and less efficient. For a programmer like me who pursues perfection, this is simply unacceptable. So, besides continually improving program structure, memory optimization and performance tuning have become my usual "tricks".
To optimize the memory use and performance of a Java program, you have to understand the internal principles of the virtual machine (or, more rigorously, its specification). A good book on the subject is "Inside the Java Virtual Machine (Second Edition)" by Bill Venners (Chinese edition translated by Cao Xiaogang and Jiang Jing); in fact, this article is my personal understanding of the Java virtual machine after reading it. Of course, the benefits of understanding the Java virtual machine are not limited to these two. From a deeper technical perspective, understanding the specification and implementation of the virtual machine helps us write efficient and stable Java code. For example, if we understand the memory model and the memory reclamation mechanism of the virtual machine, we will not rely on it too heavily, and will explicitly "release memory" when needed (Java code cannot free memory explicitly, but by dropping the references to an object we make it eligible for garbage collection), thereby reducing unnecessary memory consumption. If we understand how the Java stack works, we can reduce the risk of stack overflow by limiting recursion depth. Application developers may never work directly on the underlying implementation of a Java virtual machine, but this background knowledge will, to some degree, have a subtle and positive effect on the programs we write.
This article briefly explains the architecture and memory model of the Java virtual machine. If any wording is inappropriate or any explanation inaccurate, please do point it out. I would be honored!
Java Virtual Machine Architecture
Class loading subsystem
A Java virtual machine has two kinds of class loaders: the bootstrap class loader and user-defined class loaders.
The class loading subsystem loads a class into the runtime data area by its fully qualified name (package name plus class name; for classes loaded over the network, a URL as well). For each type that is loaded, the Java virtual machine creates an instance of java.lang.Class to represent it. Like all other objects, this instance lives in the heap, while the loaded type information is stored in the method area.
When loading a type, the class loading subsystem must not only locate and import the corresponding binary class file, but also verify the correctness of the imported class, allocate and initialize memory for class variables, and resolve symbolic references into direct references. These actions are carried out strictly in the following order:
1) Loading: find and load the binary data of the type;
2) Linking: perform verification, preparation and (optionally) resolution, which break down into the three steps below;
3) Verification: ensure the correctness of the imported type;
4) Preparation: allocate memory for class variables and initialize them to default values;
5) Resolution: convert the symbolic references in the type into direct references.
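To make the order above a little more concrete, here is a minimal sketch. The class names LoadPhasesDemo and Lazy are made up for illustration. It uses the standard Class.forName(name, initialize, loader) overload to load a nested class without running its static initializer, and then asks for the class again with initialization enabled. Running the static initializer is a later phase than the steps listed above; it appears here only because it is the easiest way to observe that loading alone does not execute any code of the class.

```java
// A sketch, not a description of any particular JVM's internals.
final class LoadPhasesDemo {

    // Set by Lazy's static initializer so we can observe when it runs.
    static volatile boolean lazyInitialized = false;

    static final class Lazy {
        static {
            lazyInitialized = true;
        }
    }

    /** Loads Lazy with initialize=false and reports whether its initializer has run. */
    static boolean initializedAfterLoadOnly() {
        load(false);
        return lazyInitialized;
    }

    /** Requests Lazy with initialize=true and reports whether its initializer has run. */
    static boolean initializedAfterInit() {
        load(true);
        return lazyInitialized;
    }

    private static void load(boolean initialize) {
        try {
            // The class literal Lazy.class does not trigger initialization by itself.
            Class.forName(Lazy.class.getName(), initialize, Lazy.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("after load only:      " + initializedAfterLoadOnly());
        System.out.println("after initialization: " + initializedAfterInit());
    }
}
```

In a fresh JVM the first line should report false and the second true, because the static initializer only runs once initialization is requested.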
Method area
For each type loaded by the class loading subsystem, the virtual machine saves the following data to the method area:
1. The fully qualified name of the type
2. The fully qualified name of the type's superclass (java.lang.Object has no superclass)
3. Whether the type is a class or an interface
4. The type's access modifiers
5. An ordered list of the fully qualified names of any direct superinterfaces
In addition to the above basic type information, the following information will also be saved:
6. Type constant pool
7. Field information (including field name, field type, field modifier)
8. Method information (including method name, return type, number and types of parameters, and method modifiers. If the method is neither abstract nor native, the method's bytecode, the sizes of the operand stack and of the local variable area in the method's stack frame, and the exception table are also saved)
9. All class variables except constants (these are the static variables of the class. Because static variables are shared by all instances and belong to the type itself, they are class-level variables and are saved in the method area as members of the class)
10. A reference to the ClassLoader
// Returns the ClassLoader reference that was just saved
String.class.getClassLoader();
11. A reference to the Class class
// Returns the reference to the Class instance that was just saved
String.class;
Note that the method area can also be recycled by the garbage collector.
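Much of the type information listed above can be read back through the standard reflection API. The following sketch (the class name TypeInfoDemo and its helper methods are hypothetical) prints a few of these items for a given type. Note that for types loaded by the bootstrap class loader, such as String, getClassLoader() returns null.

```java
import java.lang.reflect.Modifier;

// A sketch that reads type information back via java.lang.Class.
final class TypeInfoDemo {

    /** The bootstrap class loader is represented by a null reference. */
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    /** Summarizes name, superclass, kind, modifiers and superinterface count. */
    static String describe(Class<?> type) {
        Class<?> superclass = type.getSuperclass();
        return type.getName()
                + " | superclass=" + (superclass == null ? "none" : superclass.getName())
                + " | interface=" + type.isInterface()
                + " | modifiers=" + Modifier.toString(type.getModifiers())
                + " | superinterfaces=" + type.getInterfaces().length
                + " | bootstrap=" + loadedByBootstrap(type);
    }

    public static void main(String[] args) {
        System.out.println(describe(String.class));
        System.out.println(describe(Object.class));
        System.out.println(describe(TypeInfoDemo.class));
    }
}
```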
Heap
All class instances and arrays created by a running Java program are placed in a single heap. Each Java virtual machine instance has exactly one heap, shared by all of its threads (which is why multi-threaded Java programs run into synchronization problems when accessing objects).
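Because every thread sees the same heap, an object reachable from several threads needs coordinated access. The sketch below (SharedHeapDemo is an illustrative name, not something from the book) lets several threads increment one shared counter object and uses synchronized methods so that no update is lost.

```java
// A sketch: one heap object shared by several threads, guarded by its monitor.
final class SharedHeapDemo {

    private int count = 0;

    synchronized void increment() {
        count++;
    }

    synchronized int get() {
        return count;
    }

    /** Starts the given number of threads, each incrementing the same counter. */
    static int runThreads(int threads, int incrementsPerThread) {
        SharedHeapDemo shared = new SharedHeapDemo(); // lives on the shared heap
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    shared.increment();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // 40000
    }
}
```

Without the synchronized keyword on increment(), the final value could be lower than expected, since count++ is a read-modify-write sequence rather than a single atomic step.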
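Because every Java virtual machine implements the specification in its own way, we cannot know exactly how a given virtual machine represents object instances in the heap. Two possible designs give a general idea: a reference may point to a handle that in turn points to the object's instance data and its class data, or it may point directly to the instance data, which then carries a pointer to the class data.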
Program Counter
In a running Java program, each thread has its own PC (program counter) register. It is created when the thread starts, is one word in size, and holds the location of the next instruction to be executed.
Java Stack
Each thread has a Java stack, which stores the thread's running state in units of stack frames. The virtual machine performs only two kinds of operations on the Java stack, push and pop, and both work on whole frames. A stack frame holds data such as incoming parameters, local variables and intermediate results; it is popped and released when the method completes.
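A typical illustration is the stack frame of a method that adds two local variables: the two values are loaded from the local variable area onto the operand stack, added, and the result is stored back into a local variable.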
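As a rough sketch of that addition, the method below adds two local variables. The comments show the bytecode that javac typically emits for a static method like this one; the exact instructions are an assumption and can be checked with javap -c. StackFrameDemo is an illustrative name.

```java
// A sketch of how a simple addition uses the local variable area and operand stack.
final class StackFrameDemo {

    static int addLocals() {
        int a = 1;     // typically: iconst_1, istore_0  (push 1, store in local slot 0)
        int b = 2;     // typically: iconst_2, istore_1  (push 2, store in local slot 1)
        int c = a + b; // typically: iload_0, iload_1, iadd, istore_2
        return c;      // typically: iload_2, ireturn
    }

    public static void main(String[] args) {
        // Calling addLocals() pushes a new frame; returning pops it again.
        System.out.println(addLocals()); // 3
    }
}
```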
Native method stack
This is the stack used when Java calls native libraries of the operating system. It is used to implement JNI (Java Native Interface).
Execution Engine
The execution engine is the core of the Java virtual machine and controls the loading and interpretation of Java bytecode. In a running Java program, each thread is an instance of an independent virtual machine execution engine. From the beginning to the end of its life cycle, a thread is either executing bytecode or executing native methods.
Native interface
Connects the native method stack with the operating system libraries.
Note: Wherever this article mentions the "Java virtual machine", it refers to the "Java virtual machine specification for the JavaEE and JavaSE platforms".
Virtual machine memory optimization practice
Since we are talking about memory, memory leaks have to be mentioned. As we all know, Java grew out of C++, and a major problem with C++ programs is that memory leaks are hard to solve. The JVM has its own garbage collection mechanism to reclaim memory, so in many cases Java developers do not need to worry much, but Java programs can still leak, just less severely than C++ programs. The typical case is an object that is still referenced but no longer useful: if the program holds a reference to an object that it will not or cannot use again, the memory the object occupies is wasted.
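A well-known instance of "referenced but useless" is an array-backed stack that forgets to clear popped slots. The sketch below (LeakFreeStack and the slotCleared helper are illustrative names) shows the fix: the slot is set to null on pop, so the stack no longer keeps the popped object reachable.

```java
import java.util.Arrays;
import java.util.EmptyStackException;

// A sketch of avoiding an obsolete reference in an array-backed stack.
final class LeakFreeStack {

    private Object[] elements = new Object[16];
    private int size = 0;

    void push(Object element) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, 2 * size);
        }
        elements[size++] = element;
    }

    Object pop() {
        if (size == 0) {
            throw new EmptyStackException();
        }
        Object result = elements[--size];
        elements[size] = null; // without this line the array would still reference the object
        return result;
    }

    int size() {
        return size;
    }

    /** For demonstration only: reports whether the given array slot holds no reference. */
    boolean slotCleared(int index) {
        return elements[index] == null;
    }

    public static void main(String[] args) {
        LeakFreeStack stack = new LeakFreeStack();
        stack.push("a");
        stack.push("b");
        System.out.println(stack.pop());          // b
        System.out.println(stack.slotCleared(1)); // true
    }
}
```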
Let's first look at how GC works: it monitors the state of every object, including its allocation, the references it makes, the references made to it, assignments and so on. When an object is no longer referenced, it is released (GC is not the focus of this article, so it is not covered in detail). Many Java programmers rely too heavily on GC, but no matter how good the JVM's garbage collection mechanism is, memory is always a limited resource. So even though GC does most of the garbage collection for us, it is still worth paying appropriate attention to memory optimization while coding. This reduces the number of GC runs, improves memory utilization and helps the program run as efficiently as possible.
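The reachability rule can be observed, to a degree, with a java.lang.ref.WeakReference. This is only a sketch with made-up names (ReachabilityDemo and its methods): as long as a strong reference exists the object must survive, while after the last strong reference is dropped the collector may reclaim it. System.gc() is called purely for demonstration and is only a request, so the second result is not guaranteed.

```java
import java.lang.ref.WeakReference;

// A sketch of reachability; the collector's timing is up to the JVM.
final class ReachabilityDemo {

    /** While a strong reference exists, the object survives any collection. */
    static boolean survivesWhileReferenced() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        System.gc(); // a request only
        // payload is still used here, so the object stays strongly reachable.
        return weak.get() == payload;
    }

    /** After the only strong reference is dropped, the object MAY be reclaimed. */
    static boolean reclaimedAfterRelease() {
        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        payload = null; // release the only strong reference
        System.gc();    // the JVM is free to ignore this hint
        return weak.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("survives while referenced: " + survivesWhileReferenced());
        System.out.println("reclaimed after release:   " + reclaimedAfterRelease());
    }
}
```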
Overall, memory optimization for the Java virtual machine should start from two sides: the virtual machine and the Java application. The former means using virtual machine parameters to size the virtual machine's logical memory areas according to the design of the application, so that the virtual machine's memory matches the program's memory requirements. The latter means optimizing the program's algorithms to reduce the burden on the GC and improve the success rate of GC reclamation.
The parameters for tuning virtual machine memory are as follows:
-Xms: Initial heap size
-Xmx: Maximum Java heap size
-Xmn: Heap size of the young generation
-Xss: Stack size for each thread
The above are the more commonly used parameters. Some others:
-XX:MinHeapFreeRatio=40: Minimum percentage of heap free after GC to avoid expansion.
-XX:MaxHeapFreeRatio=70: Maximum percentage of heap free after GC to avoid shrinking.
-XX:NewRatio=2: Ratio of old/new generation sizes. [Sparc -client: 8; x86 -server: 8; x86 -client: 12]
-XX:NewSize=2.125m: Default size of new generation (in bytes) [5.0 and newer: 64 bit VMs are scaled 30% larger; x86: 1m; x86, 5.0 and older: 640k]
-XX:MaxNewSize=: Maximum size of new generation (in bytes). Since 1.4, MaxNewSize is computed as a function of NewRatio.
-XX:SurvivorRatio=25: Ratio of eden/survivor space size [Sparc in 1.3.1: 25; other Solaris platforms in 5.0 and earlier: 32]
-XX:PermSize=: Initial size of permanent generation
-XX:MaxPermSize=64m: Size of the Permanent Generation. [5.0 and newer: 64 bit VMs are scaled 30% larger; 1.4 amd64: 96m; 1.3.1 -client: 32m.]
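One simple way to check the effect of the heap parameters is to ask the running virtual machine for its current figures through java.lang.Runtime. The sketch below uses a made-up class name, HeapSettingsDemo, and the sizes in the usage line are arbitrary examples rather than recommendations.

```java
// A sketch that reports the heap figures visible through java.lang.Runtime.
final class HeapSettingsDemo {

    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Describes the currently committed, free and maximum heap sizes in megabytes. */
    static String describeHeap() {
        Runtime runtime = Runtime.getRuntime();
        return "committed=" + toMegabytes(runtime.totalMemory()) + "m"
                + ", free=" + toMegabytes(runtime.freeMemory()) + "m"
                + ", max=" + toMegabytes(runtime.maxMemory()) + "m";
    }

    public static void main(String[] args) {
        // Try, for example: java -Xms64m -Xmx256m HeapSettingsDemo
        System.out.println(describeHeap());
    }
}
```

The committed figure should start near the -Xms value and the max figure near the -Xmx value, although the numbers a virtual machine reports can differ somewhat from the flags.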
The following suggestions for improving memory utilization and reducing memory risk by optimizing program code are purely empirical and for reference only. If anything is inappropriate, please correct me. Thank you!
1. Release references to useless objects as early as possible (XX = null;)
Consider this piece of code:
public List<PageData> parse(HtmlPage page) {
    List<PageData> list = null;
    try {
        List valueList = page.getByXPath(config.getContentXpath());
        if (valueList == null || valueList.isEmpty()) {
            return list;
        }
        // Create objects only when they are needed, to save memory and improve efficiency
        list = new ArrayList<PageData>();
        PageData pageData = new PageData();
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < valueList.size(); i++) {
            HtmlElement content = (HtmlElement) valueList.get(i);
            DomNodeList<HtmlElement> imgs = content.getElementsByTagName("img");
            if (imgs != null && !imgs.isEmpty()) {
                for (HtmlElement img : imgs) {
                    try {
                        HtmlImage image = (HtmlImage) img;
                        String path = image.getSrcAttribute();
                        String format = path.substring(path.lastIndexOf("."), path.length());
                        String localPath = "D:/images/" + MD5Helper.md5(path).replace("//", ",").replace("/", ",") + format;
                        File localFile = new File(localPath);
                        if (!localFile.exists()) {
                            localFile.createNewFile();
                            image.saveAs(localFile);
                        }
                        image.setAttribute("src", "file:////" + localPath);
                        localFile = null;
                        image = null;
                        img = null;
                    } catch (Exception e) {
                        // Exceptions are silently ignored here, as in the original code
                    }
                }
                // This object will not be used again; clearing the reference tells the GC in advance that it can be reclaimed
                imgs = null;
            }
            String text = content.asXml();
            value.append(text).append("<br/>");
            content = null;
            text = null;
        }
        // Moved out of the loop: clearing valueList inside the loop made the next valueList.size() call throw a NullPointerException
        valueList = null;
        pageData.setContent(value.toString());
        pageData.setCharset(page.getPageEncoding());
        list.add(pageData);
        // pageData = null; would be useless here, because list still holds a reference to the object and the GC will not reclaim it
        value = null;
        // list = null; must not appear here, because list is the method's return value; otherwise the caller would always get null, and this kind of bug is hard to spot and track down
    } catch (Exception e) {
        // Exceptions are silently ignored here, as in the original code
    }
    return list;
}
2. Use collection data types such as arrays, trees, graphs and linked lists with care. These data structures are more complicated for the GC to reclaim.
3. Avoid explicitly allocating array space. When you have to, estimate a reasonable size as accurately as possible.
4. Avoid creating and initializing large numbers of objects in a class's default constructor, so that memory is not wasted unnecessarily every time the constructor is called.
5. Avoid forcing the system to collect garbage (for example by calling System.gc()), since this lengthens the time the system ultimately spends on garbage collection.
6. When developing remote method invocation applications, use transient variables wherever possible, unless the remote caller needs the value of that variable.
7. Use object pooling in appropriate scenarios to improve system performance.
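Three of the tips above are easy to show in a few lines: tip 3 (estimating sizes), tip 4 (keeping constructors cheap) and tip 7 (object pooling). The following is only a sketch under my own assumptions. The names TuningTipsDemo, Report and BuilderPool are hypothetical, the pool is single-threaded, and StringBuilder is just a stand-in for an object that is expensive to create.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// A sketch of tips 3, 4 and 7; not a production-ready design.
final class TuningTipsDemo {

    // Tip 3: pass the estimated size so the backing array is allocated once.
    static List<Integer> squares(int n) {
        List<Integer> result = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            result.add(i * i);
        }
        return result;
    }

    // Tip 4: the constructor stays cheap; the large buffer is created on first use.
    static final class Report {
        private byte[] buffer; // deliberately not allocated in the constructor

        boolean bufferAllocated() {
            return buffer != null;
        }

        byte[] buffer() {
            if (buffer == null) {
                buffer = new byte[1024 * 1024];
            }
            return buffer;
        }
    }

    // Tip 7: a tiny single-threaded pool that hands out reusable objects.
    static final class BuilderPool {
        private final Deque<StringBuilder> idle = new ArrayDeque<>();

        StringBuilder acquire() {
            StringBuilder builder = idle.poll();
            return builder != null ? builder : new StringBuilder();
        }

        void release(StringBuilder builder) {
            builder.setLength(0); // reset state before the next user sees it
            idle.push(builder);
        }
    }

    public static void main(String[] args) {
        System.out.println(squares(5)); // [0, 1, 4, 9, 16]

        Report report = new Report();
        System.out.println(report.bufferAllocated()); // false
        report.buffer();
        System.out.println(report.bufferAllocated()); // true

        BuilderPool pool = new BuilderPool();
        StringBuilder first = pool.acquire();
        pool.release(first);
        System.out.println(first == pool.acquire()); // true
    }
}
```

Whether pooling pays off depends on how costly the pooled object is to create. For cheap objects a pool can easily cost more than it saves.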