1. Preface
Java is a cross-platform, object-oriented, high-level programming language. Java programs run on a Java virtual machine (JVM), and the JVM manages their memory; this is the biggest difference from C++. Even though the JVM manages memory for us, we still need to understand how it does so. There is not just one JVM: dozens of virtual machines may exist today, but any implementation that complies with the standard must follow the "Java Virtual Machine Specification". This article is based on the HotSpot virtual machine, and differences from other virtual machines are mentioned where they matter. It mainly describes how memory is divided in the JVM, how the objects of a Java program are stored and accessed, and which exceptions can occur in each memory area.
2. Memory areas in the JVM
When executing a Java program, the JVM divides the memory it manages into several different data areas. These areas have different purposes and different creation and destruction times. Some exist for as long as the JVM process runs, while others are tied to the life cycle of a user thread (a thread of the program itself). According to the JVM specification, the memory managed by the JVM is divided into the following runtime data areas:
1. Virtual machine stack
This memory area is private to a thread: it is created when the thread starts and destroyed when the thread ends. The virtual machine stack describes the memory model of Java method execution: every method creates a stack frame (Stack Frame) when it is invoked, which stores the local variable table, the operand stack, dynamic linking information, the method return address and so on. Each method call, from invocation to return, corresponds to one stack frame being pushed onto and popped off the virtual machine stack.
As the name suggests, the local variable table is the memory area that stores local variables. It holds the values known at compile time: the basic data types (the 8 Java primitive types), object references, and the returnAddress type. The 64-bit long and double types occupy 2 local variable slots, and every other type occupies 1. Since the size of each type is fixed and the number of variables is known at compile time, the size of the local variable table is already known when the frame is created; the space it needs is fully determined at compile time and does not change while the method runs.
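The slot rule can be made concrete with a small calculation. The following is only an illustrative sketch: the class `LocalVariableSlots` and its methods are hypothetical helpers, not part of any JVM API, and the calculation covers only the receiver and the parameters of a method, not the locals declared in its body.

```java
class LocalVariableSlots {

    /** Slots for one value: long and double take 2, every other type takes 1. */
    static int slotsFor(Class<?> type) {
        return (type == long.class || type == double.class) ? 2 : 1;
    }

    /**
     * Slots taken by the receiver (if any) plus the parameters of a method.
     * Locals declared inside the method body are not counted here.
     */
    static int parameterSlots(boolean isStatic, Class<?>... parameterTypes) {
        // An instance method receives `this` in slot 0; a static method does not.
        int slots = isStatic ? 0 : 1;
        for (Class<?> type : parameterTypes) {
            slots += slotsFor(type);
        }
        return slots;
    }

    public static void main(String[] args) {
        // static void f(int, long, double, Object): 1 + 2 + 2 + 1
        System.out.println(parameterSlots(true, int.class, long.class, double.class, Object.class));
        // void g(long) on an instance: 1 (this) + 2
        System.out.println(parameterSlots(false, long.class));
    }
}
```

The real slot counts of a compiled method can be checked against the `locals` value that `javap -v` prints for it.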
The virtual machine specification defines two error conditions for this memory area:
1. If the stack depth requested by the thread is greater than the depth allowed by the virtual machine, a StackOverflowError is thrown;
2. If the virtual machine stack can be expanded dynamically and the expansion cannot obtain enough memory, an OutOfMemoryError is thrown.
2. Native method stack
The native method stack is also thread-private, and its role is almost the same as that of the virtual machine stack: the virtual machine stack serves the execution of Java methods (pushing and popping frames), while the native method stack serves the Native methods executed by the virtual machine.
The virtual machine specification does not mandate how the native method stack is implemented, so each virtual machine is free to implement it as it sees fit. The HotSpot virtual machine simply combines the virtual machine stack and the native method stack into one; interested readers can look up how other virtual machines implement it.
Like the virtual machine stack, the native method stack can throw StackOverflowError and OutOfMemoryError.
3. Program counter
The program counter is also a thread-private memory area. It can be regarded as a line number indicator (pointing to an instruction) for the bytecode executed by the current thread. While Java code runs, the next instruction to execute is selected by changing the value of this counter; branches, loops, jumps, exception handling, thread resumption and so on all depend on it. Multithreading in the virtual machine is achieved by switching threads in turn and allocating processor time to them. At any moment a processor (or one core of a multi-core processor) executes the instructions of only one thread, so after a thread switch the thread must be resumed at the correct execution position. For this reason each thread has its own independent program counter.
When a Java method is being executed, the program counter records (points to) the address of the bytecode instruction that the current thread is executing. If a Native method is being executed, the value of the counter is undefined. This is because the HotSpot thread model is a native thread model, that is, each Java thread maps directly to a thread of the OS (operating system); a Native method is executed directly by the OS, so the value of the virtual machine's counter is of no use. The counter is a very small, thread-private memory area that never needs to be expanded, and it is the only area for which the virtual machine specification does not specify any OutOfMemoryError condition.
4. Heap memory (Heap)
The Java heap is a memory area shared by all threads. It is usually the largest memory area managed by the virtual machine, and it is created when the virtual machine starts. The Java heap mainly stores object instances: almost all object instances (including arrays) are stored here. It is therefore also the main area handled by garbage collection (GC). GC itself is not described here.
According to the virtual machine specification, the Java heap may be located in physically discontiguous memory as long as it is logically contiguous. Its size may be fixed or expandable. If the heap does not have enough space to complete an instance allocation and cannot be expanded any further, an OutOfMemoryError is thrown.
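As a rough way to observe the heap limits from inside a program, the standard `Runtime` class reports the maximum, currently reserved, and free heap sizes. This is a minimal sketch; the class name `HeapSizeReport` is made up for the example, and the numbers it prints depend on the JVM and its startup parameters.

```java
class HeapSizeReport {

    /** Returns {max, total, free} heap sizes in bytes as reported by the runtime. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.maxMemory(), rt.totalMemory(), rt.freeMemory() };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] s = snapshot();
        System.out.println("max heap the JVM will try to use: " + s[0] / mb + " MB");
        System.out.println("heap currently reserved:          " + s[1] / mb + " MB");
        System.out.println("free within the reserved heap:    " + s[2] / mb + " MB");
    }
}
```

When the reserved size is already equal to the maximum and the free part can no longer satisfy an allocation, the heap is in the situation described above.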
5. Method area
Like the heap, the method area is a memory area shared by all threads. It stores data for classes that the virtual machine has loaded: type information, constants, static variables, code compiled by the just-in-time (JIT) compiler, and so on. The virtual machine specification places few restrictions on how the method area is implemented. Like the heap, it does not need physically contiguous memory, its size can be fixed or expandable, and an implementation may even choose not to garbage-collect it. When the method area cannot satisfy a memory allocation request, an OutOfMemoryError is thrown.
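How a given JVM actually divides heap and non-heap memory can be inspected through the standard `java.lang.management` API. The sketch below only lists pool names; the names themselves (for example a permanent generation or code cache pool) are implementation-specific, and `MemoryPoolReport` is a hypothetical class name used for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolReport {

    /** Names of the memory pools of the given type (HEAP or NON_HEAP). */
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The pool names differ between JVM implementations and versions.
        System.out.println("heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```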
6. Direct memory
Direct memory is not part of the memory managed by the virtual machine, but it may still be used frequently. When a Java program uses Native methods (for example through NIO, which is not described here), memory may be allocated directly outside the heap. The total memory of the machine is still limited, however, so direct memory can also run out, in which case an OutOfMemoryError is thrown as well.
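The difference between an ordinary buffer and an off-heap buffer is visible through the standard NIO `ByteBuffer` API. This is a small sketch under the assumption that NIO direct buffers are the source of direct memory in question; `DirectBufferDemo` is a made-up class name.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {

    /** A buffer backed by a byte[] that lives in the Java heap. */
    static ByteBuffer heapBuffer(int size) {
        return ByteBuffer.allocate(size);
    }

    /** A buffer whose storage is allocated outside the Java heap. */
    static ByteBuffer directBuffer(int size) {
        return ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        System.out.println("heap buffer is direct:   " + heapBuffer(1024).isDirect());
        System.out.println("direct buffer is direct: " + directBuffer(1024).isDirect());
    }
}
```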
2. Storing and accessing instance objects
The previous section gave a general description of each memory area of the virtual machine. For every area there are questions of how data is created, laid out and accessed. Let us take the most commonly used area, the heap, as an example and discuss these three aspects based on HotSpot.
1. Instance object creation
When the virtual machine executes a new instruction, it first looks up the symbolic reference of the class to instantiate in the constant pool and checks whether that class has already been loaded and initialized. If not, the class loading and initialization process is executed first (class loading is not described here). If the class cannot be found, loading fails with the familiar ClassNotFoundException (for a new instruction this usually surfaces as a NoClassDefFoundError caused by it).
Once the class loading check has passed, memory in the heap is actually allocated for the object. The amount of memory an object needs is determined by its class and is fixed once the class has been loaded; allocating memory for the object amounts to carving a block of that size out of the heap and assigning it to the object.
Depending on whether the free memory is contiguous (that is, whether allocated and unallocated memory form two separate, complete parts), there are two ways to allocate memory:
1. Contiguous memory: a pointer marks the boundary between allocated and unallocated memory. Allocating an object only requires moving the pointer into the unallocated part by the size of the object. This method is called "bump the pointer" (literally "pointer collision").
2. Non-contiguous memory: the virtual machine has to maintain a list that records which memory blocks in the heap are free. When allocating an object, it picks a block of suitable size from the list, assigns it to the object and updates the list. This method is called "free list".
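The two strategies above can be sketched with a toy model in which the "heap" is just a range of integer offsets. This is not how HotSpot is implemented; `ToyAllocators` and its nested classes are hypothetical, and the free list here uses a simple first-fit search as one possible policy.

```java
import java.util.ArrayList;
import java.util.List;

class ToyAllocators {

    /** Contiguous free space: a single pointer separates used from unused memory. */
    static class BumpAllocator {
        private final int capacity;
        private int top = 0; // everything below `top` is allocated

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        /** Returns the start offset of the new block, or -1 if it does not fit. */
        int allocate(int size) {
            if (size <= 0 || size > capacity - top) {
                return -1;
            }
            int start = top;
            top += size; // move the pointer by the object size
            return start;
        }
    }

    /** Fragmented free space: a list of free blocks, each stored as {start, length}. */
    static class FreeListAllocator {
        private final List<int[]> freeBlocks = new ArrayList<int[]>();

        FreeListAllocator(int[][] blocks) {
            for (int[] block : blocks) {
                freeBlocks.add(new int[] { block[0], block[1] });
            }
        }

        /** First fit: take the first free block that is large enough, or return -1. */
        int allocate(int size) {
            for (int i = 0; i < freeBlocks.size(); i++) {
                int[] block = freeBlocks.get(i);
                if (size > 0 && block[1] >= size) {
                    int start = block[0];
                    block[0] += size; // shrink the free block from the front
                    block[1] -= size;
                    if (block[1] == 0) {
                        freeBlocks.remove(i); // block fully used, drop it from the list
                    }
                    return start;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(100);
        System.out.println(bump.allocate(40) + " " + bump.allocate(40) + " " + bump.allocate(40));

        FreeListAllocator list = new FreeListAllocator(new int[][] { { 0, 8 }, { 32, 64 } });
        System.out.println(list.allocate(16) + " " + list.allocate(8));
    }
}
```

The bump allocator does constant work per allocation but needs contiguous free space, while the free list works on a fragmented heap at the cost of a search and list maintenance.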
Object allocation also runs into concurrency problems. The virtual machine has two solutions to this thread-safety problem. The first is to use CAS (compare-and-swap) with retry on failure to make the allocation operation atomic. The second is to separate allocation by thread: each thread is pre-allocated a small thread-private block of memory in the heap, called the thread-local allocation buffer (TLAB). When a thread needs memory, it allocates directly from its own TLAB; synchronization is needed only when the TLAB is used up and a new TLAB has to be allocated from the heap. This effectively reduces contention between threads when allocating objects in the heap. Whether the virtual machine uses TLABs is controlled by the JVM parameter -XX:+/-UseTLAB.
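The CAS-with-retry idea can be illustrated at the Java level with `AtomicLong.compareAndSet`. The sketch below is an analogy, not HotSpot's actual allocation code: `CasBumpAllocator` and `runDemo` are hypothetical names, and the shared "heap" is just a counter of allocated bytes.

```java
import java.util.concurrent.atomic.AtomicLong;

class CasBumpAllocator {
    private final long capacity;
    private final AtomicLong top = new AtomicLong(0); // shared allocation pointer

    CasBumpAllocator(long capacity) {
        this.capacity = capacity;
    }

    /** Returns the start offset of the new block, or -1 if the space is exhausted. */
    long allocate(long size) {
        while (true) {
            long current = top.get();
            long next = current + size;
            if (next > capacity) {
                return -1;
            }
            // Succeeds only if no other thread moved the pointer in the meantime.
            if (top.compareAndSet(current, next)) {
                return current;
            }
            // Otherwise another thread won the race: read the pointer again and retry.
        }
    }

    long used() {
        return top.get();
    }

    /** Lets several threads allocate concurrently and returns the total space used. */
    static long runDemo(int threadCount, final int allocationsPerThread, final long size) {
        final CasBumpAllocator heap = new CasBumpAllocator(1L << 30);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int j = 0; j < allocationsPerThread; j++) {
                        heap.allocate(size);
                    }
                }
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return heap.used();
    }

    public static void main(String[] args) {
        // 4 threads x 1000 allocations x 16 bytes: no update may be lost.
        System.out.println(runDemo(4, 1000, 16));
    }
}
```

A TLAB avoids even this retry loop in the common case, because a thread that owns its buffer can move a private pointer without any atomic operation.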
After memory allocation is complete, the virtual machine initializes the allocated memory (excluding the object header) to zero values. This guarantees that the instance fields of the object can be used in Java code without an explicit assignment, and that they then hold the zero value of their data type. Finally, the <init> method is executed to initialize the object as the program's code specifies, and only then is the creation of the instance complete.
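The order "zero first, then <init>" can be observed from plain Java code. In the sketch below (all class names are made up for the example), a superclass constructor calls a method that the subclass overrides, so the subclass field is read before its own initializer has run.

```java
class ZeroValueDemo {

    static class Base {
        Base() {
            record(); // runs before the subclass's field initializers
        }

        void record() { }
    }

    static class Derived extends Base {
        int value = 42;   // assigned by Derived's constructor code, after Base() returns
        String seenEarly; // no initializer, so the value written by record() is kept

        @Override
        void record() {
            seenEarly = "value=" + value; // `value` still holds its zero value here
        }
    }

    // Fields without initializers keep the zero value of their type.
    int count;
    boolean flag;
    double ratio;
    Object ref;

    public static void main(String[] args) {
        ZeroValueDemo d = new ZeroValueDemo();
        System.out.println(d.count + " " + d.flag + " " + d.ratio + " " + d.ref);

        Derived derived = new Derived();
        System.out.println(derived.seenEarly + ", then value=" + derived.value);
    }
}
```

Calling overridable methods from a constructor is used here only to expose the intermediate state; it is generally something to avoid in real code.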
2. The layout of objects in memory
In the HotSpot virtual machine, an object in memory consists of three parts: the object header, the instance data, and alignment padding.
The object header has two parts. The first part stores the object's own runtime data, including the hash code, the GC generational age, the lock state, the lock held by a thread, the biased thread ID, the bias timestamp and so on. In 32-bit and 64-bit virtual machines this part occupies 32 bits and 64 bits respectively. Because there is more runtime data than 32 or 64 bits can hold at once, this part does not use a fixed layout; instead the same bits store different data depending on the state of the object. The second part is the type pointer, which points to the class of the object. This pointer is not mandatory: locating an object's class metadata does not necessarily have to go through the object itself (this is discussed below).
The instance data is the content of the fields defined for the object. These fields are not necessarily stored in the order in which the program defines them; the order is determined by the virtual machine's allocation strategy together with the definition order. The allocation order is long/double, int, short/char, byte/boolean, oops (Ordinary Object Pointers). As this shows, fields are grouped by width, so fields of the same width are allocated together; subject to these conditions, fields defined in a parent class come before those of the subclass.
The padding part does not necessarily exist; it serves only as a placeholder for alignment. HotSpot manages memory in units of 8 bytes, so when the size of an object is not a multiple of 8, padding is added to round it up.
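The rounding to a multiple of 8 bytes is simple arithmetic. The sketch below shows it with hypothetical sizes: the 12-byte header used in `main` is an assumption chosen for illustration only, since the real header size depends on the JVM and its settings, and `ObjectAlignment` is a made-up class name.

```java
class ObjectAlignment {
    static final int ALIGNMENT = 8; // allocation unit in bytes, as described above

    /** Rounds a size up to the next multiple of ALIGNMENT. */
    static long align(long size) {
        return (size + ALIGNMENT - 1) & ~(long) (ALIGNMENT - 1);
    }

    /** Number of padding bytes needed to reach the aligned size. */
    static long padding(long size) {
        return align(size) - size;
    }

    public static void main(String[] args) {
        long header = 12;             // assumed header size, for illustration only
        long fields = 4 + 1;          // one int and one byte
        long raw = header + fields;   // 17 bytes before padding
        System.out.println("raw=" + raw + " aligned=" + align(raw) + " padding=" + padding(raw));
    }
}
```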
3. Object access
In a Java program, when we create an object we actually obtain a variable of reference type, and it is through this reference that we operate on the instance in the heap. The virtual machine specification only states that a reference type is a reference to an object; it does not specify how the reference locates and accesses the instance in the heap. In mainstream virtual machines today, there are two main ways to implement object access:
1. Handles: a region of the heap is set aside as a handle pool. The reference variable stores the address of the object's handle, and the handle stores the actual addresses of the instance data and of the object's type data. With this approach the object header does not need to contain the object type.
2. Direct pointers: the reference variable directly stores the address of the instance in the heap, but this requires that the layout of the instance itself contain the information needed to find the object type.
Each of the two access methods has its own advantages. With handles, when an object's address changes (memory compaction, garbage collection), the reference variables do not need to change; only the object address stored in the handle is updated. With direct pointers, all references to the object have to be modified. On the other hand, direct pointers save one level of indirection on every access, and since object access is extremely frequent, this advantage adds up. The HotSpot virtual machine uses direct pointers.
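The trade-off between the two access methods can be shown with a toy model in which addresses are plain numbers. This is a deliberately simplified sketch, not a description of any real virtual machine: `ObjectAccessModel`, `HandlePool` and `moveDirect` are hypothetical names, and the return values simply count how many stored addresses had to be rewritten when an object moved.

```java
import java.util.ArrayList;
import java.util.List;

class ObjectAccessModel {

    /** Handle access: a reference stores a handle index, the handle stores the address. */
    static class HandlePool {
        private final List<Long> addresses = new ArrayList<Long>();

        int register(long address) {
            addresses.add(address);
            return addresses.size() - 1;
        }

        /** Two steps per access: reference to handle, then handle to object address. */
        long resolve(int handle) {
            return addresses.get(handle);
        }

        /** Moving an object rewrites one handle entry; the references stay untouched. */
        int move(int handle, long newAddress) {
            addresses.set(handle, newAddress);
            return 1;
        }
    }

    /** Direct pointers: every reference stores the address, so each one must be updated. */
    static int moveDirect(long[] references, long oldAddress, long newAddress) {
        int updated = 0;
        for (int i = 0; i < references.length; i++) {
            if (references[i] == oldAddress) {
                references[i] = newAddress;
                updated++;
            }
        }
        return updated;
    }

    public static void main(String[] args) {
        HandlePool pool = new HandlePool();
        int handle = pool.register(1000L);
        System.out.println("handle updates: " + pool.move(handle, 2000L));

        long[] references = { 1000L, 1000L, 500L, 1000L };
        System.out.println("direct updates: " + moveDirect(references, 1000L, 2000L));
    }
}
```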
3. Runtime memory exceptions
Two main errors can occur while a Java program is running: OutOfMemoryError and StackOverflowError. In which memory areas can they occur? As briefly mentioned earlier, they can occur in every memory area except the program counter. This section demonstrates the errors of each memory area with sample code, and uses a number of common virtual machine startup parameters to illustrate each situation. (How to run a program with these parameters is not described here.)
1. Java heap memory overflow
A heap overflow occurs when an object is created after the heap has already reached its maximum capacity. The program below keeps creating objects and makes sure that they cannot be garbage collected:

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Virtual machine parameters:
     * -Xms20m Minimum heap capacity
     * -Xmx20m Maximum heap capacity
     * @author hwz
     */
    public class HeadOutOfMemoryError {

        public static void main(String[] args) {
            // Keep the objects in a container so that they are never garbage collected
            List<HeadOutOfMemoryError> listToHoldObj = new ArrayList<HeadOutOfMemoryError>();

            while (true) {
                // Keep creating objects and adding them to the container
                listToHoldObj.add(new HeadOutOfMemoryError());
            }
        }
    }

You can also add the virtual machine parameter -XX:+HeapDumpOnOutOfMemoryError. When an OOM error occurs, the virtual machine then dumps a snapshot of the current heap to a file, which can be used later to analyze the problem. This is not described in detail here; I will write a separate blog post on analyzing memory problems with the MAT tool.
2. Virtual machine stack and native method stack overflow
The HotSpot virtual machine does not implement these two stacks separately. According to the virtual machine specification, the following two errors can occur in these memory areas:
1. If the thread requests a stack depth greater than the maximum depth allowed by the virtual machine, a StackOverflowError is thrown;
2. If the virtual machine cannot obtain enough memory when expanding the stack, an OutOfMemoryError is thrown.
The two situations actually overlap: when stack space cannot be allocated, it is impossible to tell whether the memory is too small or the stack depth in use is too large.
The code is tested in two ways:
1. Use the -Xss parameter to reduce the stack size, then call a method with infinite recursion so that the stack depth grows without bound:

    /**
     * Virtual machine parameters:
     * -Xss128k Stack capacity
     * @author hwz
     */
    public class StackOverflowError {

        private int stackDeep = 1;

        /**
         * Infinite recursion, growing the call stack depth without bound
         */
        public void recursiveInvoke() {
            stackDeep++;
            recursiveInvoke();
        }

        public static void main(String[] args) {
            StackOverflowError soe = new StackOverflowError();
            try {
                soe.recursiveInvoke();
            } catch (Throwable e) {
                System.out.println("stack deep = " + soe.stackDeep);
                throw e;
            }
        }
    }

2. Define a large number of local variables in the method to enlarge the local variable table of each stack frame, and again call the method with infinite recursion:

    /**
     * @author hwz
     */
    public class StackOOMEError {

        private int stackDeep = 1;

        /**
         * Define a large number of local variables to enlarge the local variable table in the frame.
         * Infinite recursion, growing the call stack depth without bound
         */
        public void recursiveInvoke() {
            Double i;
            Double i2;
            //........The large number of variable definitions are omitted here
            stackDeep++;
            recursiveInvoke();
        }

        public static void main(String[] args) {
            StackOOMEError soe = new StackOOMEError();
            try {
                soe.recursiveInvoke();
            } catch (Throwable e) {
                System.out.println("stack deep = " + soe.stackDeep);
                throw e;
            }
        }
    }

The tests above show that, whether the stack frame is too large or the stack capacity of the virtual machine is too small, a StackOverflowError is thrown in both cases when memory cannot be allocated.
3. Method area and runtime constant pool overflow
First, a word about the intern method of String: if the string constant pool already contains a string equal to this String object, it returns the String object in the pool that represents this string; otherwise it adds this String object to the constant pool and returns a reference to it. By calling this method we can keep adding String objects to the constant pool until it overflows:

    import java.util.ArrayList;
    import java.util.List;

    /**
     * Virtual machine parameters:
     * -XX:PermSize=10M Permanent generation size
     * -XX:MaxPermSize=10M Permanent generation maximum capacity
     * @author hwz
     */
    public class RuntimeConstancePoolOOM {

        public static void main(String[] args) {
            // Keep the objects in a container so that they are never garbage collected
            List<String> list = new ArrayList<String>();

            // Use String.intern to keep adding objects to the constant pool
            for (int i = 1; true; i++) {
                list.add(String.valueOf(i).intern());
            }
        }
    }

However, on JDK 1.7 this test code does not overflow the runtime constant pool, although it does on JDK 1.6. To find out why, here is another test that verifies the behavior:

    /**
     * Tests the String.intern method under different JDKs
     * @author hwz
     */
    public class StringInternTest {

        public static void main(String[] args) {
            String str1 = new StringBuilder("test").append("01").toString();
            System.out.println(str1.intern() == str1);

            String str2 = new StringBuilder("test").append("02").toString();
            System.out.println(str2.intern() == str2);
        }
    }

The result of running under JDK 1.6 is: false, false;
The result of running under JDK 1.7 is: true, true.
It turns out that in JDK 1.6 the intern() method copies a string instance it encounters for the first time into the permanent generation and returns a reference to that copy in the permanent generation. The string instances created by StringBuilder live in the heap, so the two references are not equal.
In JDK 1.7 the intern() method does not copy the instance; it only records in the constant pool a reference to the first instance that appeared. The reference returned by intern is therefore the same as the instance created by StringBuilder, so the comparison returns true.
As a result, the constant pool test above does not produce a constant pool overflow on JDK 1.7, but it may produce a heap overflow after running for a long time.
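The part of the intern contract that does not depend on the JDK version can be checked with a string that is already in the pool, such as a literal. This is a minimal sketch with a made-up class name, `InternContract`; it does not reproduce the version-specific results above.

```java
class InternContract {

    /** Builds a new String object with the same characters as `text`. */
    static String freshCopy(String text) {
        return new StringBuilder(text).toString();
    }

    public static void main(String[] args) {
        String literal = "hello";          // string literals are always in the pool
        String copy = freshCopy("hello");  // a distinct object with equal content

        System.out.println(copy == literal);          // different objects
        System.out.println(copy.equals(literal));     // equal content
        System.out.println(copy.intern() == literal); // intern() returns the pooled instance
    }
}
```

Because "hello" is already pooled through the literal, the copy is never the first instance, so the outcome here is the same whether the pool lives in the permanent generation or in the heap.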
To test overflow of the method area itself, we just need to keep adding things to it, such as class names, access modifiers, constant pools and so on. We can make the program load a large number of classes to keep filling the method area until it overflows. Here we use CGLib to manipulate bytecode directly and generate a large number of dynamic classes:

    import java.lang.reflect.Method;

    import net.sf.cglib.proxy.Enhancer;
    import net.sf.cglib.proxy.MethodInterceptor;
    import net.sf.cglib.proxy.MethodProxy;

    /**
     * Method area memory overflow test class
     * @author hwz
     */
    public class MethodAreaOOM {

        public static void main(String[] args) {
            // Use CGLib to create subclasses without end
            while (true) {
                Enhancer enhancer = new Enhancer();
                enhancer.setSuperclass(MAOOMClass.class);
                enhancer.setUseCache(false);
                enhancer.setCallback(new MethodInterceptor() {
                    @Override
                    public Object intercept(Object obj, Method method, Object[] args, MethodProxy proxy) throws Throwable {
                        return proxy.invokeSuper(obj, args);
                    }
                });
                enhancer.create();
            }
        }

        static class MAOOMClass {}
    }

Observing the program with VisualVM shows that the number of classes loaded by the JVM and the PermGen usage both grow linearly.
4. Direct memory overflow
The size of direct memory can be set with the virtual machine parameter -XX:MaxDirectMemorySize. To make direct memory overflow, we only need to keep requesting direct memory. The following test uses the direct buffers of Java NIO:

    import java.nio.Buffer;
    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    /**
     * Virtual machine parameters:
     * -XX:MaxDirectMemorySize=30M Direct memory size
     * @author hwz
     */
    public class DirectMemoryOOm {

        public static void main(String[] args) {
            List<Buffer> buffers = new ArrayList<Buffer>();
            int i = 0;
            while (true) {
                // Print the current iteration count
                System.out.println(++i);
                // Consume direct memory by repeatedly requesting direct buffers
                buffers.add(ByteBuffer.allocateDirect(1024 * 1024)); // 1M each time
            }
        }
    }

In the loop, 1M of direct memory is requested each time and the maximum direct memory is set to 30M, so the error is thrown on the 31st iteration: java.lang.OutOfMemoryError: Direct buffer memory
4. Summary
That is all for this article. It has described the layout of memory in the JVM, how objects are stored and accessed, and the memory exceptions that can occur in each memory area. The main reference is the book "In-depth understanding of Java Virtual Machine (Second Edition)". If anything here is incorrect, please point it out in the comments. Thank you for your support of Wulin.com.