The Java constant pool is a perennial topic and a favorite of interviewers. The questions come in many forms, so this article summarizes the key points.
Theory
First, let's lay out the JVM's runtime memory areas:
The program counter records the address of the bytecode instruction that the current thread is executing. Branches, loops, and jumps all work by changing its value, and every thread has its own.
The native method stack is the stack the JVM uses when calling native (non-Java) methods, such as operating system functions.
The virtual machine stack is the stack the JVM uses to execute Java methods. Each method call pushes a frame that holds its local variables and operands.
The method area stores constants, static variables, class information, and so on. It can be thought of as where the contents of class files live in memory.
The heap is where the JVM allocates the objects and arrays created by Java code.
Constant pools in Java actually come in two forms: the static constant pool and the runtime constant pool.
The static constant pool is the constant pool inside a *.class file. It contains not only string and numeric literals but also symbolic information about classes and methods, and it usually takes up a large part of the class file.
The runtime constant pool is what the JVM builds in memory from a class file's constant pool once the class has been loaded, and it is stored in the method area. When people say "the constant pool", they usually mean the runtime constant pool in the method area.
Next, let's take a constant pool example that is popular on the Internet and explain it.
String s1 = "Hello";
String s2 = "Hello";
String s3 = "Hel" + "lo";
String s4 = "Hel" + new String("lo");
String s5 = new String("Hello");
String s6 = s5.intern();
String s7 = "H";
String s8 = "ello";
String s9 = s7 + s8;
System.out.println(s1 == s2); // true
System.out.println(s1 == s3); // true
System.out.println(s1 == s4); // false
System.out.println(s1 == s9); // false
System.out.println(s4 == s5); // false
System.out.println(s1 == s6); // true
First of all, applying the == operator directly to two strings in Java compares their reference addresses, not their contents. Use String.equals() to compare contents.
s1 == s2 is easy to understand. Both s1 and s2 are assigned string literals, in other words hard-coded strings. At compile time the literal is placed directly into the class file's constant pool, where it is reused. Once the constant pool is loaded at run time, s1 and s2 point to the same memory address, so they are equal.
s1 == s3 hides a pitfall. s3 looks like a string concatenated at run time, but every operand is a known literal, so the compiler folds the concatenation at compile time. In the class file, String s3 = "Hel" + "lo"; has already become String s3 = "Hello";, so s1 == s3 is true.
s1 == s4 is of course false. s4 is also a concatenation, but new String("lo") is not a compile-time constant, so the compiler cannot fold it and the result is only determined at run time. Because strings are immutable, the concatenation produces a new String object on the heap, so its address must differ from that of s1.
s1 == s9 is false for a similar reason. s7 and s8 are assigned string literals, but they are ordinary (non-final) variables, so the compiler treats their values as unknown when it reaches s7 + s8. A compiler is not an interpreter, so it does not fold the expression. At run time the concatenation of s7 and s8 creates a new string on the heap at an address that cannot be predicted, and that address cannot be the same as that of the pooled string s1 refers to.
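The difference between a folded and a non-folded concatenation can be sketched as follows. This is a minimal illustration, not part of the original example: the class and method names are made up, and it relies on the Java language rule that a final variable initialized with a literal counts as a compile-time constant.

```java
class FoldingDemo {
    // Compares references, not contents.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    static boolean finalPartsAreFolded() {
        final String h = "H";       // final + literal initializer: a constant variable
        final String ello = "ello";
        String joined = h + ello;   // constant expression, folded to "Hello" by the compiler
        return sameObject(joined, "Hello");
    }

    static boolean nonFinalPartsAreNotFolded() {
        String h = "H";             // not final, so not a compile-time constant
        String ello = "ello";
        String joined = h + ello;   // concatenated at run time into a new object
        return sameObject(joined, "Hello");
    }

    public static void main(String[] args) {
        System.out.println(finalPartsAreFolded());       // true
        System.out.println(nonFinalPartsAreNotFolded()); // false
    }
}
```

Declaring s7 and s8 as final in the example above would, by the same rule, make s1 == s9 true.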
s4 == s5 needs no further explanation. Both strings are on the heap, but they are different objects at different addresses, so the result is false.
s1 == s6 is true entirely because of the intern method. s5 is on the heap and its content is Hello. The intern method tries to add the string Hello to the constant pool and returns the address of the pooled string. Because the pool already contains Hello, intern simply returns that address. s1 has referred to the pooled string from the start, so s1 and s6 point to the same address and are equal.
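The behavior of intern can be checked in isolation with a small sketch like the one below. The class and method names are hypothetical; only String.intern() and the == comparison come from the article.

```java
class InternDemo {
    // True when the heap copy is a different object from the literal,
    // yet intern() maps it back to the pooled instance.
    static boolean internReturnsPooledInstance() {
        String literal = "Hello";             // refers to the pooled string
        String onHeap = new String("Hello");  // a distinct object with the same content
        return onHeap != literal && onHeap.intern() == literal;
    }

    // Two separately built strings with equal content intern to one object.
    static boolean equalContentInternsToSameObject() {
        char[] chars = {'p', 'o', 'o', 'l'};
        String a = new String(chars);
        String b = new String(chars);
        return a != b && a.intern() == b.intern();
    }

    public static void main(String[] args) {
        System.out.println(internReturnsPooledInstance());     // true
        System.out.println(equalContentInternsToSameObject()); // true
    }
}
```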
At this point, we can draw three very important conclusions:
To understand the constant pool well, you must pay attention to what happens at compile time.
The constants in the runtime constant pool come almost entirely from the constant pools of the class files.
While the program is running, the JVM does not add strings to the pool on its own. They are added only when the program does so explicitly, for example by calling the intern method.
Everything above concerns the string constant pool. The integer wrapper types have a similar mechanism, but the program cannot add values to it: the shared values are fixed in advance. For example, Integer shares the boxed values from -128 to 127, and only numbers in that range are reused. Strictly speaking this is a cache inside the wrapper classes (used by autoboxing and Integer.valueOf) rather than part of the runtime constant pool, and the floating point wrappers Float and Double have no such cache.
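The -128 to 127 range can be observed through autoboxing. The sketch below uses made-up class and method names; the result for 128 assumes a default JVM configuration, since HotSpot allows the upper bound of the Integer cache to be raised.

```java
class IntegerCacheDemo {
    // Boxes the same int twice and compares the two references.
    static boolean sameBoxed(int value) {
        Integer a = value; // autoboxing goes through Integer.valueOf(value)
        Integer b = value;
        return a == b;     // reference comparison, not value comparison
    }

    public static void main(String[] args) {
        System.out.println(sameBoxed(127));  // true: inside the cached range
        System.out.println(sameBoxed(-128)); // true: inside the cached range
        System.out.println(sameBoxed(128));  // false with default settings: outside the range
    }
}
```

As with strings, use equals() rather than == when the intent is to compare values.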
Practice
With the theory covered, let's get hands-on with a real constant pool.
As mentioned earlier, every class file contains a static constant pool. It is generated by the compiler and stores, among other things, the literals from the Java source file (this article focuses only on literals). Suppose we have the following Java code:
String s = "hi";
The code is deliberately this simple. After compiling it into a class file, open the binary class file in a hex editor such as WinHex.
Let's briefly go through the structure of the class file. The first 4 bytes are the magic number, which identifies the file as a class file. In other words, it is the file header: CA FE BA BE.
The next 4 bytes are the class file version: 2 bytes for the minor version followed by 2 bytes for the major version. Here the major version is 0x34 (52 in decimal) because the file was compiled with JDK 8, and the major version rises with each JDK release. A newer JVM can run class files compiled for older versions, but an older JVM cannot run class files compiled for a newer version. So if you ever want to know which JDK version someone else's class file was compiled for, look at these 4 bytes.
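The first 8 bytes can also be decoded in code instead of by eye. The following is a minimal sketch with a hypothetical class name; it parses a hand-written header (the magic number plus the version bytes 00 00 00 34 from this example) rather than a real file. Note that the 34 in the hex dump is hexadecimal, which is 52 in decimal.

```java
import java.nio.ByteBuffer;

class ClassHeaderDemo {
    // Returns {minorVersion, majorVersion} from the start of a class file.
    static int[] readVersion(byte[] classBytes) {
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default, like class files
        int magic = buf.getInt();
        if (magic != 0xCAFEBABE) {
            throw new IllegalArgumentException("not a class file");
        }
        int minor = buf.getShort() & 0xFFFF; // unsigned 2-byte minor version
        int major = buf.getShort() & 0xFFFF; // unsigned 2-byte major version
        return new int[] { minor, major };
    }

    public static void main(String[] args) {
        byte[] header = {
            (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, // magic number
            0x00, 0x00,                                         // minor version
            0x00, 0x34                                          // major version
        };
        int[] version = readVersion(header);
        System.out.println("minor=" + version[0] + " major=" + version[1]); // minor=0 major=52
    }
}
```

To inspect a real file, the same method could be given the bytes returned by java.nio.file.Files.readAllBytes.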
Next comes the entry to the constant pool, which begins with a 2-byte count. In this example the value is 00 1A, which is 26 in decimal. Index 0 is reserved as a special value, so the pool actually holds 25 constants, numbered 1 to 25.
The constant pool stores many kinds of constants, each with its own tag and storage format. This article focuses only on strings. The text of a string is stored as a CONSTANT_Utf8 entry, which starts with the tag 01 (1 byte), followed by 2 bytes for the length in bytes, and then the string's content. In this case it is: 01 00 02 68 69. The string literal itself is a separate CONSTANT_String entry (tag 08) that holds the index of this Utf8 entry.
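The five bytes 01 00 02 68 69 can be decoded with a few lines of code. This is a sketch under stated assumptions: the class and method names are hypothetical, and it decodes the content as standard UTF-8, which matches the class file's modified UTF-8 encoding for plain ASCII text such as "hi".

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class Utf8EntryDemo {
    // Decodes a single entry laid out as: tag (1 byte), length (2 bytes), content.
    static String readUtf8Entry(byte[] entry) {
        ByteBuffer buf = ByteBuffer.wrap(entry);
        int tag = buf.get() & 0xFF;
        if (tag != 1) {
            throw new IllegalArgumentException("unexpected tag: " + tag);
        }
        int length = buf.getShort() & 0xFFFF; // content length in bytes
        byte[] content = new byte[length];
        buf.get(content);
        // Assumption: ASCII-only content, where modified UTF-8 equals UTF-8.
        return new String(content, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] entry = { 0x01, 0x00, 0x02, 0x68, 0x69 };
        System.out.println(readUtf8Entry(entry)); // hi
    }
}
```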
Next, let's look at the runtime constant pool. Since it lives in the method area, which HotSpot implemented as the permanent generation (PermGen) before JDK 8, we can set the method area size with the JVM parameters -XX:PermSize and -XX:MaxPermSize, and thereby indirectly limit the size of the constant pool.
Suppose the JVM startup parameters are -XX:PermSize=2M -XX:MaxPermSize=2M, and we then run the following code:
// Keep references so the strings cannot be garbage collected
List<String> list = new ArrayList<String>();
int i = 0;
while (true) {
    // Explicitly add strings to the pool
    list.add(String.valueOf(i++).intern());
}
The program quickly throws: Exception in thread "main" java.lang.OutOfMemoryError: PermGen space. PermGen is HotSpot's implementation of the method area, so this shows that the pooled strings live in the method area. Note that this holds for JDK 6 and earlier. From JDK 7 on, HotSpot allocates interned strings in the Java heap, so the same loop exhausts the heap instead.
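In JDK 8, PermGen was removed and replaced by the Metaspace area. Therefore we need to use the new JVM parameter -XX:MaxMetaspaceSize=2M, and running the same code then throws java.lang.OutOfMemoryError: Metaspace. Be careful with the conclusion, though. Metaspace holds class metadata, including each class's runtime constant pool structure, while the interned String objects themselves have been kept in the Java heap since JDK 7. An error under a limit this small therefore most likely comes from class metadata rather than from the strings added by intern. For more details about Metaspace, please look it up yourself.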
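One way to see which memory areas a particular JVM actually has is to list its memory pools through the standard java.lang.management API. The sketch below uses a hypothetical class name, and the pool names it prints depend on the JVM version and garbage collector, so no specific output is assumed here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

class MemoryPoolsDemo {
    // Collects "type: name" for every memory pool the running JVM reports.
    static List<String> poolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(pool.getType() + ": " + pool.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Heap and non-heap pools are listed together; the type tells them apart.
        for (String name : poolNames()) {
            System.out.println(name);
        }
    }
}
```

Running it on different JDK versions is a quick way to check whether a permanent generation pool or a Metaspace pool is present.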
All code in this article was tested under JDK 7 and JDK 8. Other JDK versions may behave slightly differently, so please explore them yourself.