1 Overview
Java is known for platform independence, security, and network mobility. The Java platform consists of the Java virtual machine and the Java core classes, which together provide a unified programming interface for pure Java programs, regardless of the underlying operating system. It is the Java virtual machine that makes the promise of "compile once, run anywhere" possible.
1.1 Java program execution process
Running a Java program involves two environments: a compile-time environment and a run-time environment. Source code is first compiled into bytecode stored in class files, and the virtual machine then loads that bytecode and executes it, either by interpreting it or by translating it into machine code at run time.
The core of Java technology is the Java virtual machine, because all Java programs run on it. Running a Java program requires the cooperation of the Java virtual machine, the Java API, and Java class files. Each Java virtual machine instance is responsible for running one Java program: when a Java program starts, a virtual machine instance is born, and when the program ends, that instance dies.
Java is cross-platform because a virtual machine implementation exists for each target platform.
1.2 Java Virtual Machine
The main task of a Java virtual machine is to load class files and execute the bytecode they contain. The virtual machine contains a class loader, which loads class files from both the program and the Java API. Only those Java API classes that the program actually needs are loaded, and the bytecode is executed by the execution engine.
When a Java virtual machine is implemented in software on top of a host operating system, a Java program interacts with the host by calling native methods. Java methods are written in the Java language, compiled into bytecode, and stored in class files. Native methods are written in C, C++, or assembly language, compiled into processor-specific machine code, and stored in dynamic link libraries whose format is specific to each platform. Native methods therefore connect Java programs to the underlying host operating system.
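The split between Java methods and native methods is visible through reflection. The following is a minimal sketch, assuming an OpenJDK-based runtime, where `Object.hashCode` is declared `native` and `Object.toString` is an ordinary Java method; the class name `NativeMethodCheck` is made up for this illustration.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class NativeMethodCheck {
    /** Returns true if the named no-argument method of the class is declared native. */
    static boolean isNative(Class<?> type, String methodName) {
        try {
            Method m = type.getDeclaredMethod(methodName);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: results reflect how OpenJDK declares these methods.
        System.out.println("Object.hashCode native: " + isNative(Object.class, "hashCode"));
        System.out.println("Object.toString native: " + isNative(Object.class, "toString"));
    }
}
```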
Since the Java virtual machine does not know how a class file was created or whether it has been tampered with, it implements a class file verifier to ensure that the types defined in the class file can be used safely. The class file verifier ensures the robustness of the program through four independent passes:
・Structural check of the class file
・Semantic check of the type data
・Bytecode verification
・Verification of symbolic references
While executing bytecode, the Java virtual machine also applies other built-in safety mechanisms. These are features of the Java programming language that keep Java programs robust, and they are also features of the Java virtual machine:
・Type-safe reference casts
・Structured memory access
・Automatic garbage collection
・Array bounds checks
・Null reference checks
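Three of the checks listed above can be observed directly, because each one surfaces as a runtime exception instead of corrupting memory. The sketch below is only an illustration; the class and method names are made up for this example.

```java
class RuntimeChecksDemo {
    /** Runs the action and returns the simple name of the exception it throws, or "none". */
    static String thrownBy(Runnable action) {
        try {
            action.run();
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    /** Array bounds check: index 3 is outside an array of length 3. */
    static String arrayBoundsCheck() {
        int[] data = new int[3];
        return thrownBy(() -> data[3] = 1);
    }

    /** Null reference check: calling a method through a null reference. */
    static String nullCheck() {
        String s = null;
        return thrownBy(() -> s.length());
    }

    /** Type-safe cast: a String cannot be cast to Integer. */
    static String castCheck() {
        Object o = "text";
        return thrownBy(() -> { Integer i = (Integer) o; });
    }

    public static void main(String[] args) {
        System.out.println(arrayBoundsCheck());
        System.out.println(nullCheck());
        System.out.println(castCheck());
    }
}
```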
1.3 Java virtual machine data type
The Java virtual machine performs its computations on a fixed set of data types. These fall into two groups: basic (primitive) types and reference types. The basic types are the numeric types (byte, short, int, long, char, float, double), boolean, and returnAddress; the reference types are class, interface, and array types.
The boolean type is a special case. When the compiler translates Java source code into bytecode, it represents boolean values with int or byte. In the Java virtual machine, false is represented by 0 and true by a non-zero integer (compilers emit 1). As in the Java language, the value ranges of the virtual machine's basic types are the same everywhere: regardless of the host platform, a long is always a 64-bit signed two's-complement integer in any virtual machine.
The returnAddress type is used to implement finally clauses in Java programs. Java programmers cannot use this type directly; its value points to the opcode of a virtual machine instruction.
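The fixed widths can be checked from Java code. This is a small sketch using constants from `java.lang.Long` and `java.lang.Integer`; the class name `PrimitiveWidths` is made up for the illustration.

```java
class PrimitiveWidths {
    /** Adds one to a long; overflow wraps around in two's complement. */
    static long successor(long x) {
        return x + 1;
    }

    public static void main(String[] args) {
        // These widths are fixed by the platform specification, not by the host CPU.
        System.out.println("int bits:  " + Integer.SIZE);
        System.out.println("long bits: " + Long.SIZE);
        // Two's-complement wrap-around: the successor of the largest long is the smallest.
        System.out.println(successor(Long.MAX_VALUE) == Long.MIN_VALUE);
    }
}
```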
2 Architecture
In the Java Virtual Machine Specification, the behavior of a virtual machine instance is described in terms of subsystems, memory areas, data types, and instructions. Together these components make up the abstract internal architecture of the virtual machine.
2.1 Class file
A Java class file contains all the information about a single class or interface. The "basic types" used in the class file format are as follows:
| u1 | 1 byte, unsigned type |
| u2 | 2 bytes, unsigned type |
| u4 | 4 bytes, unsigned type |
| u8 | 8 bytes, unsigned type |
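These unsigned types are stored in big-endian byte order. A class file begins with a u4 magic number (0xCAFEBABE) followed by a u2 minor version and a u2 major version, so reading the header is a good way to see the types in use. The sketch below is an assumption-laden illustration rather than a full parser: it reads only the first eight bytes, and the class name `ClassFileHeader` and the hand-written sample bytes are made up for the example.

```java
import java.nio.ByteBuffer;

class ClassFileHeader {
    final long magic;        // u4, held in a long so the value stays unsigned
    final int minorVersion;  // u2
    final int majorVersion;  // u2

    ClassFileHeader(long magic, int minorVersion, int majorVersion) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
    }

    /** Reads the first eight bytes of class file data. ByteBuffer is big-endian by default. */
    static ClassFileHeader read(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long magic = buf.getInt() & 0xFFFFFFFFL;  // u4
        int minor = buf.getShort() & 0xFFFF;      // u2
        int major = buf.getShort() & 0xFFFF;      // u2
        return new ClassFileHeader(magic, minor, major);
    }

    public static void main(String[] args) {
        // Hand-written header: magic number, minor version 0, major version 52 (Java 8).
        byte[] sample = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = read(sample);
        System.out.printf("magic=0x%X minor=%d major=%d%n", h.magic, h.minorVersion, h.majorVersion);
    }
}
```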
For more detail, see the official specification published by Oracle for Java SE 7: The Java® Virtual Machine Specification
The contents of the class file:
ClassFile {
    u4 magic;                                          // Magic number 0xCAFEBABE; identifies a Java class file
    u2 minor_version;                                  // Minor version number
    u2 major_version;                                  // Major version number
    u2 constant_pool_count;                            // Constant pool size
    cp_info constant_pool[constant_pool_count-1];      // Constant pool
    u2 access_flags;                                   // Class- or interface-level access flags (combined with bitwise OR)
    u2 this_class;                                     // Class index (points to a class constant in the constant pool)
    u2 super_class;                                    // Superclass index (points to a class constant in the constant pool)
    u2 interfaces_count;                               // Number of interfaces
    u2 interfaces[interfaces_count];                   // Interface index table
    u2 fields_count;                                   // Number of fields
    field_info fields[fields_count];                   // Field table
    u2 methods_count;                                  // Number of methods
    method_info methods[methods_count];                // Method table
    u2 attributes_count;                               // Number of attributes
    attribute_info attributes[attributes_count];       // Attribute table
}
2.2 Class loader subsystem
The class loader subsystem is responsible for finding and loading type information. In fact, there are two types of loaders for Java virtual machines: system loaders and user-defined loaders. The former is part of the Java virtual machine implementation, while the latter is part of the Java program.
・Bootstrap class loader: Loads the Java core libraries. It is implemented in native code and does not inherit from java.lang.ClassLoader.
・Extension class loader: Loads the Java extension libraries. The Java virtual machine implementation provides an extension library directory, and this class loader finds and loads Java classes from it. (Since Java 9 this role is taken by the platform class loader.)
・Application class loader: Loads Java classes from the application's classpath (CLASSPATH). In general, the classes of a Java application are loaded by it. It can be obtained through ClassLoader.getSystemClassLoader().
In addition to the class loaders provided by the system, developers can implement their own class loaders by inheriting the java.lang.ClassLoader class to meet some special needs.
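The loaders described above can be inspected from a running program. The following sketch shows that core classes report a `null` loader (the way the bootstrap loader appears to Java code) and defines a very small user-defined loader. `LoaderDemo` and `LoggingLoader` are names invented for this illustration; the loader only records each request and then delegates to its parent, so it does not define any classes itself.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderDemo {
    /** A user-defined loader that records each request, then delegates to its parent. */
    static class LoggingLoader extends ClassLoader {
        final List<String> requested = new ArrayList<>();

        LoggingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            requested.add(name);
            return super.loadClass(name, resolve);
        }
    }

    /** Loads a class through a fresh LoggingLoader and returns the loader for inspection. */
    static LoggingLoader loadThrough(String className) {
        LoggingLoader loader = new LoggingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException(e);
        }
        return loader;
    }

    public static void main(String[] args) {
        // Core classes come from the bootstrap loader, which Java code sees as null.
        System.out.println("String loader:      " + String.class.getClassLoader());
        System.out.println("Application loader: " + ClassLoader.getSystemClassLoader());
        System.out.println("Requests seen:      " + loadThrough("java.util.ArrayList").requested);
    }
}
```

A loader that actually defines classes would typically override `findClass` and call `defineClass` with the class file bytes; that part is omitted here to keep the sketch self-contained.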
The class loader subsystem involves several other components of the Java virtual machine as well as classes from the java.lang library. The methods defined by ClassLoader give programs an interface to the class loading mechanism. In addition, for each type that is loaded, the Java virtual machine creates an instance of java.lang.Class to represent it. Like other objects, user-defined class loaders and instances of Class are placed in the heap, while the loaded type information is located in the method area.
Besides locating and importing binary class files, the class loader subsystem is also responsible for verifying the correctness of imported classes, allocating and initializing memory for class variables, and resolving symbolic references. These actions must be performed in the following order:
・Loading (find and import the binary data of the type)
・Linking (verification: ensure the correctness of the imported type; preparation: allocate memory for class variables and initialize them to default values; resolution: convert symbolic references in the type into direct references)
・Initialization (initialize class variables to their proper initial values)
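The separation between loading and initialization in the steps above can be observed with `Class.forName`, whose three-argument form takes a flag that controls whether the class is initialized. In this sketch (all names are made up for illustration), the nested class `Lazy` is loaded first without being initialized; its static initializer runs only when its non-constant static field is first read.

```java
import java.util.ArrayList;
import java.util.List;

class InitOrderDemo {
    static final List<String> events = new ArrayList<>();

    static class Lazy {
        // Not a compile-time constant, so reading it counts as a first active use.
        static int value = 42;
        static {
            events.add("Lazy initialized");
        }
    }

    /** Records the order of events the first time it is called; later calls return the same list. */
    static List<String> run() {
        if (events.isEmpty()) {
            try {
                // initialize=false: load the class without running its static initializer.
                Class.forName(InitOrderDemo.class.getName() + "$Lazy", false,
                        InitOrderDemo.class.getClassLoader());
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            events.add("loaded");
            int v = Lazy.value;  // triggers initialization
            events.add("read value " + v);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```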
2.3 Method area
In a Java virtual machine, information about the loaded type is stored in memory in a method area. When a virtual machine loads a certain type, it uses a class loader to locate the corresponding class file, then reads the class file and transfers it to the virtual machine. Then the virtual machine extracts the type information in it and stores this information in the method area. Method areas can also be collected by the garbage collector, because the virtual machine allows dynamic extension of Java programs through user-defined class loaders.
The following information is stored in the method area:
・The fully qualified name of the type (for example, java.lang.Object)
・The fully qualified name of the type's direct superclass
・Whether the type is a class or an interface
・The type's access modifiers (a subset of public, abstract, final)
・An ordered list of the fully qualified names of any direct superinterfaces
・The constant pool for the type (an ordered collection of direct constants [string, integer, and floating-point constants] and symbolic references to other types, fields, and methods)
・Field information (field name, type, modifiers)
・Method information (method name, return type, number and types of parameters, modifiers)
・All class (static) variables except constants
・A reference to the ClassLoader (for each loaded type, the virtual machine must keep track of whether it was loaded by the bootstrap class loader or by a user-defined class loader)
・A reference to the Class object (for each loaded type, the virtual machine creates a corresponding instance of java.lang.Class. For example, if you hold a reference to an object of class java.lang.Integer, calling getClass() on that reference returns the Class object representing java.lang.Integer)
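Much of the type information listed above is reachable through the `Class` object using reflection. The sketch below is one way to look at it; `TypeInfoDemo` and `describe` are names invented for this example, and `describe` assumes the type has a superclass (it would fail for `Object` or for an interface).

```java
import java.lang.reflect.Modifier;

class TypeInfoDemo {
    /** Summarizes a few pieces of type information held for a class that has a superclass. */
    static String describe(Class<?> type) {
        return type.getName()
                + " extends " + type.getSuperclass().getName()
                + ", final=" + Modifier.isFinal(type.getModifiers())
                + ", interface=" + type.isInterface();
    }

    public static void main(String[] args) {
        Object boxed = Integer.valueOf(7);
        Class<?> type = boxed.getClass();  // the Class object representing java.lang.Integer
        System.out.println(describe(type));
        System.out.println("loader: " + type.getClassLoader());  // null means bootstrap loader
    }
}
```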
2.4 Heap
All class instances and arrays that a Java program creates at run time are placed in the same heap (in the Java virtual machine, arrays are true objects). Since a Java virtual machine instance has only one heap, all threads share it. Note that the Java virtual machine has an instruction for allocating objects on the heap but no instruction for freeing memory, because the virtual machine leaves that task to the garbage collector. The Java virtual machine specification does not mandate a garbage collector; it only requires that an implementation manage its own heap space "in some way". For example, an implementation may have a fixed-size heap and simply throw an OutOfMemoryError when that space is full, without ever reclaiming garbage objects, and still comply with the specification.
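Both points can be seen from Java code: an array behaves as an ordinary object, and the `Runtime` class reports the heap limits of the running virtual machine. This is a small sketch; the class name `HeapDemo` is made up, and the memory figures printed depend entirely on the virtual machine and its settings.

```java
class HeapDemo {
    /** An array reference can be held in an Object variable and has a Class like any object. */
    static boolean arrayIsObject() {
        Object o = new int[4];
        return o.getClass().isArray() && o instanceof int[];
    }

    public static void main(String[] args) {
        System.out.println("array is an object: " + arrayIsObject());
        Runtime rt = Runtime.getRuntime();
        // Values vary by implementation and by command-line options such as -Xmx.
        System.out.println("max heap bytes:   " + rt.maxMemory());
        System.out.println("total heap bytes: " + rt.totalMemory());
        System.out.println("free heap bytes:  " + rt.freeMemory());
    }
}
```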
The Java virtual machine specification does not say how Java objects are represented in the heap, which leaves that design decision to the implementer of the virtual machine. One possible heap design is as follows:
The heap is divided into a handle pool and an object pool. An object reference is a native pointer to an entry in the handle pool. The benefit of this design is that it makes heap compaction easier: when an object is moved within the object pool, only the pointer in its handle entry needs to be updated with the new address. The disadvantage is that every access to an instance variable of an object has to go through two pointers.
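The trade-off can be illustrated with a toy model. The sketch below is not how any real virtual machine is implemented; it simulates the two pools with arrays, assumes stored values are non-null, never reuses handles, and does no capacity checking. All names (`HandleHeap`, `allocate`, `compact`, `slotOf`) are made up for the illustration.

```java
import java.util.Arrays;

class HandleHeap {
    private final Object[] objectPool;  // where the "objects" live
    private final int[] handlePool;     // handle -> current slot in objectPool, -1 if freed
    private int nextHandle = 0;
    private int nextFree = 0;

    HandleHeap(int capacity) {
        objectPool = new Object[capacity];
        handlePool = new int[capacity];
        Arrays.fill(handlePool, -1);
    }

    /** Stores a non-null value and returns its handle, which plays the role of a reference. */
    int allocate(Object value) {
        objectPool[nextFree] = value;
        handlePool[nextHandle] = nextFree++;
        return nextHandle++;
    }

    /** Two hops per access: handle -> slot, then slot -> object. */
    Object get(int handle) {
        return objectPool[handlePool[handle]];
    }

    void free(int handle) {
        objectPool[handlePool[handle]] = null;
        handlePool[handle] = -1;
    }

    /** Slides live objects to the front. Only handle entries change; handles stay valid. */
    void compact() {
        int target = 0;
        for (int slot = 0; slot < nextFree; slot++) {
            if (objectPool[slot] == null) continue;
            if (slot != target) {
                objectPool[target] = objectPool[slot];
                objectPool[slot] = null;
                for (int h = 0; h < nextHandle; h++) {
                    if (handlePool[h] == slot) handlePool[h] = target;
                }
            }
            target++;
        }
        nextFree = target;
    }

    int slotOf(int handle) {
        return handlePool[handle];
    }
}
```

After freeing an object in the middle and calling `compact()`, the remaining handles still resolve to the same values even though the objects have moved, which is the property the handle pool design is meant to provide.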
2.5 Java Stack
Whenever a thread is started, the Java virtual machine allocates a Java stack for it. A Java stack consists of stack frames, each of which holds the state of one Java method invocation. When a thread invokes a Java method, the virtual machine pushes a new frame onto the thread's Java stack; when the method returns, that frame is popped off. The Java stack stores the state of the thread's method invocations, including local variables, parameters, return values, and intermediate results of computations. The Java virtual machine has no registers for holding intermediate values; its instruction set uses the Java stack instead. This design keeps the instruction set as compact as possible and makes the virtual machine easier to implement on platforms with few general-purpose registers. In addition, the stack-based architecture helps the code optimization performed at run time by the dynamic compilers and just-in-time compilers used in some virtual machine implementations.
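The pushing and popping of frames shows up in stack traces, which list one entry per active method invocation. The sketch below counts the entries at the bottom of a short recursion. It assumes a runtime that reports every frame, as HotSpot normally does; the `Thread.getStackTrace` documentation allows a virtual machine to omit frames in some circumstances. The class and method names are made up for the example.

```java
class StackDepthDemo {
    /** Number of frames on the current thread's Java stack, as reported by a stack trace. */
    static int currentDepth() {
        return Thread.currentThread().getStackTrace().length;
    }

    /** Recurses n times, then reports the depth seen at the bottom of the recursion. */
    static int depthAfter(int n) {
        return n == 0 ? currentDepth() : depthAfter(n - 1);
    }

    public static void main(String[] args) {
        int base = depthAfter(0);
        int deeper = depthAfter(5);
        // Each extra recursive call should account for one extra frame.
        System.out.println("extra frames: " + (deeper - base));
    }
}
```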
2.5.1 Stack Frame
A stack frame consists of a local variable area, an operand stack and a frame data area. When a virtual machine calls a Java method, it obtains the local variable area and operand stack size of this method from the type information of the corresponding class, and allocates the stack frame memory according to this, and then pushes it into the Java stack.
2.5.1.1 Local variable area
The local variable area is organized as a zero-based array of words. Bytecode instructions access its contents by index, starting from 0. Values of type int, float, reference, and returnAddress occupy one entry in the array. Values of type byte, short, and char are converted to int before being stored and also occupy one entry. Values of type long and double occupy two consecutive entries.
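The slot rules can be turned into a small calculation. The sketch below counts how many local variable entries the parameters of a method occupy, given a method descriptor in class file notation such as `(IJLjava/lang/String;D)V`. It also relies on a fact not stated above: for instance methods, entry 0 holds `this`. The class `LocalSlots` and its method are made up for illustration, and the parser assumes a well-formed descriptor.

```java
class LocalSlots {
    /**
     * Counts the local variable entries taken by a method's parameters.
     * long (J) and double (D) take two entries; every other type takes one.
     */
    static int parameterSlots(String descriptor, boolean isStatic) {
        int slots = isStatic ? 0 : 1;  // entry 0 holds 'this' for instance methods
        int i = descriptor.indexOf('(') + 1;
        int end = descriptor.indexOf(')');
        while (i < end) {
            char c = descriptor.charAt(i);
            if (c == '[') {
                // Array of any dimension: a single reference, so one entry.
                while (descriptor.charAt(i) == '[') i++;
                if (descriptor.charAt(i) == 'L') i = descriptor.indexOf(';', i);
                i++;
                slots += 1;
            } else if (c == 'L') {
                // Object type: skip to the terminating ';'.
                i = descriptor.indexOf(';', i) + 1;
                slots += 1;
            } else {
                slots += (c == 'J' || c == 'D') ? 2 : 1;
                i++;
            }
        }
        return slots;
    }

    public static void main(String[] args) {
        System.out.println(parameterSlots("(IJLjava/lang/String;D)V", true));
        System.out.println(parameterSlots("(IJLjava/lang/String;D)V", false));
    }
}
```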
2.5.1.2 Operand Stack
Like the local variable area, the operand stack is organized as an array of words. It is accessed not by index but through the standard stack operations, push and pop. Java virtual machine instructions take their operands from the operand stack rather than from registers, which is why the machine is described as stack-based and not register-based; the program counter, by contrast, cannot be accessed directly by instructions. The virtual machine uses the operand stack as its workspace: most instructions pop values from it, operate on them, and push the result back.
2.5.1.3 Frame data area
In addition to the local variable area and the operand stack, a Java stack frame needs a frame data area to support constant pool resolution, normal method return, and exception dispatch. Whenever the virtual machine executes an instruction that needs constant pool data, it reaches the constant pool through a pointer held in the frame data area. Beyond constant pool resolution, the frame data area also helps the virtual machine handle the normal completion or abrupt termination of a Java method. When a method returns normally, the virtual machine must restore the stack frame of the calling method, including setting the program counter to the instruction that follows the call; if the method has a return value, the virtual machine pushes it onto the operand stack of the calling method. To handle methods that exit by throwing an exception, the frame data area also holds a reference to the method's exception table.
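The pop, operate, push pattern can be mimicked with an ordinary stack. The sketch below evaluates (a + b) * c the way a stack machine would, using `java.util.ArrayDeque` as a stand-in for the operand stack; it is an analogy, not the virtual machine's actual data structure, and the names are made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    /** Evaluates (a + b) * c using only push and pop on a stack. */
    static int evaluate(int a, int b, int c) {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(a);
        stack.push(b);
        stack.push(stack.pop() + stack.pop());  // pop two operands, push the sum (like iadd)
        stack.push(c);
        stack.push(stack.pop() * stack.pop());  // pop two operands, push the product (like imul)
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate(2, 3, 4));
    }
}
```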
2.6 Program Counter
In a running Java program, each thread has its own program counter, also called the PC register. The program counter can hold either a native pointer or a returnAddress. While a thread is executing a Java method, the program counter always holds the address of the next instruction to be executed. This address can be a native pointer or an offset from the first instruction of the method's bytecode. If the thread is executing a native method, the value of the program counter is "undefined".
2.7 Native method stack
Any native method interface will use some kind of native method stack. When a thread calls a Java method, the virtual machine creates a new stack frame and pushes it onto the Java stack. When it calls a native method, the virtual machine leaves the Java stack unchanged and does not push a new frame onto the thread's Java stack; it simply links dynamically to the specified native method and calls it directly.
The method area and the heap are shared by all threads in a virtual machine instance. When the virtual machine loads a class file, it parses the type information from the binary data in the class file and places it in the method area. While the program runs, the virtual machine places all objects the program creates into the heap.
Like the other run-time memory areas, the memory occupied by the native method stack can be expanded or shrunk dynamically as needed.
3 Execution Engine
In the Java virtual machine specification, the behavior of the execution engine is defined in terms of an instruction set. The designers of a particular execution engine decide how bytecode is executed: an implementation can interpret it, compile it just in time, execute it directly with instructions on a chip, or use a mixture of these.
The execution engine can be understood as an abstract specification, a concrete implementation, or a run-time instance. The abstract specification uses the instruction set to define the behavior of the execution engine. A concrete implementation may use a variety of techniques, including software, hardware, or a combination of the two. As a run-time instance, the execution engine is a thread.
Each thread of a running Java program is an independent instance of the virtual machine's execution engine. From the beginning to the end of its lifetime, a thread is either executing bytecode or executing a native method.
3.1 Instruction Set
The bytecode stream of a method is a sequence of Java virtual machine instructions. Each instruction consists of a one-byte opcode followed by zero or more operands. The opcode specifies the operation to perform; the operands supply any additional information the virtual machine needs to carry it out. When executing an instruction, the virtual machine may use entries in the current constant pool, values in the local variables of the current frame, or values at the top of the current frame's operand stack.
The abstract execution engine executes one bytecode instruction at a time. Every thread (execution engine instance) of a program running in the Java virtual machine does this. The execution engine fetches an opcode and, if the opcode takes operands, fetches them as well. It performs the action specified by the opcode and its operands, then fetches the next opcode. This process continues until the thread completes, which happens either when it returns from its initial method or when a thrown exception is not caught.
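The fetch, decode, execute cycle described above can be sketched as a loop over a byte array. The toy interpreter below handles only a handful of int instructions, using their real JVM opcode values (`bipush` 0x10, `iload_0` 0x1a, `iload_1` 0x1b, `istore_0` 0x3b, `istore_1` 0x3c, `iadd` 0x60, `imul` 0x68, `ireturn` 0xac). It is a simplified illustration with made-up class and method names: there is no constant pool, no frame stack, and no verification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MiniInterpreter {
    static final int BIPUSH = 0x10, ILOAD_0 = 0x1a, ILOAD_1 = 0x1b,
            ISTORE_0 = 0x3b, ISTORE_1 = 0x3c, IADD = 0x60, IMUL = 0x68, IRETURN = 0xac;

    /** Executes the bytecode until ireturn and returns the value on top of the operand stack. */
    static int run(byte[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;  // index of the next byte to fetch
        while (true) {
            int opcode = code[pc++] & 0xFF;  // fetch the one-byte opcode
            switch (opcode) {
                case BIPUSH:   stack.push((int) code[pc++]); break;  // operand: one signed byte
                case ILOAD_0:  stack.push(locals[0]); break;
                case ILOAD_1:  stack.push(locals[1]); break;
                case ISTORE_0: locals[0] = stack.pop(); break;
                case ISTORE_1: locals[1] = stack.pop(); break;
                case IADD:     stack.push(stack.pop() + stack.pop()); break;
                case IMUL:     stack.push(stack.pop() * stack.pop()); break;
                case IRETURN:  return stack.pop();
                default:
                    throw new IllegalArgumentException("unsupported opcode 0x" + Integer.toHexString(opcode));
            }
        }
    }

    /** Bytecode for: local0 = 6; local1 = 7; return local0 * local1; */
    static byte[] sampleProgram() {
        return new byte[] {
            BIPUSH, 6, ISTORE_0,
            BIPUSH, 7, ISTORE_1,
            ILOAD_0, ILOAD_1, IMUL,
            (byte) IRETURN
        };
    }

    public static void main(String[] args) {
        System.out.println(run(sampleProgram(), new int[2]));
    }
}
```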
4 Native method interface
The Java Native Interface, or JNI, is designed for portability. The native method interface allows a native method to do the following:
・Pass and return data
・Operate on instance variables
・Operate on class variables or call class methods
・Operate on arrays
・Lock objects on the heap
・Load new classes
・Throw exceptions
・Catch exceptions thrown by Java methods that the native method calls
・Catch asynchronous exceptions thrown by the virtual machine
・Indicate to the garbage collector that an object is no longer needed