When it comes to Garbage Collection (GC), many people will naturally associate it with Java. In Java, programmers do not need to care about dynamic memory allocation and garbage collection, all of this is left to the JVM to handle.
As the name suggests, garbage collection is to free up the space occupied by garbage. So in Java, what kind of objects will be considered "garbage"? So when some objects are determined to be garbage, what strategy should be used to recycle (free up space)? What are the typical garbage collectors in current commercial virtual machines? Let’s discuss these issues one by one. Here is the directory outline of this article:
How to determine if an object is "garbage"?
Typical garbage collection algorithm Typical garbage collector
1. How to determine if an object is "garbage"?
In this section, we first understand the most basic question: If we determine that an object is "garbage"? Since the task of the garbage collector is to recycle the space occupied by the garbage object for use by new objects, how does the garbage collector determine that an object is "garbage"? That is, how to judge that an object can be recycled.
In Java, it is associated with objects through references, that is, if you want to operate objects, it must be done through references. Then it is obvious that an easy way is to judge whether an object can be recycled by reference counting. Without losing generality, if an object has no references associated with it, it means that the object is basically unlikely to be used elsewhere, and then the object becomes a recyclable object. This method becomes the reference counting method.
This method is characterized by its simple implementation and high efficiency, but it cannot solve the problem of circular references, so this method is not adopted in Java (Python uses reference counting method). Look at the following code:
public class Main { public static void main(String[] args) { MyObject object1 = new MyObject(); MyObject object2 = new MyObject(); object1.object = object2; object2.object = object1; object1 = null; object2 = null; }}class MyObject{ public Object object = null;}The last two sentences assign object1 and object2 to null, which means that the objects pointed to by object1 and object2 are no longer accessible, but because they refer to each other, their reference count is not 0, then the garbage collector will never recycle them.
To solve this problem, accessibility analysis method is adopted in Java. The basic idea of this method is to search through a series of "GC Roots" objects as the starting point. If there is no accessible path between "GC Roots" and an object, the object is said to be unreachable. However, it should be noted that the object judged to be unreachable may not necessarily become a recyclable object. An object that is judged as unreachable must go through at least two marking processes to become a recyclable object. If there is still no possibility of becoming a recyclable object during these two marking processes, it will basically become a recyclable object.
As for how the accessibility analysis method is operated, I have not understood it very clearly yet. If any friend is more clear, please give me some advice.
Let’s see an example below:
Object aobj = new Object ( ) ;Object bobj = new Object ( ) ;Object cobj = new Object ( ) ;aobj = bobj;aobj = cobj;cobj = null;aobj = null;
Which line may make an object recyclable? The code on line 7 will cause objects to become recyclable objects. As for why it is left to readers to think for themselves.
Let’s take a look at another example:
String str = new String("hello");SoftReference<String> sr = new SoftReference<String>(new String("java"));WeakReference<String> wr = new WeakReference<String>(new String("world"));Which of these three sentences will make a String object recyclable? Sentence 2 and 3, and sentence 2 will determine the String object as a recyclable object when there is insufficient memory, and in the third sentence, the String object will be determined as a recyclable object in any case.
Finally, let’s summarize the common situations in which objects are judged as recyclable objects that are usually encountered:
1) Display the value of a reference to null or point the reference that has already pointed to an object to a new object, such as the following code:
Object obj = new Object();obj = null;Object obj1 = new Object();Object obj2 = new Object();obj1 = obj2;
2) Local reference to the object pointed to, such as the following code:
void fun() {...... for(int i=0;i<10;i++) { Object obj = new Object(); System.out.println(obj.getClass()); } }Every time the loop is executed, the generated Object object will become a recyclable object.
3) Only weak references are associated with objects, such as:
WeakReference<String> wr = new WeakReference<String>(new String("world"));2. Typical garbage collection algorithm
After determining which garbage can be recycled, the garbage collector needs to do is to start garbage collection, but one problem involved is: how to efficiently collect garbage. Since the Java virtual machine specification does not make clear regulations on how to implement a garbage collector, the virtual machines of each manufacturer can implement a garbage collector in different ways, so only the core ideas of several common garbage collection algorithms are discussed here.
1.Mark-Sweep (mark-clear) algorithm
This is the most basic garbage collection algorithm. The reason why it is said to be the most basic is that it is the easiest to implement and the simplest idea. The mark-clearing algorithm is divided into two stages: the marking stage and the clearing stage. The task of the marking stage is to mark all objects that need to be recycled, and the clearing stage is to recycle the space occupied by the marked objects. The specific process is shown in the figure below:
It can be easily seen from the figure that the mark-clearing algorithm is easier to implement, but there is a serious problem that it is easy to generate memory fragments. Too many fragments may cause the inability to find enough space when allocating space for large objects in the subsequent process, and triggering a new garbage collection action in advance.
2. Copying (copy) algorithm
In order to solve the shortcomings of the Mark-Sweep algorithm, the Copying algorithm was proposed. It divides available memory into two pieces of equal size by capacity, using only one piece at a time. When this piece of memory is used up, copy the still-living object to another piece, and then clean up the used memory space at once, so that memory fragmentation problems will not occur. The specific process is shown in the figure below:
Although this algorithm is simple to implement, efficient to run and not easy to generate memory fragmentation, it is expensive to use the memory space because the memory that can be used is reduced to half of the original one.
Obviously, the efficiency of the Copying algorithm has a lot to do with the number of surviving objects. If there are many surviving objects, the efficiency of the Copying algorithm will be greatly reduced.
3.Mark-Compact (mark-collation) algorithm
In order to solve the shortcomings of the Copying algorithm and make full use of the memory space, the Mark-Compact algorithm is proposed. The algorithm marks the same as Mark-Sweep, but after completing the mark, it does not directly clean up the recyclable objects, but moves all the living objects to one end and then cleans up memory outside the end boundary. The specific process is shown in the figure below:
4. Generational Collection algorithm
Generation collection algorithm is currently used by most JVM garbage collectors. Its core idea is to divide memory into several different regions according to the life cycle of the object's survival. Generally speaking, the heap area is divided into the old generation and the Young Generation. The characteristic of the old generation is that only a small number of objects need to be recycled every time the garbage is collected, while the characteristic of the new generation is that a large number of objects need to be recycled every time the garbage is collected. Then the most suitable collection algorithm can be adopted according to the characteristics of different generations.
At present, most garbage collectors adopt the Copying algorithm for the new generation, because in the new generation, most objects need to be recycled every time the garbage collection is collected, which means that the number of operations that need to be copied is relatively small, but in reality, the space of the new generation is not divided according to a ratio of 1:1. Generally speaking, the new generation is divided into a larger Eden space and two smaller Survivor spaces. Each time the Eden space and one of the Survivor spaces are used, when recycled, the still-surviving objects in Eden and Survivor are copied to another Survivor space, and then the Eden and the Survivor spaces that have just been used are cleaned.
Because the old age is that only a small number of objects are recycled every time, the Mark-Compact algorithm is generally used.
Note that there is another generation outside the heap area, which is Permanet Generation, which is used to store class classes, constants, method descriptions, etc. The recycling of permanent generation mainly recycles two parts: discarded constants and useless classes.
3. Typical garbage collector
The garbage collection algorithm is the theoretical basis of memory recycling, and the garbage collector is the specific implementation of memory recycling. The following is a description of several garbage collectors provided by the HotSpot (JDK 7) virtual machine. Users can combine collectors used in each era according to their own needs.
1.Serial/Serial Old
The Serial/Serial Old Collector is the most basic and oldest collector. It is a single thread collector and must pause all user threads when it does garbage collection. The Serial collector is a collector for the new generation, using the Copying algorithm, and the Serial Old collector is a collector for the old generation, using the Mark-Compact algorithm. Its advantage is that it is simple and efficient, but its disadvantage is that it will cause pauses to users.
2.ParNew
The ParNew Collector is a multi-threaded version of the Serial Collector that uses multiple threads for garbage collection.
3. Parallel Scavenge
Parallel Scavenge collector is a new generation of multithreaded collectors (parallel collectors). It does not need to pause other user threads during recycling. It uses the Copying algorithm. This collector is different from the first two collectors. It is mainly to achieve a controlled throughput.
4. Parallel Old
Parallel Old is an old version of Parallel Scavenge collector (parallel collector) using multithreading and Mark-Compact algorithms.
5.CMS
The CMS (Current Mark Sweep) collector is a collector aimed at obtaining the shortest recovery pause time. It is a concurrent collector that uses the Mark-Sweep algorithm.
6.G1
The G1 collector is the most cutting-edge achievement in today's collector technology development. It is a collector for server-side applications that can make full use of multi-CPU and multi-core environments. Therefore it is a parallel and concurrency collector, and it can build a predictable pause time model.
Here are some things about memory allocation:
In a general direction, the memory allocation of objects is allocated on the heap. Objects are mainly allocated in the new generation of Eden Space and From Space, and in rare cases they will be directly allocated in the old age. If the space of the new generation of Eden Space and From Space is insufficient, a GC will be initiated. If the GC is performed, the Eden Space and From Space can accommodate the object and are placed in the Eden Space and From Space.
During the GC process, the surviving objects in Eden Space and From Space will be moved to To Space, and then the Eden Space and From Space will be cleaned. If To Space cannot be enough to store an object during cleaning, it will move the object to the old age. After GC, Eden space and To Space are used. The next time you GC, the surviving object will be copied to From Space, and the loop will be repeated. When an object escapes GC once in the Survivor area, its object's age will be increased by 1. By default, if the object reaches 15 years old, it will move to the middle age of the old age.
Generally speaking, large objects will be directly allocated to the old age. The so-called large objects refer to objects that require a large amount of continuous storage space. The most common type of large objects is large arrays, such as:
byte[] data = new byte[4*1024*1024]
This type of storage space will generally be allocated directly in the elderly.
Of course, the allocation rules are not 100% fixed, which depends on what kind of garbage collector combination and the relevant parameters of the JVM are currently being used.
The above is all the content of this article. I hope it will be helpful to everyone's learning and I hope everyone will support Wulin.com more.