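Garbage Collection in Java
Garbage collection in Java is an advanced topic. Knowing how Java GC works helps us fine-tune an application's runtime performance.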
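Garbage Collection in Java
- In Java, the programmer does not need to care about destroying objects that are no longer in use. The garbage collector takes care of it.
- The garbage collector runs as a daemon thread in the background. Basically, it frees heap memory by destroying unreachable objects.
- An unreachable object is an object that is no longer referenced by any part of the program.
- We can choose the garbage collector for our Java program through JVM options; we will look at these later in this tutorial.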
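To check which collector a given JVM actually picked, the standard java.lang.management API can be queried at run time. The sketch below is only an illustration: the class name ActiveCollectors and the helper collectorNames are made up for this example, and the collector names printed depend on the JVM vendor, version, and flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class: lists the collectors registered by the running JVM.
class ActiveCollectors {

    // Returns the names the JVM reports for its collectors, for example
    // "G1 Young Generation" or "PS MarkSweep" on HotSpot.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // GC flags passed on the command line appear among the JVM input arguments.
        System.out.println("JVM arguments: "
                + ManagementFactory.getRuntimeMXBean().getInputArguments());
        System.out.println("Collectors:    " + collectorNames());
    }
}
```

Running the same class with different GC options should change the reported collector names, which is a quick way to confirm that an option took effect.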
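How Does Automatic Garbage Collection Work?
Automatic garbage collection is the process of examining heap memory, identifying (also called "marking") the unreachable objects, and destroying them, with compaction of what remains. One problem with this approach is that garbage collection time keeps growing as the number of objects grows, because the collector has to walk the entire list of objects looking for unreachable ones. However, empirical analysis of applications shows that most objects are short-lived. This behavior is exploited to improve JVM performance, and the approach adopted is commonly called generational garbage collection. In this approach the heap is divided into generations: the young generation, the old (or tenured) generation, and the permanent generation.

The young generation is the space where all new objects are created. When it fills up, a minor garbage collection (also called a minor GC) takes place, which means that all dead objects in this generation are destroyed. This is fast because, as the figure shows, most of the objects there are already dead. Objects that survive in the young generation age and are eventually moved to the old generation.

The old generation stores long-lived objects. Typically, a threshold is set for young generation objects, and when an object reaches that age it is moved to the old generation. Eventually the old generation needs to be collected as well. This event is called a major GC (major garbage collection). It is usually much slower because it involves all live objects. There is also the full GC, which cleans the entire heap, both the young and the old generation spaces.

Finally, up to Java 7 there was a permanent generation (or PermGen), which held the metadata the JVM needs to describe the classes and methods used in the application. It was removed in Java 8.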
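The aging and promotion steps described above can be sketched with a toy model. This is not how the JVM is implemented: the class GenerationalToy, the Obj record of reachability and age, and the tenuringThreshold parameter are all invented for illustration, and reachability is set by hand instead of being discovered by marking.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model of a generational heap (illustrative only, not JVM internals).
class GenerationalToy {

    // Stand-in for a heap object: we track only reachability and age.
    static final class Obj {
        final String name;
        boolean reachable;
        int age; // number of minor collections survived so far

        Obj(String name, boolean reachable) {
            this.name = name;
            this.reachable = reachable;
        }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;

    GenerationalToy(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    // New objects always start in the young generation.
    Obj allocate(String name, boolean reachable) {
        Obj o = new Obj(name, reachable);
        young.add(o);
        return o;
    }

    // Minor GC: drop dead young objects, age the survivors,
    // and promote those that reached the threshold.
    void minorGc() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.reachable) {
                it.remove();                 // dead: reclaimed
            } else if (++o.age >= tenuringThreshold) {
                it.remove();                 // old enough: promoted
                old.add(o);
            }
        }
    }

    // Major GC: reclaim dead objects in the old generation.
    void majorGc() {
        old.removeIf(o -> !o.reachable);
    }

    public static void main(String[] args) {
        GenerationalToy heap = new GenerationalToy(2);
        Obj cache = heap.allocate("cache", true);
        heap.allocate("temp1", false);
        heap.allocate("temp2", false);

        heap.minorGc(); // both temps are reclaimed, cache survives with age 1
        heap.minorGc(); // cache reaches age 2 and is promoted
        System.out.println("young=" + heap.young.size() + " old=" + heap.old.size());

        cache.reachable = false;
        heap.majorGc(); // the promoted object is reclaimed only by a major GC
        System.out.println("young=" + heap.young.size() + " old=" + heap.old.size());
    }
}
```

The model shows why minor collections are cheap when most objects die young: the short-lived objects never leave the young generation, and only the few survivors cost anything later.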
Java Garbage Collectors
The JVM provides four different garbage collectors, all of them generational. Each one has its own advantages and disadvantages. The choice of collector is ours, and it can make a dramatic difference to throughput and application pauses. All of them split the managed heap into segments, relying on the long-standing assumption that most objects in the heap are short-lived and should be recycled quickly. The four types of garbage collectors are:
Serial GC
This is the simplest garbage collector, designed for single-threaded systems and small heap sizes. It freezes all application threads while it works. It can be turned on with the -XX:+UseSerialGC JVM option.
Parallel/Throughput GC
This is the JVM's default collector in JDK 8. As the name suggests, it uses multiple threads to scan through the heap space and perform compaction. A drawback of this collector is that it pauses the application threads while performing a minor or full GC. It is best suited to applications that can tolerate such pauses and want to minimize the CPU overhead caused by the collector. It can be selected explicitly with the -XX:+UseParallelGC JVM option.
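When judging whether a throughput-oriented collector suits an application, a rough indicator is the share of run time spent in collections. The sketch below is an assumption-laden estimate: the class GcOverhead and its helpers are invented for this example, and getCollectionTime() reports approximate accumulated elapsed time rather than CPU time, so the result is a coarse indicator at best.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical example class: estimates the share of JVM uptime spent in GC.
class GcOverhead {

    // Sums the accumulated collection time (ms) over all collectors.
    // getCollectionTime() returns -1 when a collector does not report it,
    // so non-positive values are skipped.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    // Collection time as a percentage of uptime; 0 if uptime is not positive.
    static double overheadPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        // Create short-lived garbage so the collector may have some work to do.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] chunk = new byte[256];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("checksum=%d gc=%d ms uptime=%d ms overhead=%.2f%%%n",
                checksum, gc, uptime, overheadPercent(gc, uptime));
    }
}
```

Comparing this figure across runs with different collectors gives a first impression of the trade-off between pauses and throughput, although GC logs remain the more reliable source.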
The CMS collector
The CMS ("concurrent mark sweep") collector uses multiple threads ("concurrent") to scan through the heap ("mark") for unused objects that can be recycled ("sweep"). This collector goes into stop-the-world (STW) mode in two cases:
- During the initial marking of roots, i.e. objects in the old generation that are reachable from thread entry points or static variables.
- When the application has changed the state of the heap while the algorithm was running concurrently, forcing it to go back and do some final touches to make sure it has the right objects marked.
This collector may face promotion failures. If some objects from the young generation are to be moved to the old generation and the collector has not had enough time to make space there, a promotion failure occurs. To prevent this, we can give more of the heap to the old generation or give the collector more background threads.
G1 collector
Last but not least is the Garbage-First collector, designed for heap sizes greater than 4 GB. It divides the heap into regions of 1 MB to 32 MB each, depending on the heap size. A concurrent global marking phase determines the liveness of objects throughout the heap. After the marking phase is complete, G1 knows which regions are mostly empty. It collects these regions, which contain the most garbage, first, which usually yields a large amount of free space; hence the name Garbage-First. G1 also uses a pause prediction model in order to meet a user-defined pause time target, and it selects the number of regions to collect based on that target. The G1 garbage collection cycle includes the phases shown in the figure:
- Young-only phase: This phase collects only young generation regions and promotes surviving objects to the old generation. The transition from the young-only phase to the space-reclamation phase starts when the old generation occupancy reaches a certain threshold, i.e. the Initiating Heap Occupancy threshold. At this point, G1 schedules an Initial Mark young-only collection instead of a regular young-only collection.
- Initial Marking: This type of collection starts the marking process in addition to performing a regular young-only collection. Concurrent marking determines all currently live objects in the old generation regions that are to be kept for the following space-reclamation phase. While marking has not completely finished, regular young-only collections may occur. Marking finishes with two special stop-the-world pauses: Remark and Cleanup.
- Remark: This pause finalizes the marking itself and performs global reference processing and class unloading. Between Remark and Cleanup, G1 concurrently calculates a summary of the liveness information, which is finalized and used in the Cleanup pause to update internal data structures.
- Cleanup: This pause also reclaims the completely empty regions and determines whether a space-reclamation phase will actually follow. If a space-reclamation phase follows, the young-only phase completes with a single young-only collection.
- Space-reclamation phase: This phase consists of multiple mixed collections which, in addition to young generation regions, also evacuate live objects from old generation regions. The space-reclamation phase ends when G1 determines that evacuating more old generation regions would not yield enough free space to be worth the effort.
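The garbage-first idea of picking regions under a pause time target can be illustrated with a small greedy selection. This is a deliberately simplified sketch, not G1's actual algorithm: the Region class, its garbageBytes and predictedMs fields, and chooseCollectionSet are hypothetical names, and the per-region cost is simply given instead of being produced by a prediction model.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of garbage-first region selection (illustrative only).
class GarbageFirstToy {

    // Hypothetical per-region summary: reclaimable bytes and the predicted
    // time (ms) needed to evacuate the objects that are still live.
    static final class Region {
        final int id;
        final long garbageBytes;
        final double predictedMs;

        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    // Visits regions from most to least garbage and adds each one whose
    // predicted cost still fits into the remaining pause budget.
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseTargetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Region> chosen = new ArrayList<>();
        double budget = pauseTargetMs;
        for (Region r : sorted) {
            if (r.predictedMs <= budget) {
                chosen.add(r);
                budget -= r.predictedMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(0, 900_000, 4.0));
        regions.add(new Region(1, 100_000, 8.0));
        regions.add(new Region(2, 700_000, 5.0));
        regions.add(new Region(3, 500_000, 3.0));

        // With a 10 ms target, regions 0 and 2 fit; region 3 no longer does.
        for (Region r : chooseCollectionSet(regions, 10.0)) {
            System.out.println("collect region " + r.id);
        }
    }
}
```

A tighter pause target shrinks the collection set, which is the trade-off the pause time goal expresses: shorter pauses in exchange for less space reclaimed per collection.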
G1 can be enabled using the -XX:+UseG1GC flag, and it is the default garbage collector since JDK 9. This strategy reduces the chance of the heap being depleted before the background threads have finished scanning for unreachable objects. It also compacts the heap on the go, which the CMS collector can do only in STW mode.

Java 8 (from update 20) adds a useful optimization to the G1 collector called string deduplication. The character arrays that back our strings occupy much of the heap. With this optimization, G1 identifies strings that are duplicated across the heap and makes them point to the same internal char[] array (a byte[] since JDK 9), so that multiple copies of the same string data do not reside in the heap unnecessarily. We can use the -XX:+UseStringDeduplication JVM argument to enable this optimization.
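To see what "duplicated strings" means in practice, the sketch below creates two equal strings that are separate objects on the heap. The class DuplicateStrings and the helper build are made up for this example. Deduplication itself is not observable from program code: even when G1 makes both objects share one backing array, they remain two distinct String objects.

```java
// Hypothetical example class: shows equal strings that are distinct objects.
class DuplicateStrings {

    // Builds the string at run time so that each call yields a separate
    // object (a literal would be shared through the constant pool).
    static String build(String prefix, int n) {
        return new StringBuilder(prefix).append(n).toString();
    }

    public static void main(String[] args) {
        String a = build("order-", 42);
        String b = build("order-", 42);

        System.out.println(a.equals(b));              // true: same characters
        System.out.println(a == b);                   // false: two separate objects
        // intern() is the explicit alternative: it returns one canonical instance.
        System.out.println(a.intern() == b.intern()); // true
    }
}
```

Unlike intern(), which the program has to call and which merges whole String objects, deduplication is applied by the collector in the background and only to the backing arrays.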
Java 8 PermGen and Metaspace
As mentioned earlier, the Permanent Generation space was removed in Java 8. The JDK 8 HotSpot JVM now uses native memory for the representation of class metadata, and this area is called Metaspace. Most of the allocations for class metadata are made out of native memory. There is also a new flag, MaxMetaspaceSize, to limit the amount of memory used for class metadata. If we do not specify a value for it, the Metaspace resizes at runtime according to the demand of the running application. Metaspace garbage collection is triggered when class metadata usage reaches the MaxMetaspaceSize limit. Excessive Metaspace garbage collection may be a symptom of a class or classloader memory leak, or of inadequate sizing for our application.

That's it for garbage collection in Java. I hope you now have an understanding of the different garbage collectors available in Java.

References: Oracle Documentation, G1 GC.