CMS and G1 Collectors: Low-Latency Garbage Collection
How CMS and G1 garbage collectors reduce pause times through concurrent marking, region-based heap layout, and incremental compaction.
Serial and Parallel collectors are honest about what they are: stop-the-world collectors that freeze your application while they work. CMS and G1 take a different approach - they try to do some of their work concurrently, while your application is still running. The result is shorter pauses, but at the cost of complexity and CPU overhead.
This post covers how CMS and G1 work, why G1 superseded CMS, and what trade-offs you make when you choose either.
Introduction
CMS (Concurrent Mark Sweep) and G1 (Garbage First) are the JVM’s low-latency garbage collectors, designed to keep pause times short by performing much of their GC work concurrently with application threads rather than stopping the world. Serial and Parallel collectors freeze your application for the entire GC duration; CMS and G1 reduce pause times by overlapping marking (and, in CMS, sweeping) with your application’s execution. G1 became the default collector in Java 9, replacing Parallel GC, and superseded CMS as the low-latency option because CMS had fundamental reliability issues (fragmentation and a fallback to a slow stop-the-world full GC). Understanding both is still essential for tuning older JVMs and reasoning about the design decisions that led to G1.
The key technical difference from older collectors is that both CMS and G1 use concurrent marking phases where application threads and GC threads run simultaneously, and G1 additionally divides the heap into equal-sized regions to enable incremental compaction and more predictable pause times. CMS uses free-list allocation which fragments memory and eventually requires a stop-the-world compaction; G1 uses a region-based layout that allows it to collect only the regions with the most garbage (“garbage first”) and compact incrementally. This post covers the internal mechanics of both, their failure modes, and the concrete scenarios where you would choose one over the other.
When to Use This Knowledge
Use when:
- Your application cannot tolerate multi-second GC pauses
- You have latency SLAs measured in hundreds of milliseconds
- Running server workloads where responsiveness matters (web servers, APIs, trading systems)
- You have heap sizes over 4GB where full stop-the-world GC becomes painful
Do not use when:
- Throughput is your only metric (batch processing, ETL)
- Your heap is small (under 1GB) and Serial/Parallel work fine
- You can already meet latency SLAs with Parallel GC
When NOT to Use This Knowledge
If your production environment runs ZGC or Shenandoah, most CMS and G1 tuning advice does not apply. These collectors work differently and require different approaches.
Java 21 LTS or later with strict latency SLAs? Start with ZGC. It compacts concurrently, so there is far less pause-time tuning to do. Time spent mastering G1 pause time targets is better spent elsewhere when ZGC already handles this for you.
CMS is gone: deprecated in Java 9 and removed in Java 14. Learning CMS collector flags and failure modes is historical knowledge now unless you are maintaining legacy Java 8 systems. For new development, use modern collectors. For continued learning on modern JVM garbage collection, explore the Advanced Java & JVM Internals roadmap.
The Concurrent Approach
The core idea behind both CMS and G1 is simple: stop-the-world pauses are the enemy of latency. Instead of doing all GC work while the world is stopped, these collectors do as much work as possible concurrently - while application threads are running.
graph TB
subgraph Serial["Serial/Parallel GC"]
direction TB
A1["Stop The World"] --> A2["GC Work"] --> A3["Stop The World"] --> A4["GC Work"]
end
subgraph Concurrent["CMS / G1"]
direction TB
B1["Initial Mark\n(STW - short)"] --> B2["Concurrent Mark"] --> B3["Remark\n(STW - short)"] --> B4["Concurrent Sweep"] --> B5["Concurrent Reset"]
end
The trade-off: doing GC work concurrently means using CPU cycles that could have gone to your application. You are buying shorter pauses with CPU time.
CMS Collector
CMS (Concurrent Mark Sweep) was the first low-latency collector in the JVM. It targets old generation GC, trying to collect it without long stop-the-world pauses.
How CMS Works
CMS phases:
- Initial Mark (Stop-The-World, short): Mark the objects directly reachable from GC roots. Fast - usually under 100ms.
- Concurrent Mark: Traverse the object graph from those objects while the application runs. Takes time but does not stop the world.
- Remark (Stop-The-World, short): Finalize marking by re-scanning references that application threads changed during the concurrent phase, so that no live object is missed.
- Concurrent Sweep: Reclaim unmarked objects while the application runs.
- Concurrent Reset: Prepare CMS data structures for the next cycle.
graph TB
A["Initial Mark\n(STW)"] --> B["Concurrent Mark\n(app running)"]
B --> C["Remark\n(STW)"]
C --> D["Concurrent Sweep\n(app running)"]
D --> E["Concurrent Reset"]
E --> A
Floating Garbage
CMS concurrent marking is imprecise. Objects that become unreachable during concurrent marking (modified by application threads) may not be reclaimed - they are called “floating garbage.” This garbage sits in the heap until the next CMS cycle.
This is a fundamental limitation of concurrent marking: you cannot stop the application while doing the full mark, so some objects slip through.
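To make the idea concrete, here is a minimal sketch of how floating garbage arises. It is a toy model, not how HotSpot represents the heap: the object names, the map-based "heap", and the `FloatingGarbageDemo` class are all assumptions made for illustration.

```java
import java.util.*;

public class FloatingGarbageDemo {
    // Toy heap: each object name maps to the names it references.
    static Set<String> mark(Map<String, List<String>> heap, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (marked.add(obj)) {
                pending.addAll(heap.getOrDefault(obj, Collections.emptyList()));
            }
        }
        return marked;
    }

    static Set<String> floatingGarbage() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("root", new ArrayList<>(Arrays.asList("a")));
        heap.put("a", new ArrayList<>(Arrays.asList("b")));
        heap.put("b", new ArrayList<>());

        // The concurrent mark has already visited everything reachable from the root.
        Set<String> marked = mark(heap, Arrays.asList("root"));

        // The application then drops the only reference to "b" mid-cycle.
        heap.get("a").clear();

        // What is actually reachable now (what a fresh stop-the-world mark would find).
        Set<String> reachable = mark(heap, Arrays.asList("root"));

        // Marked but no longer reachable: it survives this cycle as floating garbage.
        Set<String> floating = new TreeSet<>(marked);
        floating.removeAll(reachable);
        return floating;
    }

    public static void main(String[] args) {
        System.out.println("Floating garbage: " + floatingGarbage());
    }
}
```

The object "b" stays marked for the rest of the cycle even though nothing references it any more; only the next marking cycle sees it as dead.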
Key Characteristics
| Aspect | Behavior |
|---|---|
| GC threads | Multiple + concurrent |
| Pause time | Short pauses for Initial Mark and Remark |
| Throughput | Lower than Parallel - concurrent phases use CPU |
| Old gen collection | Mostly concurrent |
| Young gen collection | Serial or Parallel Copying |
| Heap fragmentation | Moderate - no compaction by default |
| Status | Deprecated in Java 9, removed in Java 14 |
CMS Failure Modes
CMS can fail in two main ways:
- Concurrent Mode Failure: CMS cannot finish collecting before old generation fills up. Triggers a full stop-the-world GC as fallback.
- Promotion Failure: Object in young generation cannot fit in old generation during minor GC, triggering full GC.
JVM Flags for CMS
-XX:+UseConcMarkSweepGC
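-XX:ConcGCThreads=N # Concurrent CMS threads (default: (ParallelGCThreads+3)/4; older JVMs called this ParallelCMSThreads)
-XX:CMSInitiatingOccupancyFraction=70 # Start CMS when old gen 70% full
-XX:+UseCMSInitiatingOccupancyOnly # Trigger only at the threshold; disable the JVM's own trigger heuristics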
G1 Collector
G1 (Garbage First) became the default collector in Java 9, replacing Parallel GC, and is the successor to CMS for low-latency workloads. It takes a fundamentally different approach: instead of treating the heap as one contiguous space, it divides the heap into many equal-sized regions.
graph TB
subgraph G1Heap["G1 Heap - Divided into Regions"]
direction TB
R1["Region 1\n(Eden)"]
R2["Region 2\n(Survivor)"]
R3["Region 3\n(Old)"]
R4["Region 4\n(Old)"]
R5["Region 5\n(Eden)"]
R6["Region 6\n(Humongous)"]
end
How G1 Works
G1 divides the heap into regions of equal size (typically 1MB to 32MB depending on heap size). Each region can be Eden, Survivor, or Old generation - the designation is fluid and changes between GC cycles.
G1 tracks live object density per region. During a GC, it collects regions with the most garbage first - hence “Garbage First.”
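The region sizing described above can be approximated with a little arithmetic. The sketch below assumes the commonly documented heuristic (aim for about 2048 regions, round down to a power of two, clamp to 1MB-32MB); the exact ergonomics differ between JDK versions, and the class and method names are hypothetical.

```java
public class G1RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048; // assumed target region count

    // Approximation: heap / 2048, rounded down to a power of two, clamped to [1MB, 32MB].
    static long regionSizeBytes(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        size = Long.highestOneBit(size); // round down to a power of two
        return Math.min(size, MAX_REGION);
    }

    public static void main(String[] args) {
        long[] heapsGb = {1, 4, 16, 64};
        for (long gb : heapsGb) {
            long region = regionSizeBytes(gb * 1024 * MB);
            System.out.println(gb + "GB heap -> " + (region / MB) + "MB regions");
        }
    }
}
```

Under these assumptions a 4GB heap lands on 2MB regions and a 64GB heap on the 32MB maximum. Setting -XX:G1HeapRegionSize overrides whatever the JVM would have picked.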
graph TB
subgraph G1Cycles["G1 GC Cycles"]
A["Young GC\n(Copying - STW)"] --> B["Concurrent Mark\n(regions)"]
B --> C["Mixed GC\n(Evacuate old regions - STW)"]
C --> A
end
G1 GC Phases
- Young GC (Stop-The-World): Copy live objects from young regions to survivor regions, or promote them to old regions. Uses multiple threads.
- Concurrent Marking: Like CMS, G1 performs concurrent marking to identify live objects across all regions. Runs in background threads.
- Remark (Stop-The-World): Like CMS, finalizes marking to account for references changed during the concurrent phase.
- Cleanup: Frees completely empty regions and prepares the per-region liveness data used to pick old regions for collection.
- Mixed GC (Stop-The-World): After marking completes, G1 may collect a mix of young and old regions. G1 selects the old regions with the lowest live ratio first.
Why G1 Replaced CMS
| Aspect | CMS | G1 |
|---|---|---|
| Heap division | Contiguous | Regions |
| Compaction | None by default (fragmentation) | Incremental (per-region) |
| Full GC | Falls back to a single-threaded full GC | Own full GC (parallel since Java 10) |
| JDK support | Removed in Java 14 | Default since Java 9 |
| Large heaps | CMS struggles >4GB | Scales well to large heaps |
| Predictability | Poor (no pause target) | Good (pause time target configurable) |
G1’s region-based design lets it compact incrementally - it does not need to compact the entire old generation at once. This makes pause times more predictable and manageable.
Pause Time Target
G1 lets you set a target for maximum GC pause time:
-XX:MaxGCPauseMillis=200 # Target 200ms max pause
G1 then tunes its collection to try to meet this target. Note: this is a target, not a guarantee. Under memory pressure, G1 may not be able to meet it.
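As a rough illustration of how a pause target can shape a collection, the sketch below picks old-region candidates "garbage first" until a pause budget is used up. It is a simplified model under stated assumptions: the `Region` class, its fields, and the per-region time predictions are hypothetical, and real G1 also accounts for young regions, remembered-set scanning, and other fixed costs.

```java
import java.util.*;

public class CollectionSetSketch {
    // Hypothetical per-region statistics; real G1 derives these from marking data and past pauses.
    static final class Region {
        final String name;
        final long garbageBytes;
        final double predictedMillis;

        Region(String name, long garbageBytes, double predictedMillis) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.predictedMillis = predictedMillis;
        }

        // Reclaimable bytes per millisecond of predicted evacuation work.
        double efficiency() { return garbageBytes / predictedMillis; }

        @Override public String toString() { return name; }
    }

    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        Comparator<Region> byEfficiency = Comparator.comparingDouble(Region::efficiency);
        sorted.sort(byEfficiency.reversed()); // most efficient regions first
        List<Region> cset = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.predictedMillis > pauseBudgetMillis) break; // budget exhausted
            cset.add(r);
            spent += r.predictedMillis;
        }
        return cset;
    }

    static List<Region> sample() {
        return Arrays.asList(
                new Region("A", 900, 30), new Region("B", 500, 10),
                new Region("C", 100, 20), new Region("D", 800, 40));
    }

    public static void main(String[] args) {
        System.out.println("Collection set for a 50ms budget: " + chooseCollectionSet(sample(), 50.0));
    }
}
```

With a 50ms budget the sketch takes B and A and stops before D, which would push the predicted pause to 80ms. A tighter target means fewer regions per pause, which is why an aggressive target can leave G1 reclaiming too little per cycle.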
Key Characteristics
| Aspect | Behavior |
|---|---|
| GC threads | Multiple + concurrent |
| Pause time | Configurable target via MaxGCPauseMillis |
| Throughput | Lower than Parallel GC (concurrent phases use CPU) |
| Heap layout | Region-based (1MB to 32MB regions) |
| Compaction | Incremental per-region |
| Young gen | Dynamic - size adjusts between Min and Max |
| Large objects | Humongous regions handle objects > 50% of region size |
JVM Flags for G1
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200 # Target max pause time
-XX:G1HeapRegionSize=N # Region size (1MB, 2MB, 4MB, 8MB, 16MB, 32MB)
-XX:InitiatingHeapOccupancyPercent=45 # Start the concurrent marking cycle at 45% heap occupancy
-XX:G1ReservePercent=10 # Keep 10% of the heap free to reduce the risk of evacuation failure
Production Failure Scenarios
1. G1 Evacuation Failure
Symptom: GC pause for evacuation failure or to-space exhausted. Application freezes.
Cause: G1 ran out of free regions to copy survivor objects into. Happens when the to-space (target regions) fills up during a young or mixed GC.
Solution:
# Increase reserve space
-XX:G1ReservePercent=15
# Increase heap if possible
-Xms4g -Xmx4g
# Relax the pause time target if it is too aggressive
-XX:MaxGCPauseMillis=500
2. CMS Concurrent Mode Failure
Symptom: CMS cannot complete before old generation fills. Full GC kicks in and freezes application.
Cause: Old generation filling faster than CMS can reclaim it. Usually means you need more heap, lower CMSInitiatingOccupancyFraction, or a different collector.
Solution:
# Start CMS earlier
-XX:CMSInitiatingOccupancyFraction=60
# Or just switch to G1
-XX:+UseG1GC
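A back-of-the-envelope check can show whether a given trigger point leaves enough headroom. The sketch below is only an estimate under simplifying assumptions (a constant promotion rate and a known CMS cycle duration, both of which you would have to measure from GC logs); the class and parameter names are hypothetical.

```java
public class CmsHeadroom {
    // Seconds until old gen is full, counted from the moment CMS starts at the trigger occupancy.
    static double secondsUntilFull(double oldGenMb, int initiatingPercent, double promotionMbPerSec) {
        double freeAtStartMb = oldGenMb * (100 - initiatingPercent) / 100.0;
        return freeAtStartMb / promotionMbPerSec;
    }

    // True when old gen is expected to fill before the concurrent cycle can finish.
    static boolean risksConcurrentModeFailure(double oldGenMb, int initiatingPercent,
                                              double promotionMbPerSec, double cycleSeconds) {
        return secondsUntilFull(oldGenMb, initiatingPercent, promotionMbPerSec) < cycleSeconds;
    }

    public static void main(String[] args) {
        // Assumed numbers: 2GB old gen, 100MB/s promotion rate, 8s CMS cycle.
        for (int threshold : new int[] {70, 60}) {
            System.out.println("Trigger at " + threshold + "%: at risk = "
                    + risksConcurrentModeFailure(2048, threshold, 100, 8));
        }
    }
}
```

With these assumed numbers, a 70% trigger leaves about 6 seconds of headroom for an 8-second cycle, so the cycle loses the race; a 60% trigger leaves about 8.2 seconds and just makes it.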
3. Humongous Object Issues (G1)
Symptom: Frequent long pauses with many humongous allocations.
Cause: Objects larger than half a G1 region (e.g., > 16MB with 32MB regions) are treated as “humongous” and collected differently. They can cause fragmentation in the humongous region space.
Solution:
# Increase region size if heap is large
-XX:G1HeapRegionSize=32m
# Avoid allocating very large buffers in hot paths
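To see why region size matters here, the following sketch checks whether an allocation would count as humongous and how much space it wastes. It follows the rule stated above (larger than half a region) and assumes humongous objects occupy whole contiguous regions; object headers are ignored and the names are hypothetical.

```java
public class HumongousCheck {
    static final long MB = 1024L * 1024L;

    // An allocation is treated as humongous when it is larger than half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Humongous objects take whole contiguous regions, rounded up.
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    // Space in the last region that the object does not use.
    static long wastedBytes(long objectBytes, long regionBytes) {
        return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
    }

    public static void main(String[] args) {
        long buffer = 5 * MB;
        for (long regionMb : new long[] {4, 16}) {
            long region = regionMb * MB;
            System.out.println(regionMb + "MB regions: humongous=" + isHumongous(buffer, region)
                    + ", regions=" + regionsNeeded(buffer, region)
                    + ", wastedMB=" + wastedBytes(buffer, region) / MB);
        }
    }
}
```

A 5MB buffer is humongous with 4MB regions (two regions, 3MB wasted) but an ordinary allocation with 16MB regions, which is the effect the region-size increase is after.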
Trade-off Table
| Configuration | When to use | Trade-off |
|---|---|---|
| -XX:+UseConcMarkSweepGC | Legacy systems, Java 8 with low-latency needs | Deprecated, no compaction, can fail badly |
| -XX:+UseG1GC | Default for most low-latency workloads | More CPU overhead than Parallel GC |
| -XX:MaxGCPauseMillis=100 | Strict latency SLA | May reduce throughput if too aggressive |
| -XX:G1HeapRegionSize=16m | Large heaps (>16GB) | Smaller regions = more regions to manage |
| -XX:CMSInitiatingOccupancyFraction=70 | Tune CMS trigger point | Too low = premature collections, too high = failure |
Implementation Snippets
Checking Which Collector Is Running
import java.lang.management.*;
import java.util.*;

public class CollectorCheck {
    public static void main(String[] args) {
        // One MXBean per collector registered in the running JVM
        List<GarbageCollectorMXBean> gcs = ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : gcs) {
            System.out.println("Collector: " + gc.getName());
            System.out.println(" Cycles: " + gc.getCollectionCount());
            // Accumulated collection time since JVM start
            System.out.println(" Time: " + gc.getCollectionTime() + "ms");
        }
    }
}
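Building on the same MXBeans, a small sketch like the one below can turn the raw counters into an average time per collection. Treat the result as a rough proxy only: the beans report accumulated elapsed collection time, which for concurrent collectors is not necessarily the same as pause time. The `GcPauseAverages` class and its helper are hypothetical names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcPauseAverages {
    // Average milliseconds per collection; 0 when nothing ran or the bean reports -1 (undefined).
    static double averageMillis(long totalMillis, long count) {
        return (count <= 0 || totalMillis < 0) ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %.2f ms/collection%n", gc.getName(),
                    averageMillis(gc.getCollectionTime(), gc.getCollectionCount()));
        }
    }
}
```

For individual pause durations, GC logs remain the more reliable source.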
Enabling G1 and Setting Pause Target
java -XX:+UseG1GC \
-XX:MaxGCPauseMillis=150 \
-XX:G1HeapRegionSize=16m \
-XX:InitiatingHeapOccupancyPercent=45 \
-Xms4g -Xmx4g \
-Xlog:gc*:file=g1.log \
-jar myapp.jar
Reading G1 GC Logs
[GC pause (G1 Evacuation Pause) 256M->245M(4G) 148.734ms]
[Ext Root Scanning: 23.4ms]
[Update RS: 12.1ms]
[Scan RS: 8.3ms]
[Object Copy: 89.2ms]
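G1 logs break down pause time by phase, making it easier to identify bottlenecks. The excerpt above is in the Java 8 -XX:+PrintGCDetails style; unified logging (-Xlog:gc*, Java 9+) reports the same information in a different layout.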
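If you want to find the dominant phase programmatically, a sketch along these lines can parse the bracketed phase lines shown in the excerpt. It assumes exactly that "[Name: 12.3ms]" shape, which varies between JDK versions and logging options, so adjust the pattern to your own logs. The class name and sample data are illustrative.

```java
import java.util.*;
import java.util.regex.*;

public class G1PhaseParser {
    // Matches lines such as "[Object Copy: 89.2ms]"; the summary line has no "name: value" pair.
    private static final Pattern PHASE = Pattern.compile("\\[([^:\\]]+): ([0-9.]+)ms\\]");

    static Map<String, Double> parsePhases(List<String> lines) {
        Map<String, Double> phases = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = PHASE.matcher(line.trim());
            if (m.matches()) {
                phases.put(m.group(1), Double.parseDouble(m.group(2)));
            }
        }
        return phases;
    }

    static String slowestPhase(Map<String, Double> phases) {
        return phases.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .map(Map.Entry::getKey)
                .orElse("none");
    }

    static List<String> sample() {
        return Arrays.asList(
                "[GC pause (G1 Evacuation Pause) 256M->245M(4G) 148.734ms]",
                "[Ext Root Scanning: 23.4ms]",
                "[Update RS: 12.1ms]",
                "[Scan RS: 8.3ms]",
                "[Object Copy: 89.2ms]");
    }

    public static void main(String[] args) {
        Map<String, Double> phases = parsePhases(sample());
        System.out.println("Phases: " + phases);
        System.out.println("Slowest: " + slowestPhase(phases));
    }
}
```

In the sample, Object Copy dominates the pause, which usually points at a large volume of live data being evacuated rather than at root or remembered-set scanning.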
Observability Checklist
- Enable GC logging: -Xlog:gc*:file=g1.log or -Xlog:gc*:file=cms.log
- Monitor pause times from GC logs - look for pauses exceeding your target
- Track jstat -gc <pid> for OC (old gen capacity) vs OU (old gen used)
- Watch for evacuation failures in G1 logs: to-space exhausted
- Monitor CMS concurrent mode failure events
- For G1: check G1HeapRegionSize is appropriate for your heap size
- Use -XX:+PrintGCDetails (Java 8) or -Xlog:gc*,gc+phases=debug (Java 9+) for phase-level timing breakdown
Security Notes
- Concurrent phases use more CPU, which can reveal allocation patterns through timing side channels
- GC logs expose memory usage patterns and object lifetime characteristics
- JMX access to GC beans should be restricted - can reveal collector internals and tuning
- Large heaps mean larger heap dumps on OOM - more data to protect
Common Pitfalls / Anti-Patterns
| Pitfall | What happens | Fix |
|---|---|---|
| Setting pause target too low | G1 uses too much CPU trying to meet target | Increase MaxGCPauseMillis or add more heap |
| Relying on CMS on Java 9+ | Deprecated since Java 9, removed in Java 14 | Migrate to G1 |
| Ignoring G1 region size | Suboptimal performance on large heaps | Set G1HeapRegionSize explicitly |
| Too small G1ReservePercent | Evacuation failures | Increase to 15-20 |
| Setting old gen threshold too high (CMS) | Concurrent mode failure | Lower CMSInitiatingOccupancyFraction |
Quick Recap Checklist
- CMS = concurrent mark-sweep, no compaction, deprecated in Java 9 and removed in Java 14
- G1 = region-based, incremental compaction, default since Java 9
- Both do work concurrently to reduce pause times
- Concurrent phases use CPU that could go to your application
- G1 pause target set via -XX:MaxGCPauseMillis=N
- G1 divides heap into 1MB-32MB regions
- CMS fallback: concurrent mode failure triggers full stop-the-world
- For Java 14+, use G1 - CMS is gone
Interview Questions
CMS treats the old generation as one contiguous space and performs mostly-concurrent marking and sweeping without compaction. G1 divides the entire heap into regions and performs incremental compaction - it compacts old generation region by region rather than all at once. This makes G1's pauses more predictable and prevents the fragmentation issues that plagued CMS.
CMS cannot finish its concurrent marking and sweeping before the old generation fills up. When this happens, the JVM falls back to a full stop-the-world GC (a single-threaded mark-sweep-compact). This is usually caused by the occupancy threshold being set too high, too many allocations requiring promotion, or the concurrent phases running too slowly relative to allocation rate. Fix by lowering -XX:CMSInitiatingOccupancyFraction or switching to G1.
G1 tracks the amount of work needed to collect each region and estimates how long it will take. It builds a collection set of regions to collect, ordered by garbage density (garbage-first). It stops adding regions to the set once it estimates the collection will exceed the pause target. This means under memory pressure, G1 may not be able to collect enough regions and may not meet the target - it is a soft goal, not a guarantee.
Objects larger than half a G1 region go into humongous regions. With 32MB regions, that means anything over 16MB; an object larger than a whole region spans several contiguous regions. Dead humongous objects are reclaimed after concurrent marking (at cleanup) or during a full GC, and JDK 8u60 and later can also reclaim some of them eagerly at young collections. The catch is that large humongous objects can fragment the humongous region space - a known rough edge of G1.
CMS had a fragmentation problem - no compaction meant free memory ended up scattered, eventually causing allocation failures even when total free memory looked fine. It fell apart on large heaps (4GB+). And when memory pressure hit, its fallback to full stop-the-world GC produced multi-second pauses that negated the whole point of using it. G1 solves all three: incremental compaction, scales to large heaps, and a more predictable fallback path.
The collection set (CSet) is the set of regions G1 has chosen to collect during a GC cycle. G1 selects regions with the highest garbage density (most free space relative to live objects) first - hence "Garbage First." Young GC includes young regions; mixed GC includes both young and old regions. The CSet is built by estimating work per region and stopping when the pause target would be exceeded.
Full compaction (like in Parallel GC) slides all live objects in old generation together in one stop-the-world operation that can take seconds on large heaps. G1 compacts incrementally - it evacuates a limited number of old regions in each mixed GC. Each evacuation is short, and G1 spreads compaction work across multiple GC cycles. The trade-off is that G1 may take longer to fully compact, but pauses remain bounded and predictable.
Each G1 region maintains a remembered set tracking references from other regions into that region. This adds memory overhead (often cited as roughly 10% of heap, though it varies with workload and JDK version) but enables G1 to collect regions independently without scanning the entire heap. Without remembered sets, G1 would need to scan the whole heap for cross-region references, losing its incremental nature. The overhead is a worthwhile trade-off for predictable pause times.
Setting MaxGCPauseMillis aggressively (e.g., 50ms) makes G1 collect smaller batches more frequently. This keeps pauses short but spends more CPU on GC overall, so your application gets less CPU time and throughput drops. The target is a soft goal - if G1 cannot reclaim enough within it, pauses may exceed the target or collection may fall behind allocation. Relaxing the target (200-500ms) allows G1 to collect more per cycle, using less CPU and giving more to your application.
Floating garbage is objects that become unreachable during concurrent marking but are not reclaimed until the next cycle. Because marking runs while the application runs, an object that was already marked (or was live when marking began) stays marked even if the application then drops its last reference. Both CMS and G1 have this: it is a cost of concurrent marking, not of sweeping or compaction. The garbage is only delayed, not lost - the next marking cycle identifies it and it is reclaimed then.
Fixed-size regions (1MB to 32MB) enable G1 to collect arbitrary subsets of the heap without contiguity constraints. Old generation does not need to be one block - it is a collection of old regions. Young generation can grow and shrink dynamically by adding or removing regions. This flexibility lets G1 balance young/old ratio dynamically and do incremental compaction without moving the entire old generation at once.
G1HeapRegionSize must be a power of 2 between 1MB and 32MB. The JVM selects region size based on heap size: smaller heaps get 1MB regions, larger heaps get 2MB, 4MB, 8MB, 16MB, or 32MB. With very large heaps (16GB+), use 16MB or 32MB regions to avoid having tens of thousands of regions, which increases remembered set overhead and management cost. With small heaps, 1MB regions provide finer-grained collection sets.
Evacuation failure (to-space exhausted or to-space overflow) occurs when G1 tries to copy live objects during young or mixed GC but runs out of free regions. The objects that cannot be copied remain in place (not collected), and G1 has to fall back to a slower path that can end in a full GC. Fix by increasing -XX:G1ReservePercent (more reserve regions), increasing total heap, lowering IHOP so marking starts earlier, or relaxing an overly aggressive pause target.
IHOP (-XX:InitiatingHeapOccupancyPercent) controls when G1 starts the concurrent marking cycle. When old generation occupancy exceeds this percentage of the total heap, G1 initiates a concurrent mark in the background. The default is 45%, and since Java 9 G1 treats it as a starting value that it adapts at run time. Lower values start marking earlier (more CPU spent on marking, more headroom before old gen fills). Higher values delay marking (less CPU overhead, risk that old gen fills before marking completes, triggering a full GC). Tune based on allocation rate - workloads that fill old gen quickly need lower IHOP.
CMS does not compact, so each allocation or promotion into the old generation needs a single free chunk that is large enough. As fragmentation increases, CMS needs more total free memory to find such a chunk, and can fail even when plenty of memory is free in aggregate. G1 compacts incrementally, so reclaimed space comes back as whole free regions. Under the same memory pressure, CMS therefore tends to fall back to a full GC earlier than G1 runs into evacuation failure.
G1 performs young GC (Eden + Survivor regions -> new Survivor/old regions) as a stop-the-world operation using multiple threads. Unlike Parallel GC, which treats the young generation as fixed spaces, G1 dynamically adjusts the number of young regions based on the pause time target. Every young GC evacuates all young regions, copying live objects to survivor regions or promoting them to old regions. The pause time target therefore works by sizing the young generation: G1 allows only as many Eden regions as it predicts it can collect within the target, which makes young pauses more predictable than with a fixed-size young generation.
G1's concurrent marking uses a snapshot-at-the-beginning (SATB) algorithm that treats every object live at the start of marking as live for the whole cycle, even if it becomes unreachable during marking. A write barrier records overwritten references so the snapshot stays intact. CMS uses incremental update: a write barrier records modified references, and the affected objects are re-scanned, mostly during remark. SATB generally keeps the remark pause shorter but can retain more floating garbage. G1 also integrates marking with young and mixed GC cycles, while CMS runs marking as a separate phase before sweeping.
Mixed GC occurs after concurrent marking completes, when G1 collects a mix of young regions and old regions with low live ratio. During a mixed GC, G1 evacuates live objects from the collection set (which now includes old regions) to other regions, updating references in the process. The number of mixed GC cycles is controlled by parameters such as -XX:G1MixedGCCountTarget and -XX:G1HeapWastePercent. After enough mixed GC cycles, old gen has been cleaned up sufficiently and G1 returns to young-only collections until the next concurrent marking cycle.
Objects larger than half a G1 region are called humongous objects. They are allocated directly into humongous regions: a single region if the object fits in one, otherwise a run of contiguous regions. Dead humongous objects are reclaimed after concurrent marking (at cleanup) or during a full GC, not by evacuation in young or mixed GC. G1 does not copy them during young or mixed GCs because moving such large objects would be expensive. This makes them a fragmentation risk, since each one needs contiguous free regions.
G1HeapRegionSize must be a power of 2 between 1MB and 32MB. When it is not set, the JVM derives it from the heap size, aiming for roughly 2048 regions within the 1MB minimum and 32MB maximum. For heaps under 4GB, 1MB or 2MB regions work well. For heaps between 4GB and 16GB, 4MB or 8MB regions. For heaps above 16GB, 16MB or 32MB regions reduce the remembered set overhead since each region has its own remembered set. Having too many small regions increases bookkeeping overhead.
Further Reading
- G1: The Garbage-First Collector - Original paper by Detlefs, Flood, Heller, and Printezis (ISMM 2004) on G1 design
- Understanding G1 GC Logs - Red Hat guide to interpreting G1 output
- Java 9 G1 as Default GC - Migration Guide - Practical guide for switching to G1
- CMS to G1 Migration - AWS guidance on moving from deprecated CMS
Conclusion
CMS achieves low latency through concurrent marking and sweeping, but lacks compaction and is gone from current JDKs (deprecated in Java 9, removed in Java 14). G1 superseded it by dividing the heap into regions and performing incremental compaction - it collects garbage-first regions to meet pause time targets. Use G1 for low-latency workloads on current JVMs, and consider ZGC or Shenandoah for strict sub-millisecond SLA requirements.