JVM GC Tuning: Heap Sizing and Threshold Optimization
Practical strategies for sizing JVM heap, tuning generation ratios, and optimizing GC thresholds to reduce pause times and improve throughput.
G1 is usually the right collector for most workloads. Leave it running with defaults and you are leaving performance on the table. The tuning work is about matching the JVM’s knobs to what your application actually does. Done right, a few flag changes can cut pause times in half.
This covers the heap sizing decisions that matter, the generation ratio knobs, the threshold settings that control when GC fires, and how to read the signals your application is sending through GC logs.
Introduction
G1 is the right collector for most workloads, but leaving it running with defaults means you are leaving performance on the table. The JVM’s GC tuning knobs are numerous, but only a few of them interact with your application’s actual behavior in ways that meaningfully impact pause times and throughput. Done right, a few well-chosen flag changes can cut pause times in half and reduce the percentage of time your application spends in GC.
GC tuning is fundamentally about matching the JVM’s knobs to what your application actually does. The heap must be large enough to hold your live set plus working memory during peak allocation bursts, but not so large that Full GC pause times become unacceptable. The generation ratios must match your objects' lifetime distribution: too much heap in young gen leaves less room for long-lived data in old gen, while too little causes premature promotion that floods old gen. These decisions are driven by data from GC logs and profiling, not by intuition.
This guide covers the heap sizing decisions that matter most, the generation ratio knobs that control promotion rates, the threshold settings that determine when GC fires, and how to read the signals your application sends through GC logs. You will learn practical tuning patterns for throughput-focused batch workloads, latency-sensitive API servers, and ultra-low-latency trading systems, along with the pitfall patterns that catch teams who tune without measuring.
When to Use This Knowledge
Use when:
- GC pauses are exceeding your SLA targets despite using G1 or ZGC
- Your application is spending more than 10% of time in GC
- You are tuning for a specific workload profile (batch vs latency-sensitive)
- You need to right-size a JVM deployment in a containerized environment
Do not use when:
- Throughput is the only metric and pauses are acceptable (Parallel GC may be simpler)
- You have not profiled your application to understand allocation rate first
- Your heap is undersized to the point where GC fires constantly regardless of tuning
When NOT to Use
GC tuning is not always worth the effort. For small heaps under 2GB, JVM defaults often work well enough that aggressive tuning produces marginal returns. A 512MB heap with default G1 settings might see 20-50ms pauses, acceptable for many batch jobs and internal services.
Simple workloads like REST APIs that allocate mostly short-lived objects, or CRUD applications with predictable traffic patterns, rarely need deep GC tuning. If your application allocates cleanly, its objects die young, and pauses stay under 200ms, defaults are probably fine. Tuning knobs solve problems. If you do not have the problems, tuning just adds complexity.
Premature optimization applies here like everywhere else. Spending two days tuning GC when your real bottleneck is a missing database index is a poor trade. Profile and measure before assuming GC is the issue. If jstat shows GC time under 5% of total runtime, tuning will not move the needle.
Heap Sizing Fundamentals
Setting heap size is the most important GC decision you make. Too small and you GC constantly. Too large and you waste memory or cause long Full GC pauses.
The Heap Size Equation
Your heap needs to comfortably hold your live set plus working memory during peak load, with enough headroom that GC does not fire before your request completes.
Heap Required = Live Set + (Allocation Rate × GC Interval) + Safety Margin
If your live set is 2GB and you allocate 500MB per second with a 5-second GC interval, you need at least 2GB + 2.5GB = 4.5GB minimum before accounting for safety margin.
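The equation above is easy to turn into a small calculator. The following is a minimal sketch, not a sizing tool: the class name `HeapSizing`, the method name, and the 20% safety margin are assumptions chosen for illustration, and the inputs should come from your own GC logs.

```java
/** Back-of-the-envelope heap estimate based on the sizing equation above. */
final class HeapSizing {

    /**
     * @param liveSetMb         live data retained after a collection, in MB
     * @param allocRateMbPerSec allocation rate at peak load, in MB per second
     * @param gcIntervalSec     desired time between collections, in seconds
     * @param safetyMargin      extra fraction added on top, e.g. 0.20 (assumed value)
     */
    static long requiredHeapMb(long liveSetMb, long allocRateMbPerSec,
                               long gcIntervalSec, double safetyMargin) {
        // Live set plus everything allocated between two collections
        long base = liveSetMb + allocRateMbPerSec * gcIntervalSec;
        return Math.round(base * (1.0 + safetyMargin));
    }

    public static void main(String[] args) {
        // The example from the text: 2GB live set, 500MB/s, 5 second interval
        System.out.println("Minimum heap (MB): " + requiredHeapMb(2000, 500, 5, 0.0));
        System.out.println("With 20% margin (MB): " + requiredHeapMb(2000, 500, 5, 0.20));
    }
}
```

Treat the result as a starting point for an experiment, then confirm it against GC logs under real load.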
Initial vs Maximum Heap
-Xms sets the initial heap size. -Xmx sets the maximum. The JVM can grow and shrink between these bounds if adaptive sizing is enabled.
-Xms4g -Xmx4g # Fixed heap - no resizing
-Xms4g -Xmx8g # Grows as needed up to 8GB
Rule of thumb: Set -Xms and -Xmx to the same value in production. Heap resizing creates GC overhead and introduces unpredictable pause spikes.
Sizing for Containers
Container memory limits interact with JVM heap in ways that bite a lot of people.
# BAD: no explicit heap size. JVMs older than 8u191/10 ignore the container limit; newer ones cap the heap at 25% of it by default
docker run --memory=4g java -jar app.jar
# GOOD: Size the heap explicitly and leave headroom for Metaspace, native memory, and direct buffers
docker run --memory=4g java -Xmx3500m -Xms3500m -jar app.jar
Do not set -Xmx to your container limit. The OS, Metaspace, thread stacks, and direct buffers all need memory outside the Java heap. Leave 10-15% for non-heap memory.
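The headroom rule can be sketched as a one-line calculation, with a check of what the running JVM actually sees. This is an illustrative sketch: `ContainerHeap` and `maxHeapMb` are hypothetical names, and the 15% non-heap fraction is the upper end of the range suggested above, not a measured value.

```java
/** Derives a heap size from a container memory limit, leaving room for non-heap memory. */
final class ContainerHeap {

    /** Heap size in MB that leaves the given fraction of the limit for non-heap use. */
    static long maxHeapMb(long containerLimitMb, double nonHeapFraction) {
        if (nonHeapFraction < 0.0 || nonHeapFraction >= 1.0) {
            throw new IllegalArgumentException("nonHeapFraction must be in [0, 1)");
        }
        return (long) Math.floor(containerLimitMb * (1.0 - nonHeapFraction));
    }

    public static void main(String[] args) {
        long heapMb = maxHeapMb(4096, 0.15); // 4GB container, 15% kept for non-heap memory
        System.out.println("Suggested flags: -Xms" + heapMb + "m -Xmx" + heapMb + "m");

        // What this JVM believes it has been given; run it inside the container to verify
        Runtime rt = Runtime.getRuntime();
        System.out.println("JVM max heap (MB): " + rt.maxMemory() / (1024 * 1024));
        System.out.println("JVM visible CPUs: " + rt.availableProcessors());
    }
}
```

Running the `main` method inside the container is a quick way to confirm that the heap and CPU counts the JVM reports match what you intended.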
Generation Ratio Tuning
The heap is split between young and old generations. Getting this ratio wrong causes either premature promotion (flooding old gen with short-lived objects) or too much time spent in minor GC (young gen too small).
Young Generation Sizing
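-Xmn sets the young generation size directly. Alternatively, -XX:NewRatio=2 means old gen is 2x young gen (young = 1/3 of heap). With G1, fixing the young generation size this way overrides the pause-time goal, so these flags are a better fit for Parallel GC.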
-Xmn1536m # Set young gen to 1.5GB directly
-XX:NewRatio=2 # Old gen is 2x young gen (heap / 3 for young)
Survivor Space Tuning
Objects age between Survivor spaces (S0 and S1) before promotion. The key knobs:
-XX:SurvivorRatio=8 # Eden/Survivor ratio. Default 8 means Eden=8/10 of young gen
-XX:MaxTenuringThreshold=15 # Max age before promotion to old gen
Common misconfigs:
| Misconfig | What happens | Fix |
|---|---|---|
| SurvivorRatio too high | Survivor spaces too small, objects promote prematurely | Lower to 4 or 6 |
| SurvivorRatio too low | Too much memory in survivors, less Eden | Raise to 8-10 |
| MaxTenuringThreshold too high | Objects linger in young gen too long | Lower to 10-12 |
| MaxTenuringThreshold too low | Objects flood old gen | Raise to 15 |
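To see what a given pair of ratios means in megabytes, the arithmetic can be sketched as follows. This is an approximation under stated assumptions: `GenerationLayout` is a hypothetical helper, it ignores the alignment and rounding the JVM applies, and it models the fixed layout used by the Serial and Parallel collectors rather than G1's region-based sizing.

```java
/** Approximate generation sizes implied by NewRatio and SurvivorRatio (ignores JVM alignment). */
final class GenerationLayout {
    final long youngMb;
    final long oldMb;
    final long edenMb;
    final long survivorMb; // size of ONE survivor space; there are two

    GenerationLayout(long heapMb, int newRatio, int survivorRatio) {
        // NewRatio = old / young, so young gen is 1 / (NewRatio + 1) of the heap
        youngMb = heapMb / (newRatio + 1);
        oldMb = heapMb - youngMb;
        // SurvivorRatio = eden / one survivor, and young = eden + 2 survivors
        survivorMb = youngMb / (survivorRatio + 2);
        edenMb = youngMb - 2 * survivorMb;
    }

    @Override
    public String toString() {
        return "young=" + youngMb + "MB (eden=" + edenMb + "MB, 2 x survivor="
                + survivorMb + "MB), old=" + oldMb + "MB";
    }

    public static void main(String[] args) {
        System.out.println(new GenerationLayout(3840, 2, 8)); // default-style ratios
        System.out.println(new GenerationLayout(3840, 2, 4)); // larger survivor spaces
    }
}
```

Comparing the two printed layouts shows the trade-off from the table: lowering SurvivorRatio grows each Survivor space at the expense of Eden.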
Tenuring Threshold Dynamics
The JVM adjusts the tenuring threshold and generation sizes adaptively based on allocation behavior. For Parallel GC this is controlled by -XX:+UseAdaptiveSizePolicy (on by default); G1 resizes its young generation from its pause-time goal instead.
# Disable adaptive sizing if it causes instability
-XX:-UseAdaptiveSizePolicy
# Then set explicit ratios
-XX:NewRatio=2
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=15
The risk with adaptive sizing in production is that heap resizing events can trigger during critical periods, causing GC spikes you cannot predict.
GC Threshold Tuning
Different collectors and configurations trigger GC at different occupancy thresholds. Understanding these triggers helps you tune proactively rather than reactively.
G1 Heap Occupancy Threshold
G1 starts a concurrent GC cycle when the heap reaches a certain occupancy percentage.
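-XX:InitiatingHeapOccupancyPercent=45 # Start concurrent marking when the heap is 45% occupied (this is the default)
Lower values start marking earlier (more concurrent CPU work, lower risk of running out of space before the cycle finishes). Higher values delay it (less CPU, higher risk of evacuation failures or a Full GC when the cycle starts too late).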
Parallel GC Full GC Threshold
Parallel GC triggers Full GC when old gen cannot accommodate a promotion.
-XX:OldSize=2g # Set the initial old gen size (most useful with a fixed heap)
ZGC Allocation Stall Threshold
ZGC has a concept of allocation stalls: an allocating thread blocks when ZGC cannot free memory fast enough to satisfy the request.
-XX:ZCollectionInterval=120 # Target GC every 120 seconds
-XX:+ZProactive # Enable proactive GC (default on)
-XX:+ZProactive tells ZGC to run GC cycles before memory runs out, which is usually what you want for low-latency workloads.
Shenandoah Heuristics
Shenandoah uses heuristics to decide when to run GC.
-XX:ShenandoahGCHeuristics=adaptive # Default - adapts to workload
-XX:ShenandoahGCHeuristics=static # Triggers on fixed heap-occupancy thresholds
-XX:ShenandoahGCHeuristics=compact # More aggressive compaction
The adaptive heuristic usually works best, but compact can help if fragmentation is causing allocation failures.
Collector-Specific Tuning
G1 Tuneables
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200 # Target max pause (soft goal)
-XX:G1HeapRegionSize=16m # Region size (1, 2, 4, 8, 16, 32 MB)
-XX:G1ReservePercent=15 # Reserve space for promotion failures
Setting MaxGCPauseMillis too aggressively can backfire. G1 will use more CPU trying to meet the target, which can reduce throughput in CPU-bound workloads.
Parallel GC Tuneables
-XX:+UseParallelGC
-XX:+UseParallelOldGC # Parallel old gen; implied by UseParallelGC, deprecated in JDK 14 and removed in JDK 15, so omit it on current JDKs
-XX:ParallelGCThreads=16 # Explicit GC thread count
-XX:+UseAdaptiveSizePolicy # Auto-tune heap sizes
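On a 32-core machine, the JVM defaults to 23 GC threads (8 plus 5/8 of the cores beyond 8), which can still be too many when application threads need the CPU. Leave at least 1-2 cores' worth of headroom for application threads.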
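HotSpot's documented heuristic for the default ParallelGCThreads value is simple enough to reproduce. The sketch below implements that heuristic so you can compare it with the count you plan to set; the class and method names are hypothetical.

```java
/** Reproduces HotSpot's default ParallelGCThreads heuristic for a given CPU count. */
final class GcThreads {

    /** All CPUs up to 8, then 5/8 of every CPU beyond 8 (integer arithmetic, as in the JVM). */
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        // availableProcessors() reflects container CPU limits on container-aware JVMs
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("CPUs visible to the JVM: " + cpus);
        System.out.println("Default ParallelGCThreads: " + defaultParallelGcThreads(cpus));
    }
}
```

If the computed default leaves too little CPU for your application threads, set -XX:ParallelGCThreads explicitly and compare throughput before and after.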
Production Tuning Patterns
Pattern 1: Throughput-Focused Batch
For batch jobs where pause time does not matter but overall time does:
# -XX:+UseParallelOldGC is left out: -XX:+UseParallelGC already implies it, and JDK 15+ no longer supports the flag
java -XX:+UseParallelGC \
-Xms8g -Xmx8g \
-XX:ParallelGCThreads=16 \
-XX:+UseAdaptiveSizePolicy \
-Xlog:gc*:file=gc.log \
-jar batch-app.jar
Key: Large fixed heap, parallel threads, adaptive sizing OK for batch.
Pattern 2: Low-Latency API Server
For latency-sensitive services where pause spikes matter:
java -XX:+UseG1GC \
-Xms4g -Xmx4g \
-XX:MaxGCPauseMillis=100 \
-XX:G1HeapRegionSize=8m \
-XX:InitiatingHeapOccupancyPercent=45 \
-XX:G1ReservePercent=15 \
-Xlog:gc*:file=g1.log \
-jar api-app.jar
Key: G1 with pause target, moderate heap, headroom for evacuation failures.
Pattern 3: Ultra-Low Latency (ZGC)
For sub-millisecond pause requirements:
java -XX:+UseZGC \
-Xms64g -Xmx64g \
-XX:ZCollectionInterval=120 \
-XX:+ZProactive \
-Xlog:gc*:file=zgc.log \
-jar trading-app.jar
Trade-off Table
| Configuration | Benefit | Cost |
|---|---|---|
| -Xms=-Xmx (fixed heap) | No resizing GC overhead | Wasted memory if overprovisioned |
| -XX:NewRatio=2 | Controls promotion rate | May be wrong for your workload |
| -XX:SurvivorRatio=4 | Larger survivors, slower promotion | Less Eden space, more minor GC |
| -XX:MaxGCPauseMillis=50 | Shorter pause target | More CPU overhead, potentially lower throughput |
| -XX:InitiatingHeapOccupancyPercent=30 | Earlier GC start | More concurrent cycles, less old gen usage |
| -XX:G1ReservePercent=20 | Reduces evacuation failures | Less memory for allocations |
| -XX:ParallelGCThreads=8 | Avoids GC thread contention | Slower compaction if too few threads |
Observability Checklist
- Enable GC logging: `-Xlog:gc*:file=gc.log`
- Run `jstat -gc <pid>` to see heap usage per space (Eden, S0, S1, Old)
- Track pause times from logs - look for `user=` vs `real=` to spot thread contention
- Monitor promotion rate: S0/S1 occupancy growth vs Eden allocation rate
- For G1: watch `g1HeapRegionCount` and evacuation failure logs
- For ZGC: watch allocation stalls in `ZAllocation Stall` log lines
- For Shenandoah: try different heuristics and compare pause distributions
- Set up GC metrics in your monitoring system (Prometheus with JMX exporter)
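The same counters a JMX exporter scrapes are available in-process through the standard `java.lang.management` API. The sketch below estimates the share of uptime spent in GC; the class name `GcOverhead` is hypothetical, and for concurrent collectors some beans report cycle time rather than pause time, so read the percentage as a rough signal.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Estimates GC overhead from the platform's GarbageCollectorMXBeans. */
final class GcOverhead {

    /** Accumulated collection time across all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** GC time as a percentage of JVM uptime. */
    static double gcOverheadPercent() {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMs <= 0 ? 0.0 : 100.0 * totalGcMillis() / uptimeMs;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d time=%dms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.printf("GC overhead: %.2f%% of uptime%n", gcOverheadPercent());
    }
}
```

Logging this value periodically gives a cheap early warning before the 5% and 10% thresholds mentioned earlier are crossed.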
Security Notes
- GC logs reveal allocation patterns - protect them in production environments
- Tuning flags can mask underlying issues - do not just tune away problems without understanding root cause
- Heap dumps triggered after OOM contain full application state - treat as sensitive data
- JMX access to GC MBeans should be restricted - can reveal tuning configuration
Common Pitfalls / Anti-Patterns
| Pitfall | What happens | Fix |
|---|---|---|
| -Xmx = container limit | OOMKilled because OS/native need memory | Set -Xmx to 85% of container limit |
| -XX:+UseAdaptiveSizePolicy in prod | Unpredictable heap resizing during load | Disable and set explicit ratios |
| Too many GC threads | GC threads fight app threads for CPU | Set -XX:ParallelGCThreads=N explicitly |
| MaxGCPauseMillis too aggressive | G1 uses excessive CPU trying to meet target | Relax to 200-500ms |
| NewRatio wrong for workload | Premature promotion or minor GC thrashing | Profile allocation and age distribution |
| Ignoring Metaspace | Metaspace OOM despite heap headroom | Set -XX:MaxMetaspaceSize if capped |
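Metaspace usage is easy to overlook because heap tools do not show it. A minimal sketch for checking it from inside the application follows, assuming a HotSpot JVM where the memory pool is named "Metaspace"; `MetaspaceCheck` is a hypothetical helper name.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Looks up Metaspace usage through the standard memory pool MXBeans. */
final class MetaspaceCheck {

    /** Usage of the pool named "Metaspace", or null if this JVM exposes no such pool. */
    static MemoryUsage metaspaceUsage() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if ("Metaspace".equals(pool.getName())) {
                return pool.getUsage();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MemoryUsage usage = metaspaceUsage();
        if (usage == null) {
            System.out.println("No Metaspace pool found on this JVM");
            return;
        }
        long mb = 1024 * 1024;
        System.out.println("Metaspace used (MB): " + usage.getUsed() / mb);
        System.out.println("Metaspace committed (MB): " + usage.getCommitted() / mb);
        // getMax() returns -1 when no limit is set, i.e. -XX:MaxMetaspaceSize is absent
        System.out.println("Metaspace max (bytes, -1 = unlimited): " + usage.getMax());
    }
}
```

A steadily growing "used" value under stable load is the usual sign of a class-loading leak rather than a sizing problem.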
Quick Recap Checklist
- Set `-Xms` and `-Xmx` to the same value in production
- Leave 10-15% headroom for non-heap memory (metaspace, native, buffers)
- Profile before tuning - know your live set and allocation rate
- SurvivorRatio=4-6 for high allocation rates, 8 for normal workloads
- `-XX:+UseAdaptiveSizePolicy` is useful but can cause instability
- MaxGCPauseMillis is a target, not a guarantee - do not set too aggressively
- ParallelGCThreads: leave at least 1-2 cores for app threads
- G1ReservePercent=15-20 prevents evacuation failures on loaded systems
- ZCollectionInterval + ZProactive keep ZGC running proactively
- Always validate with GC logs after changing tuning parameters
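Before reading GC logs, it helps to confirm that the flags you set are the ones the JVM is actually running with. The sketch below reads effective values through `HotSpotDiagnosticMXBean`; it assumes a HotSpot-based JVM, and `EffectiveFlags` is a hypothetical helper name.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Prints the effective values of a few GC-related HotSpot flags. */
final class EffectiveFlags {

    /** Current value of a HotSpot flag; throws IllegalArgumentException for unknown names. */
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = {
            "InitialHeapSize", "MaxHeapSize", "NewRatio", "SurvivorRatio",
            "MaxTenuringThreshold", "ParallelGCThreads"
        };
        for (String name : names) {
            try {
                System.out.println(name + " = " + flag(name));
            } catch (IllegalArgumentException e) {
                // The flag does not exist in this JVM build or version
                System.out.println(name + " is not available on this JVM");
            }
        }
    }
}
```

From the command line, `java -XX:+PrintFlagsFinal -version` prints the same information for every flag.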
Interview Questions
Start with your live set size (measure with heap dumps at stable load), then add working memory based on your allocation rate times your GC interval. If you do not know your allocation rate, enable GC logging for a few days under normal load and analyze the throughput statistics. A common starting point is 1/4 of available system memory for the heap, but this varies by workload. Batch workloads that hold large data structures need more heap; stateless request-response services can often run with less.
`-Xmn` sets the young generation size directly to a fixed value. `-XX:NewRatio` sets the ratio between old and young (NewRatio=2 means old gen is 2x young gen). `-XX:NewSize` sets the minimum young generation size but allows it to grow within the overall heap. `-Xmn` is the simplest and most predictable for production tuning; use it when you know the right size. NewRatio is convenient when you want heap sizing to scale proportionally with total heap size.
G1 estimates the cost of collecting each region and limits how many regions it takes per pause so that the pause stays within the target. When you set MaxGCPauseMillis too aggressively (e.g., 50ms instead of 200ms), G1 shrinks the young generation, collects more often, and uses more CPU trying to keep pauses short. On CPU-bound workloads, this steals cycles from your application threads and reduces overall throughput. The target is a soft goal, not a guarantee, and treating it as a hard requirement is a common misstep.
Evacuation failure (the "to-space exhausted" message) happens when G1 runs out of free regions to copy surviving objects into during a young or mixed GC. Fix by increasing `-XX:G1ReservePercent` (from the default 10 to 15-20), increasing total heap size, or lowering `-XX:InitiatingHeapOccupancyPercent` so that concurrent marking starts earlier. If evacuation failures are frequent, your old gen is filling faster than G1 can reclaim it, so check whether objects are being promoted prematurely from young gen.
Use ZGC when your pause time SLA is sub-millisecond or when your heap is very large (16GB+). G1's incremental compaction still produces pauses that scale with heap size under memory pressure; ZGC's pauses do not grow with heap size and are typically under a millisecond on JDK 16 and later (under 10ms on earlier releases). The trade-off is that ZGC requires Java 11+ (production-ready since JDK 15) and adds 5-15% CPU overhead from its load barrier. ZGC on JDK 11 also does not unload classes; class unloading arrived in JDK 12, so applications that load classes dynamically should run a newer release. For most workloads under 16GB, G1 with good tuning is sufficient.
SurvivorRatio controls the size of Eden relative to each Survivor space (Eden/Survivor). With SurvivorRatio=8 (default), each Survivor is 1/10 of young gen. Lower values (4-6) mean larger Survivor spaces, giving objects more time to age before promotion. Higher values (10+) mean smaller Survivors, causing premature promotion. If you see high promotion to old gen because Survivor spaces overflow, try lowering SurvivorRatio to 4-6.
G1ReservePercent (default 10) sets aside a portion of heap as a safety margin for promotion. When G1 evacuates objects from one region to another, it needs free target regions. The reserve ensures there are regions available even when heap is nearly full. Without enough reserve, G1 runs out of to-space during evacuation, causing the failure. Increase to 15-20 if you see evacuation failures, but note this reduces memory available for allocations.
GC threads compete with application threads for CPU cores. With too many GC threads (e.g., 31 threads on a 32-core machine), context switching and cache thrashing reduce both GC efficiency and application throughput. A common guideline is to set ParallelGCThreads to (cores - 1) or (cores - 2), leaving headroom for application threads. On hyperthreaded cores, count logical cores conservatively - 16 physical cores with 32 logical may only need 13-14 GC threads.
JVMs older than Java 10 (or 8u191) do not detect container memory limits and size the heap from the host's memory. Newer JVMs enable container support by default (-XX:+UseContainerSupport) and read the cgroup limit, but the default maximum heap is only 25% of it. Either way, do not set -Xmx to the container limit (e.g., 4GB): Metaspace, thread stacks, and direct buffers come out of the same 4GB, and the container gets OOMKilled. Leave 10-15% headroom: if the container limit is 4GB, set -Xmx to 3500m or 3600m, or express the heap as a share of the limit with -XX:MaxRAMPercentage.
If GC fires every few seconds even with plenty of free heap, the problem is allocation rate, not total heap size. Your application creates objects faster than minor GC can reclaim them. Either the heap is too small for your allocation burst window, or objects that should die young are surviving to old gen. Use -Xmn to increase young gen, lower SurvivorRatio to give objects more aging time, or optimize your application to reduce allocation rate.
UseAdaptiveSizePolicy (enabled by default) makes the JVM automatically adjust generation and Survivor sizes based on allocation behavior. This can find good settings automatically but introduces unpredictability - resizing events cause GC pauses you cannot forecast. For production latency SLAs, disable it with -XX:-UseAdaptiveSizePolicy and set explicit -Xmn, -XX:NewRatio, and -XX:SurvivorRatio. This gives you reproducible behavior and lets you tune based on actual measurements.
Set -Xms equal to -Xmx in production. When -Xms < -Xmx, the JVM grows and shrinks the heap as needed, and the resizing is tied to GC activity. Growing can cause pauses; shrinking typically follows a full collection that consolidates the heap before memory is released. Both create unpredictable pause spikes. Fixed heap (-Xms=-Xmx) eliminates this overhead and gives you consistent, measurable performance. Only use dynamic sizing during development or when you genuinely need elastic memory.
Metaspace lives in native memory outside the heap. Heap tuning (-Xms, -Xmx, -Xmn) does not affect Metaspace. Applications using bytecode generation (CGLIB, OSGi), JSP containers, or dynamic proxies can accumulate class metadata and exhaust Metaspace. Set -XX:MaxMetaspaceSize as a hard limit if you need to contain native memory. When Metaspace fills, you get OutOfMemoryError: Metaspace - increasing -Xmx does not help.
Look at the ratio of user time to real time for parallel collectors - if user=160 and real=20 on 8 threads, GC scaled well. For G1, examine pause times in the logs: if most pauses are under your MaxGCPauseMillis target, tuning is on track. If evacuation failures appear (to-space exhausted), increase G1ReservePercent or heap. High Full GC frequency with low old gen usage indicates premature promotion - tune SurvivorRatio and MaxTenuringThreshold.
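The user-to-real ratio can be pulled out of a log line mechanically. The following is a minimal sketch, assuming CPU-time lines shaped like the legacy `user=... real=...` format or the unified-logging `User=...s ... Real=...s` format; `GcCpuLine` is a hypothetical helper, and the sample line in `main` is illustrative, not taken from a real log.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts the user/real CPU time ratio from a GC log line. */
final class GcCpuLine {
    // Case-insensitive so it matches both "user=0.16" and "User=0.16s"
    private static final Pattern TIMES = Pattern.compile(
            "user=(\\d+(?:\\.\\d+)?)s?.*?real=(\\d+(?:\\.\\d+)?)s?",
            Pattern.CASE_INSENSITIVE);

    /** Returns user/real, or -1 if the line has no CPU times or real is zero. */
    static double parallelism(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) {
            return -1;
        }
        double user = Double.parseDouble(m.group(1));
        double real = Double.parseDouble(m.group(2));
        return real == 0.0 ? -1 : user / real;
    }

    public static void main(String[] args) {
        // Illustrative line only; substitute lines from your own gc.log
        String line = "[2.345s][info][gc,cpu] GC(12) User=0.16s Sys=0.01s Real=0.02s";
        System.out.printf("Effective GC parallelism: %.1f%n", parallelism(line));
    }
}
```

A ratio close to the configured GC thread count suggests the threads were kept busy; a ratio far below it points to contention or too many threads.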
Larger heap means fewer GC cycles but longer pause times when they occur (more objects to process). Smaller heap means more frequent pauses but each one is shorter. For throughput workloads (Parallel GC), larger heap wins because total GC time decreases even with longer individual pauses. For latency workloads (G1, ZGC), smaller heap with more frequent collections keeps pauses predictable and under target. The optimal size balances your SLA requirements against available memory.
For latency-critical workloads, set MaxGCPauseMillis to your SLA target (e.g., 50ms), increase G1ReservePercent to 20 to prevent evacuation failures, set InitiatingHeapOccupancyPercent lower (e.g., 40) to start concurrent marking earlier, and consider using -XX:G1HeapRegionSize=16m or 32m for large heaps. If latency still exceeds target, reduce the amount of work per GC cycle at the cost of more frequent concurrent cycles. Profile with GC logs to confirm pauses are within target and evacuation failures are rare.
HotSpot's default for ParallelGCThreads equals the core count on machines with 8 or fewer cores and 8 + (cores - 8) * 5 / 8 above that, which gives 23 threads on a 32-core machine. Override it when the JVM shares the machine with other work: a common guideline is to leave at least 1-2 cores for application threads, and 30-31 threads on a 32-core machine is typically too many. On hyperthreaded cores, count physical cores only and leave headroom. Monitor user/real time (e.g., user=160 real=20 on 8 threads means good scaling).
If -Xmn is explicitly set, it takes precedence and NewRatio is ignored for young generation sizing. -Xmn sets young gen directly to the specified size. NewRatio controls the ratio between old and young (old = NewRatio * young). With fixed heap (-Xms=-Xmx), setting -Xmn implicitly determines old gen size. With adaptive sizing enabled, the ratio still applies but JVM can adjust within bounds. Explicit -Xmn is the cleanest approach for production because it eliminates ambiguity.
MaxTenuringThreshold controls how many minor GCs an object can survive before it is promoted to old gen. Higher values (up to 15) let objects age longer in Survivor spaces, which helps when objects tend to become unreachable after a few minor GCs (typical for many applications). Lower values promote objects earlier, which saves copying between Survivor spaces when most survivors are long-lived anyway. If short-lived objects are flooding old gen, the usual cause is Survivor overflow rather than the threshold itself, so enlarge the Survivor spaces (lower SurvivorRatio) before lowering the threshold.
Switch when your P99 latency SLA is sub-10ms and G1 cannot consistently meet it, or when your heap is 16GB+ and G1 pause times are growing unacceptable despite tuning. ZGC requires Java 11+; Shenandoah works on Java 8+ (via backport). CPU overhead for both is 5-15% higher than G1 due to load barriers. Before switching, confirm the latency issue is actually GC-caused by analyzing GC logs and JFR data. If the issue is application-level (lock contention, I/O), collector tuning won't help.
Further Reading
- HotSpot GC Tuning Guide - Oracle’s comprehensive JVM GC tuning documentation
- Getting Started with G1 GC - Red Hat’s practical G1 introduction and tuning walkthrough
- JVM GC Flags Reference - Complete reference for GC-related JVM flags
- Analyzing GC Logs with GCViewer - Open source tool for visualizing GC performance data
Conclusion
GC tuning starts with correct heap sizing (live set + allocation rate × GC interval + headroom) and matching the collector to your workload (Parallel for throughput, G1 for balanced, ZGC/Shenandoah for sub-ms latency). Key tuning levers include SurvivorRatio and MaxTenuringThreshold for promotion rate, MaxGCPauseMillis for G1 pause targets, and ParallelGCThreads to avoid CPU contention. Always validate changes against GC logs — the data tells you whether your tuning is working or masking an underlying problem.