Java Flight Recorder: Continuous Monitoring and Diagnostics

Learn how Java Flight Recorder captures low-level diagnostics, profiling data, and continuous monitoring events from the JVM in production environments.

Author: GeekWorkBench · Reading time: 22 min

Java Flight Recorder (JFR) is the JVM’s built-in profiling and diagnostics engine that collects low-level runtime data with minimal performance overhead. Originally a commercial feature in Oracle JDK, it went open source with OpenJDK 11 and became the standard tool for production diagnostics in modern Java applications.

This guide covers how JFR works internally, when to reach for it versus other tools, and how to actually get useful data out of it.

Introduction

Unlike external profilers that attach to a running process and infer behavior from the outside, JFR lives inside the JVM runtime itself, with access to internal events that external tools simply cannot see. GC pauses, safepoint operations, TLAB allocations, JIT compilation, and lock contention are all visible to JFR in ways they are not to profilers that sit outside the JVM process.

The value of JFR is that it changes what is practically measurable in production. Traditional profiling imposes significant overhead and often is not safe to run in production environments. JFR is designed to run continuously at roughly 1-2% overhead with its default configuration (heavier event sets such as the profile preset cost more), making it viable for always-on production monitoring. The data it collects is stored in binary .jfr files and can be analyzed afterward with JDK Mission Control (JMC) or the jfr CLI, giving you post-incident evidence that answers questions about what the JVM was actually doing at the moment of failure.

This guide covers how JFR works internally, its event types and collection pipeline, and how to configure recordings for different diagnostic scenarios. You will learn when to use continuous production recording versus targeted profiling runs, how to analyze recordings to find memory leaks, latency spikes, and lock contention, and why the chunked binary format enables streaming analysis while recordings are still in progress. Production failure scenarios show how JFR data has diagnosed issues that other tools missed entirely.

What is Java Flight Recorder?

JFR is a continuous monitoring tool that runs inside the JVM, capturing events about thread execution, memory allocation, garbage collection, lock contention, CPU profiling, and I/O operations. Unlike external profilers that attach to a running process, JFR lives inside the JVM runtime and sees events that external tools simply cannot access.

JFR stores data in binary Flight Recorder Files (.jfr). You analyze these afterward with JDK Mission Control (JMC), the jfr CLI, or your own code using the JFR API.
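As a rough illustration of the last option, the sketch below uses the jdk.jfr.consumer API to count events per type in a recording. The class name JfrEventCounter is made up for this example; RecordingFile and RecordedEvent are the standard JDK classes.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: counts how many events of each type a .jfr file contains.
class JfrEventCounter {
    static Map<String, Long> countByType(Path file) throws IOException {
        Map<String, Long> counts = new TreeMap<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            // Events are read one at a time, so large files are not loaded into memory at once
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                counts.merge(event.getEventType().getName(), 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Usage: pass the path of a recording, for example myapp.jfr
        countByType(Path.of(args[0]))
                .forEach((type, n) -> System.out.println(n + "\t" + type));
    }
}
```

A per-type count like this is a quick sanity check that the events you expected were actually enabled before you open the file in JMC.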

When to Use Java Flight Recorder

Ideal Use Cases

  • Production diagnostics: Continuous low-overhead profiling without pausing or significantly impacting application performance
  • Root cause analysis: Investigating intermittent performance issues, memory leaks, or latency spikes
  • Capacity planning: Understanding resource utilization patterns over time
  • Compliance and auditing: Maintaining diagnostic records for regulatory requirements
  • Post-incident analysis: Capturing evidence from production incidents for later review

When NOT to Use Java Flight Recorder

  • Real-time debugging: If you need to step through code or pause execution, use a debugger instead
  • Microsecond-level tracing: JFR has inherent overhead and sampling biases that make it unsuitable for ultra-precise measurements
  • Memory-constrained environments: While overhead is low, JFR's buffers use native memory and large recordings consume disk space
  • Short-lived applications: JFR works best with long-running services; cold start profiling yields limited data

Architecture

JFR consists of several interconnected components within the JVM:

graph TB
    subgraph "JVM Runtime"
        JE[JFR Engine]
        JS[JFR Settings]
        JC[JFR Configuration]
    end

    subgraph "Event Types"
        EM[Event Museum]
        EC[Event Categories]
        ET[Event Types<br/>JDK, JVM, Application]
    end

    subgraph "Collection Pipeline"
        EC -->|Filter & Process| JE
        JE -->|Buffer| JB[JFR Buffers]
        JB -->|Chunk| CR[Chunk Repository]
    end

    subgraph "Output"
        CR -->|Write| JF[JFR File .jfr]
        CR -->|Streaming| JD[JDK Mission Control]
        CR -->|Streaming| JJ[JFR API Consumer]
    end

    subgraph "Configuration"
        JS -->|Settings| JE
        CP[Control Plane<br/>JMX, CLI, API] -->|Configure| JS
    end

Core Components

Event Museum: This guide's name for the catalog of every event type the JVM knows about (exposed in the API through FlightRecorder.getEventTypes()). Each event type has predefined fields and metadata.

JFR Engine: The collection and processing core. It receives events, applies your configuration filters, and manages the circular buffer.

JFR Buffers: Thread-local buffers that batch events before consolidation. This keeps threads from contending with each other during recording.

Flight Recorder File: The output format. Uses a chunked binary structure that supports streaming reads, so you can start analyzing before the recording finishes.

Control Plane: How you talk to JFR: command-line flags, jcmd, JMX (the FlightRecorderMXBean), or the jdk.jfr API from inside the application.
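The event catalog described above can be inspected at runtime. The following is a minimal sketch, assuming JFR is available in the running JVM; EventCatalog is a hypothetical class name, while FlightRecorder.getEventTypes() and EventType.getCategoryNames() are standard API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

import jdk.jfr.EventType;
import jdk.jfr.FlightRecorder;

// Hypothetical helper: lists the event types registered in this JVM.
class EventCatalog {
    /** Groups the names of all registered event types by their category path. */
    static Map<String, List<String>> eventsByCategory() {
        Map<String, List<String>> byCategory = new TreeMap<>();
        for (EventType type : FlightRecorder.getFlightRecorder().getEventTypes()) {
            // Categories are hierarchical, for example ["Java Virtual Machine", "GC"]
            String category = String.join(" / ", type.getCategoryNames());
            byCategory.computeIfAbsent(category, key -> new ArrayList<>()).add(type.getName());
        }
        return byCategory;
    }

    public static void main(String[] args) {
        eventsByCategory().forEach((category, names) ->
                System.out.println(category + ": " + names.size() + " event types"));
    }
}
```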

Event Types

JFR organizes events into categories:

| Category | Examples | Overhead |
| --- | --- | --- |
| Garbage Collection | GC Pause, Young/Old collection, reference processing | Low |
| Profiling | CPU load, method profiling, allocation in TLAB | Low-Medium |
| Memory | Heap memory, TLAB allocations, object statistics | Low |
| Threading | Thread start/stop, lock profiling, context switch | Low |
| I/O | Socket read/write, file I/O | Medium |
| Exceptions | Exception thrown, error count | Low |
| Code Cache | JIT compilation, deoptimization | Low |
| Language & Runtime | Class loading, VM operation, Safepoint | Low |
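Applications can add their own event types next to the built-in categories. Here is a minimal sketch of a custom event; the event name com.example.OrderProcessed, its category, and its fields are invented for illustration, while the annotations and the Event base class come from the jdk.jfr package.

```java
import jdk.jfr.Category;
import jdk.jfr.Description;
import jdk.jfr.Event;
import jdk.jfr.Label;
import jdk.jfr.Name;

class OrderEvents {
    // Hypothetical application event; name, category, and fields are illustrative only.
    @Name("com.example.OrderProcessed")
    @Label("Order Processed")
    @Category({"Application", "Orders"})
    @Description("One order handled by a hypothetical order pipeline")
    static class OrderProcessed extends Event {
        @Label("Order ID")
        String orderId;

        @Label("Item Count")
        int itemCount;
    }

    /** Times the given work and records it as an OrderProcessed event. */
    static void process(String orderId, int itemCount, Runnable work) {
        OrderProcessed event = new OrderProcessed();
        event.begin();
        try {
            work.run();
        } finally {
            event.orderId = orderId;
            event.itemCount = itemCount;
            event.commit(); // nothing is written unless a running recording has the event enabled
        }
    }
}
```

Keeping fields to identifiers and counts, rather than payload data, limits what ends up in a recording.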

Implementation

Starting a Recording via Command Line

# Start with default settings (continuous recording)
java -XX:StartFlightRecording=filename=myapp.jfr -jar myapp.jar

# Start with predefined profile settings
java -XX:StartFlightRecording=settings=profile,filename=myapp.jfr,duration=60s -jar myapp.jar

# Dump recording on shutdown
java -XX:FlightRecorderOptions=stackdepth=256 \
     -XX:StartFlightRecording=filename=myapp.jfr,maxsize=100m,dumponexit=true \
     -jar myapp.jar

Programmatic Control via JFR API

import java.nio.file.Path;
import java.time.Duration;

import jdk.jfr.Recording;

public class JfrController {
    // Records the selected events for the given duration, then writes them to outputPath
    public void startRecording(String outputPath, Duration duration) throws Exception {
        try (Recording recording = new Recording()) {
            // Enable specific events
            recording.enable("jdk.CPULoad");
            recording.enable("jdk.GarbageCollection");
            recording.enable("jdk.ThreadPark");
            recording.enable("jdk.ObjectAllocationInTLAB");

            // Start recording
            recording.start();
            System.out.println("Recording started: " + recording.getId());

            // Let the application run for the requested duration
            Thread.sleep(duration.toMillis());

            // Stop, then write everything that was captured to the file
            recording.stop();
            recording.dump(Path.of(outputPath));
        } // close() releases the recording's resources
    }

    public void continuousRecording(String outputPath) throws Exception {
        try (Recording recording = new Recording()) {
            recording.enable("jdk.GarbageCollection");
            recording.enable("jdk.CPULoad");
            recording.enable("jdk.ThreadDump");

            // Keep data on disk, bounded to 100 MB and the last 10 hours
            recording.setToDisk(true);
            recording.setMaxSize(100L * 1024 * 1024);
            recording.setMaxAge(Duration.ofHours(10));

            recording.start();

            // Simulate continuous operation
            Thread.sleep(60_000);

            // Stop and write the retained data to the file
            recording.stop();
            recording.dump(Path.of(outputPath));
        }
    }
}

Dynamic Event Enabling

One of JFR’s strengths is the ability to enable events dynamically without restarting the JVM:

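import jdk.jfr.Configuration;
import jdk.jfr.EventType;
import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;

public class DynamicJfr {
    public void enableEventsOnDemand() throws Exception {
        // Check if Flight Recorder is available (isAvailable() is a static method)
        if (!FlightRecorder.isAvailable()) {
            throw new IllegalStateException("Flight Recorder not available");
        }

        // Look up an event type by name among the registered event types
        for (EventType eventType : FlightRecorder.getFlightRecorder().getEventTypes()) {
            if (eventType.getName().equals("jdk.ObjectAllocationOutsideTLAB")) {
                System.out.println("Event ID: " + eventType.getId());
            }
        }

        // Load the predefined "profile" configuration and inspect its settings
        Configuration profile = Configuration.getConfiguration("profile");
        System.out.println("Profile settings: " + profile.getSettings());

        // Enable an additional event on a recording that is already running
        try (Recording recording = new Recording()) {
            recording.start();
            recording.enable("jdk.ObjectAllocationOutsideTLAB");
        }
    }
}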

JMX Control via FlightRecorderMXBean

import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Map;

import jdk.management.jfr.FlightRecorderMXBean;
import jdk.management.jfr.RecordingInfo;

public class JmxJfrControl {
    public void controlViaJMX() throws Exception {
        // Get the FlightRecorder MXBean (registered as jdk.management.jfr:type=FlightRecorder)
        FlightRecorderMXBean recorder =
                ManagementFactory.getPlatformMXBean(FlightRecorderMXBean.class);

        // List available recordings
        List<RecordingInfo> recordings = recorder.getRecordings();
        System.out.println("Active recordings: " + recordings.size());

        // Get recording options (options are looked up per recording ID)
        for (RecordingInfo info : recordings) {
            Map<String, String> options = recorder.getRecordingOptions(info.getId());
            System.out.println(info.getName() + " maxSize: " + options.get("maxSize"));
        }

        // Take a snapshot of the recorded data, copy it to a file, then release it
        long snapshotId = recorder.takeSnapshot();
        recorder.copyTo(snapshotId, "emergency_dump.jfr");
        recorder.closeRecording(snapshotId);
    }
}

Production Failure Scenarios

Scenario 1: Memory Leak in Old Generation

Symptom: Application gradually slows down, GC pauses increase over time, eventually OOM.

JFR Investigation:

# Analyze GC pauses
jfr print --events GarbageCollection myapp.jfr | head -50

# Check object allocation patterns
jfr print --events ObjectAllocationInTLAB,ObjectAllocationOutsideTLAB myapp.jfr

# Inspect sampled long-lived objects and per-class heap statistics
# (ObjectCount only appears if it was enabled in the recording settings)
jfr print --events OldObjectSample,ObjectCount myapp.jfr

What JFR showed: A session cache was stuffing ConcurrentHashMap entries in without ever removing them. 50GB accumulated over two weeks.
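The GC side of this investigation can also be scripted. The sketch below, assuming the recording contains jdk.GarbageCollection events with their standard sumOfPauses and longestPause fields, totals the pause time in a file; GcPauseSummary is a hypothetical class name. Running it against recordings taken days apart would make a rising pause trend visible.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;

import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: summarizes GC pauses found in a recording.
class GcPauseSummary {
    final long collections;
    final Duration totalPause;
    final Duration longestPause;

    GcPauseSummary(long collections, Duration totalPause, Duration longestPause) {
        this.collections = collections;
        this.totalPause = totalPause;
        this.longestPause = longestPause;
    }

    static GcPauseSummary of(Path file) throws IOException {
        long count = 0;
        Duration total = Duration.ZERO;
        Duration longest = Duration.ZERO;
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.GarbageCollection")) {
                    continue;
                }
                count++;
                total = total.plus(event.getDuration("sumOfPauses"));
                Duration pause = event.getDuration("longestPause");
                if (pause.compareTo(longest) > 0) {
                    longest = pause;
                }
            }
        }
        return new GcPauseSummary(count, total, longest);
    }

    public static void main(String[] args) throws IOException {
        GcPauseSummary s = of(Path.of(args[0]));
        System.out.println(s.collections + " collections, total pause "
                + s.totalPause.toMillis() + " ms, longest " + s.longestPause.toMillis() + " ms");
    }
}
```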

Scenario 2: Latency Spikes from Safepoint Pauses

Symptom: P99 latency spikes every few minutes, but GC logs look clean.

JFR Investigation:

# Check VM operation events and keep those that ran at a safepoint
jfr print --events ExecuteVMOperation \
     --json myapp.jfr | jq '.recording.events[] | select(.values.safepoint == true)'

# Examine thread states during pauses
jfr print --events ThreadDump myapp.jfr | grep -A5 "parking"

What JFR showed: Someone called Thread.yield() inside a hot loop during a bulk import, and a JIT compilation burst was triggering stop-the-world safepoints at the worst possible times.
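A similar check can be written against the consumer API. This sketch assumes the recording contains jdk.ExecuteVMOperation events with their usual operation and safepoint fields; SlowVmOperations and the threshold parameter are hypothetical names introduced here.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: lists safepoint VM operations that took at least `threshold`.
class SlowVmOperations {
    static List<String> find(Path file, Duration threshold) throws IOException {
        List<String> slow = new ArrayList<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.ExecuteVMOperation")) {
                    continue;
                }
                // Guard with hasField in case the field is missing in a given JDK version
                boolean atSafepoint = event.hasField("safepoint") && event.getBoolean("safepoint");
                if (atSafepoint && event.getDuration().compareTo(threshold) >= 0) {
                    Object operation = event.getValue("operation");
                    slow.add(event.getStartTime() + " " + operation + " "
                            + event.getDuration().toMillis() + " ms");
                }
            }
        }
        return slow;
    }

    public static void main(String[] args) throws IOException {
        find(Path.of(args[0]), Duration.ofMillis(50)).forEach(System.out::println);
    }
}
```

Lining up the printed start times with the latency spikes shows whether a non-GC safepoint operation is responsible.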

Scenario 3: Lock Contention in Request Pipeline

Symptom: Throughput plateaus under load even though CPU is not maxed out.

JFR Investigation:

# Analyze lock contention
jfr print --events JavaMonitorEnter,JavaMonitorWait \
     myapp.jfr | grep -E "(monitorClass|previousOwner|duration)"

# Check for excessive park events
jfr print --events ThreadPark \
     myapp.jfr

What JFR showed: One lock protecting a rate limiter was serializing the entire request path. Every thread hit it and waited.
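To find which lock dominates, blocked time can be aggregated per monitor class. A minimal sketch, assuming jdk.JavaMonitorEnter events are present (by default only contention above a duration threshold is recorded); LockContentionReport is a hypothetical class name.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

import jdk.jfr.consumer.RecordedClass;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

// Hypothetical helper: sums time spent waiting to enter monitors, per monitor class.
class LockContentionReport {
    static Map<String, Duration> blockedTimeByMonitorClass(Path file) throws IOException {
        Map<String, Duration> blocked = new HashMap<>();
        try (RecordingFile recording = new RecordingFile(file)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.JavaMonitorEnter")) {
                    continue;
                }
                RecordedClass monitor = event.getClass("monitorClass");
                String name = (monitor == null) ? "<unknown>" : monitor.getName();
                blocked.merge(name, event.getDuration(), Duration::plus);
            }
        }
        return blocked;
    }

    public static void main(String[] args) throws IOException {
        blockedTimeByMonitorClass(Path.of(args[0])).entrySet().stream()
                .sorted((a, b) -> b.getValue().compareTo(a.getValue()))
                .forEach(e -> System.out.println(e.getValue().toMillis() + " ms\t" + e.getKey()));
    }
}
```

A single class accounting for most of the blocked time, as with the rate limiter lock in this scenario, is the pattern to look for.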

Scenario 4: I/O Bottleneck Masking as CPU Issue

Symptom: High CPU utilization, but threads look idle under profiler.

JFR Investigation:

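# Check socket and file I/O events
jfr print --events SocketRead,SocketWrite,FileRead,FileWrite myapp.jfr

# List the ten slowest socket reads. Durations are printed as ISO-8601 strings
# (for example PT0.25S); this conversion assumes every read took under a minute.
jfr print --events SocketRead \
     --json myapp.jfr | jq '[.recording.events[].values | {startTime, duration, seconds: (.duration | ltrimstr("PT") | rtrimstr("S") | tonumber)}] | sort_by(.seconds) | reverse | .[0:10]'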

What JFR showed: The database connection pool was exhausted. Threads were blocking on network I/O waiting for connections, which looked like “running” in CPU profiles.

Trade-off Table

| Aspect | Default Recording | Profiling Recording | Notes |
| --- | --- | --- | --- |
| Overhead | 1-2% | 3-8% | Varies by event configuration |
| Storage | ~10MB/hour | ~50-200MB/hour | Depends on event count |
| Detail Level | Essential events only | Full method profiling | Configurable |
| Retention | Rolling 1-2 hours | On-demand dumps | Disk permitting |
| Production Safe | Yes | Conditional | Monitor resource usage |
| Continuous Mode | Yes | No | Duration-based better for profiling |
| Event Detail | Aggregated | Per-event | Impacts storage significantly |

Observability Checklist

  • Enable JFR in production with -XX:StartFlightRecording=...
  • Configure disk space limits with maxsize and maxage
  • Set up automated recording rotation
  • Enable GC events for memory analysis
  • Enable jdk.GarbageCollection with cause and pause data
  • Enable jdk.ObjectAllocationInTLAB for allocation profiling
  • Enable jdk.CPULoad for resource monitoring
  • Enable jdk.ThreadDump for deadlock detection
  • Configure the recording to dump on JVM exit with dumponexit=true in -XX:StartFlightRecording, so data survives a shutdown that follows an OOM
  • Set up external storage for JFR files (off-node)
  • Integrate JFR analysis into incident response runbooks
  • Train team on JMC (JDK Mission Control) usage
  • Create alerts for recording failures or disk space issues
  • Document known-good baseline recordings

Security Notes

JFR recordings contain sensitive runtime information that requires careful handling:

Information Exposure Risks:

  • Full class names and method signatures (may reveal business logic)
  • Object sizes and allocation rates (may reveal data structures)
  • Thread names and stack traces (may reveal architecture)
  • Exception messages and stack traces (may reveal vulnerabilities)

Security Best Practices:

  • Store JFR files in secure locations with appropriate access controls
  • Clean sensitive data before sharing recordings externally
  • Avoid recording in untrusted multi-tenant environments
  • Consider encrypting JFR files at rest
  • Restrict JMX access to FlightRecorderMXBean
  • Rotate recordings frequently to limit exposure window
  • Audit access to JFR files and configuration

Secure Configuration:

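# Require authentication and TLS for remote JMX, which also guards FlightRecorderMXBean
-Dcom.sun.management.jmxremote.authenticate=true
-Dcom.sun.management.jmxremote.ssl=true

# Prevent local tools such as jcmd from attaching to start or dump recordings
-XX:+DisableAttachMechanism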

Common Pitfalls / Anti-Patterns

Pitfall 1: Running Without Adequate Disk Space

Problem: Without explicit limits, JFR's on-disk data can grow well beyond what you budgeted for.

Solution:

# Always set both limits
-XX:StartFlightRecording=maxsize=500m,maxage=2h,filename=app.jfr

Pitfall 2: Misunderstanding Sampling vs. Exact Events

Problem: You assume every event is captured, but many are sampled or filtered by a duration threshold.

What's actually happening: Profiling events use statistical sampling: JFR checks at intervals and records a subset, so low-frequency activity might not appear at all. Many duration-based events are only recorded above a configured threshold. Exact events such as GC pauses and thrown exceptions are recorded every time they occur (when enabled) because the JVM has explicit hooks for them.

Solution: Run recordings long enough to capture representative samples. For rare events, increase duration or use exact events where available.

Pitfall 3: Ignoring Recording Overhead

Problem: Enable too many events and you get measurable performance drag.

Solution: Start with the default or “profile” preset. Add specific events one at a time while watching overhead.

Pitfall 4: Not Correlating with External Metrics

Problem: JFR data alone sometimes sends you down the wrong path.

Solution: Always cross-reference with system-level metrics. A CPU spike in JFR might actually be disk I/O saturation at the OS level.

Pitfall 5: Using Outdated JMC Versions

Problem: JFR file format evolves. Old JMC versions can fail to parse new recordings.

Solution: Use a JMC release that is at least as new as the JDK that created the recording.

Quick Recap Checklist

  • JFR provides low-overhead continuous profiling built into the JVM
  • Enable via -XX:StartFlightRecording or programmatically via JFR API
  • Configure maxsize and maxage to prevent disk exhaustion
  • Use dumponexit=true for post-incident evidence capture
  • Analyze recordings with JDK Mission Control or jfr CLI
  • Enable events strategically—more events = more overhead
  • Store recordings securely; they contain sensitive runtime data
  • Correlate JFR findings with external metrics for complete picture
  • Use dynamic event enabling for targeted investigations
  • Train team on common analysis patterns before incidents occur

Interview Questions

1. What is Java Flight Recorder and how does it differ from external profilers?

Java Flight Recorder is a profiling tool built directly into the JVM that continuously collects low-level runtime events. Unlike external profilers that attach to a running process from outside, JFR runs inside the JVM with access to internal events like GC pauses, safepoint operations, TLAB allocations, and JIT compilation. This gives JFR visibility that external tools cannot achieve. Additionally, JFR is designed for minimal overhead (typically 1-2%) making it suitable for production use, whereas external profilers often impose higher overhead and may not be safe for production environments.

2. How do you configure JFR for continuous production monitoring?

For continuous production monitoring, configure JFR with rolling recordings using both size and age limits. Use settings like -XX:StartFlightRecording=filename=app.jfr,maxsize=500m,maxage=2h to prevent disk exhaustion. Enable essential events only (GC, CPU, memory) to keep overhead low. Add dumponexit=true to the same option so the recording is written when the JVM shuts down. Consider using a separate disk for JFR output to avoid impacting application I/O. Integrate with monitoring to alert if recordings fail or disk space becomes constrained.

3. What are the security considerations when using JFR in production?

JFR recordings expose sensitive runtime information including class names, method signatures, object allocation patterns, thread names, and exception stack traces. This data can reveal business logic, architecture details, and potential vulnerabilities. Security considerations include: storing recordings with strict access controls, restricting JMX access to FlightRecorderMXBean, disabling the attach mechanism (-XX:+DisableAttachMechanism) where jcmd access is not needed, rotating recordings frequently, auditing access to JFR files, and potentially encrypting recordings at rest. Never record in untrusted environments where adversaries might access the recordings.

4. How do you investigate a memory leak using JFR?

To investigate a memory leak with JFR, first enable allocation profiling events (jdk.ObjectAllocationInTLAB, jdk.ObjectAllocationOutsideTLAB), GC events, and jdk.OldObjectSample. Run the recording while reproducing the leak scenario. Then analyze in JMC: check the memory pages for allocation rate and heap-after-GC trends, look for classes whose live set keeps growing, and look for collections (Maps, Lists) that accumulate entries without removal. Compare recordings from different time periods to see allocation patterns that correlate with the leak. The jdk.OldObjectSample events sample objects that are still live on the heap and, when the recording is started with path-to-gc-roots=true, include the reference chain that keeps them reachable, which helps identify retention chains.

5. How does JFR's chunked binary format enable streaming analysis?

JFR files use a chunked binary structure where each chunk contains a complete set of metadata and events. This design allows tools to read the file incrementally without parsing the entire file first. While a recording is running, a consumer can read completed chunks as new chunks are being written, so you can begin analyzing data before the recording ends. The maximum chunk size is configurable via -XX:FlightRecorderOptions=maxchunksize=... and affects how frequently completed chunks become available during streaming analysis.

6. What is the difference between a dump and a snapshot in JFR terminology?

A snapshot (FlightRecorder.takeSnapshot() or FlightRecorderMXBean.takeSnapshot()) is a point-in-time copy of the data the recorder currently retains across all recordings. It is returned as a separate, already stopped recording that you copy to a file and then close. A dump (recording.dump(Path.of("output.jfr")) or jcmd <pid> JFR.dump) writes the data of one specific recording to a file without stopping that recording. In both cases, data that has already been discarded because of maxage or maxsize is not included. This distinction matters when investigating incidents: a snapshot or dump taken long after the fact captures only what is still retained, while a recording configured with dumponexit=true and generous limits preserves the window around the failure.
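A minimal sketch of taking a snapshot from inside the application, following the pattern in the FlightRecorder API documentation; SnapshotDumper and its method name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;

import jdk.jfr.FlightRecorder;
import jdk.jfr.Recording;

// Hypothetical helper: writes whatever JFR currently retains to a file.
class SnapshotDumper {
    /** Returns true if data was written, false if no recorded data was available. */
    static boolean snapshotTo(Path file) throws IOException {
        // The snapshot is its own (stopped) recording and must be closed after use
        try (Recording snapshot = FlightRecorder.getFlightRecorder().takeSnapshot()) {
            if (snapshot.getSize() == 0) {
                return false; // nothing is being recorded, so there is nothing to copy
            }
            snapshot.dump(file);
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("written: " + snapshotTo(Path.of("emergency_dump.jfr")));
    }
}
```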

7. How do you correlate JFR data with external monitoring systems like Prometheus?

JFR does not directly export to Prometheus, but there are two common integration patterns. First, run an exporter that consumes JFR events (for example through the Event Streaming API) and exposes selected values as Prometheus metrics over HTTP. Second, use JMX Exporter alongside JFR: JMX gives you real-time numeric gauges (heap used, thread count) while JFR gives you event-level diagnostics. For correlated analysis, export both to the same time-series database and join on timestamp. The key is to align on wall-clock time: record the start and end timestamps of your JFR recording and use those when correlating in your monitoring system.

8. What JVM flags control JFR's buffer size and how do they affect overhead?

Buffer sizing is controlled through -XX:FlightRecorderOptions. threadbuffersize=... sets the per-thread local buffer (8 KB by default). globalbuffersize=... and numglobalbuffers=... set the size and number of the global buffers that thread buffers are flushed into; by default both are derived from memorysize=..., the total memory JFR may use for buffering (10 MB by default). maxchunksize=... controls the maximum size of each chunk in the output file (12 MB by default). Larger buffers reduce the chance of dropping events during bursts but increase native memory use, and thread-buffer memory grows linearly with the number of threads.

9. How does JFR handle events from the VM operation safepoint phase?

JFR has specific events for VM operations that happen at safepoints, including jdk.ExecuteVMOperation, which records the type of operation (GC, deoptimization, biased locking revocation, etc.), whether it ran at a safepoint, and its duration. During a safepoint, the JVM threads are stopped so JFR can safely record thread state. The jdk.SafepointBegin and jdk.SafepointEnd events record when each safepoint began and ended. This is critical for distinguishing true GC pauses from other safepoint operations like deoptimization or class redefinition, which stop application threads just as a GC pause does but never show up in GC logs.

10. Can JFR record data from multiple recordings simultaneously?

Yes, multiple concurrent recordings are supported. You might run one continuous low-overhead recording with essential events (GC, CPU) and a separate on-demand recording with full profiling enabled, triggered by an alert. Each recording has its own ID, settings, and output file. Use FlightRecorderMXBean.getRecordings() to list all active recordings. JFR merges the settings of all running recordings: each event is emitted once, using the most inclusive setting any recording asks for, and is then made available to every recording that enabled it. Overhead is therefore driven by the union of enabled events and their most demanding thresholds, not by the number of recordings.

11. What is the JFR Event Streaming API and when should you prefer it over file-based analysis?

JFR Event Streaming (part of the jdk.jfr.consumer API, available since JDK 14) lets you consume JFR events in near real time as they are flushed, without waiting for a file to close. You register callbacks on a RecordingStream inside the application, or open an EventStream on the disk repository of another JVM. Use streaming when you need to: detect and respond to events as they happen (e.g., alerting on a specific GC pattern), build live dashboards, or feed events into a stream processing system. The trade-off is that streaming requires more complex code than post-hoc file analysis, and you need to handle backpressure if events arrive faster than you can process them.
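As a sketch of the alerting case, assuming JDK 14 or later: the class GcPauseWatcher, the threshold, and the alert sink are hypothetical, while RecordingStream, onEvent, and startAsync are the standard streaming API.

```java
import java.time.Duration;
import java.util.function.Consumer;

import jdk.jfr.consumer.RecordingStream;

// Hypothetical helper: reports GC events whose longest pause exceeds a threshold.
class GcPauseWatcher {
    static RecordingStream watch(Duration threshold, Consumer<String> alertSink) {
        RecordingStream stream = new RecordingStream();
        stream.enable("jdk.GarbageCollection");
        stream.onEvent("jdk.GarbageCollection", event -> {
            Duration longest = event.getDuration("longestPause");
            if (longest.compareTo(threshold) > 0) {
                Object gcName = event.getValue("name");
                alertSink.accept(gcName + " paused for " + longest.toMillis() + " ms");
            }
        });
        stream.startAsync(); // callbacks run on a background thread
        return stream;       // the caller closes the stream to stop recording
    }

    public static void main(String[] args) throws InterruptedException {
        try (RecordingStream stream = watch(Duration.ofMillis(200), System.err::println)) {
            Thread.sleep(60_000); // stand-in for the application's real work
        }
    }
}
```

Keep the callback cheap; handing the message to a queue or metrics client is one way to avoid falling behind the event flow.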

12. How does JFR's "continuous" vs "profile" configuration preset differ in event detail?

The default (formerly "continuous") preset enables essential JVM events with minimal overhead (~1%), suitable for always-on production. The profile preset enables more events and uses lower thresholds and shorter sampling intervals, with higher overhead (3-8%). Both presets record stack traces for method samples; the profile preset samples more often (for example, method sampling every 10 ms instead of 20 ms) and captures more detail for allocation and latency events. You can create custom configurations by copying a preset and modifying specific event settings via JMC or the FlightRecorderMXBean API.

13. How do you diagnose a JFR recording that fails to start or produces an empty file?

Common causes: disk space exhaustion (JFR writes a header immediately then waits for events), a destination path that is not writable, and using settings=none, which starts the recording with no events enabled. Check jcmd <pid> JFR.check or the FlightRecorderMXBean for recording status. Also verify the JVM was not started with -XX:-FlightRecorder, which disables JFR. If you enable events dynamically, make sure the recording itself has been started. Running jfr print or jfr summary on the file shows whether it contains any events at all.

14. What is the relationship between JFR and JDK Mission Control (JMC)?

JMC is the primary GUI tool for analyzing JFR recordings. It reads .jfr files and renders the event data as visualizations: flame graphs for CPU, memory leak suspects, GC analysis, thread latencies, and more. JMC and JFR are developed together and tightly coupled — JMC understands the JFR event schema, metadata, and event types. Using a JMC version older than the JVM that created the recording can cause parsing failures or missing event categories.

15. How does JFR handle sensitive data in recordings and what options exist for redaction?

JFR records raw string values such as exception messages, thread names, and system properties when the corresponding events are enabled. There is no automatic redaction during recording. To reduce sensitive data exposure, carefully choose which events to enable (for example, leave jdk.JavaExceptionThrow disabled if exception messages can contain user data) and avoid custom events that capture method arguments or return values in production. Recent JDK releases (19 and later) also ship a jfr scrub subcommand that removes selected events from a recording before it is shared. Store recordings with filesystem ACLs and encrypt at rest.

16. How does JFR work in a containerized environment with memory limits?

The JVM is container-aware in modern releases: it detects container CPU and memory limits via cgroups. JFR's own buffers live in native memory outside the Java heap (about 10 MB by default), so budget for them within the container limit. Configure an appropriate maxsize for the recording and check the overall disk space, since container filesystems may have stricter quotas. On Kubernetes, mount a volume for JFR output and set -XX:FlightRecorderOptions=repository=/mnt/jfr so chunk files land on that volume.

17. What is the difference between jcmd, jfr CLI, and FlightRecorderMXBean for controlling JFR?

jcmd is the universal JVM diagnostic command tool. It sends commands to a running JVM process, including JFR-related commands like jcmd <pid> JFR.start, JFR.dump, JFR.stop, and JFR.check. The jfr CLI is a dedicated tool for analyzing .jfr files with subcommands like print, summary, and metadata. FlightRecorderMXBean is the programmatic JMX API for controlling recordings from Java code or a remote JMX client. The jfr CLI only analyzes existing files, while jcmd and the MXBean control live recordings.

18. How do you measure JFR overhead itself to validate it is within acceptable bounds?

Start with baseline measurements: run your workload without JFR and measure throughput (requests/second), latency (P50/P95/P99), and CPU. Then enable JFR with your intended settings and measure the same metrics. The overhead percentage is ((baseline - with_jfr) / baseline) * 100. Typically aim for under 5% overhead in production. If overhead is higher than expected, reduce the event set: disable method profiling events if not needed, raise the sampling interval for allocation profiling.
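The overhead formula above, written out as a small helper. JfrOverhead and the sample throughput figures are made up for illustration.

```java
// Hypothetical helper implementing ((baseline - with_jfr) / baseline) * 100 for throughput.
class JfrOverhead {
    static double overheadPercent(double baselineThroughput, double throughputWithJfr) {
        if (baselineThroughput <= 0) {
            throw new IllegalArgumentException("baseline throughput must be positive");
        }
        return (baselineThroughput - throughputWithJfr) / baselineThroughput * 100.0;
    }

    public static void main(String[] args) {
        // Example: 1000 req/s without JFR, 980 req/s with JFR enabled
        System.out.println(String.format("%.1f%%", overheadPercent(1000, 980))); // prints 2.0%
    }
}
```

For latency, where higher is worse, swap the operands so that a slowdown still yields a positive percentage.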

19. What is the JFR checkpoint mechanism and why does it matter for recordings?

A JFR checkpoint is a point where the JFR engine writes the constant pools that events refer to (class and method names, threads, stack traces, string constants) into the recording file. Events store compact IDs, and checkpoints supply the data needed to resolve them, which is what keeps each chunk self-contained and lets a parser read a file even if the JVM crashed mid-recording. Checkpoints are written when buffered data is flushed to disk and when a chunk is finalized, for example on recording stop. Between checkpoints, this constant data is held in memory.

20. What are the different JFR event type categories and which events have the highest overhead in production?

JFR events are categorized into: Garbage Collection (low overhead), Profiling (CPU, method profiling, medium overhead), Memory (TLAB allocations, low-medium), Threading (lock profiling, context switches, low), I/O (socket/file, medium overhead), Exceptions (thrown exceptions, low), Code Cache (JIT compilation, deoptimization, low), and Language & Runtime (class loading, VM operations, safepoints, low). Highest overhead events in production are typically jdk.ClassLoad with full stack traces, jdk.JavaExceptionThrow with stack traces, and jdk.ObjectAllocationInTLAB when every allocation sample is recorded. For production use, stick to essential events or use the default preset, which balances coverage with overhead.

Conclusion

Java Flight Recorder provides production-safe continuous monitoring with minimal overhead. Enable it via -XX:StartFlightRecording with maxsize and maxage settings to prevent disk exhaustion. Use JMC to analyze recordings and identify issues ranging from memory leaks to lock contention. The built-in nature of JFR makes it the first tool to reach for when diagnosing JVM performance issues in production.
