This document describes how to configure and use async profiler with Hadoop applications. Async profiler is a low-overhead sampling profiler for Java that does not suffer from the safepoint bias problem. It uses HotSpot-specific APIs to collect stack traces and to track memory allocations. The profiler works with OpenJDK, Oracle JDK, and other Java runtimes based on the HotSpot JVM.
Hadoop profiler servlet supports Async Profiler major versions 1.x and 2.x.
Make sure Hadoop is installed, configured, and set up correctly. For more information see:
Go to https://github.com/jvm-profiling-tools/async-profiler, download a release appropriate for your platform, and install on every cluster host.
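A quick way to confirm the install on a host is to check that the launcher script is present. The sketch below is illustrative only: the function name check_async_profiler is hypothetical, and it assumes the unpacked release has an executable profiler.sh at its root, which may not hold for every release layout.

```shell
#!/usr/bin/env bash
# Sketch: check that a directory looks like an unpacked async-profiler release.
# Assumption: the release ships an executable profiler.sh at its root.

check_async_profiler() {
  local home="$1"
  if [ ! -d "$home" ]; then
    echo "not a directory: $home" >&2
    return 1
  fi
  if [ ! -x "$home/profiler.sh" ]; then
    echo "profiler.sh missing or not executable in: $home" >&2
    return 1
  fi
  echo "ok: $home"
}
```

Running check_async_profiler /path/to/async-profiler on each cluster host prints an "ok" line on success and returns a non-zero status otherwise.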
Set ASYNC_PROFILER_HOME in the environment (put it in hadoop-env.sh) to the root directory of the async-profiler install location, or pass it on the Hadoop daemon’s command line as the system property -Dasync.profiler.home=/path/to/async-profiler.
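As a minimal sketch, the two alternatives might look like this in hadoop-env.sh. The path is the same placeholder used above, and using HADOOP_OPTS to carry the system property is an assumption; a daemon-specific options variable may suit a given deployment better.

```shell
# Fragment for hadoop-env.sh. The path is a placeholder; adjust it to the
# actual async-profiler install location on the host.
export ASYNC_PROFILER_HOME=/path/to/async-profiler

# Alternative: pass the location as a JVM system property instead.
# Assumption: HADOOP_OPTS is where extra JVM flags for the daemons are added.
# export HADOOP_OPTS="${HADOOP_OPTS:-} -Dasync.profiler.home=/path/to/async-profiler"
```

Only one of the two settings is needed. The daemons have to be restarted to pick up a change to hadoop-env.sh.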
Once the prerequisites have been satisfied, async-profiler can be accessed through the Namenode or ResourceManager UI.
The following async-profiler options can be specified as query parameters:
* -e event          profiling event: cpu|alloc|lock|cache-misses etc.
* -d duration       run profiling for ‘duration’ seconds (integer)
* -i interval       sampling interval in nanoseconds (long)
* -j jstackdepth    maximum Java stack depth (integer)
* -b bufsize        frame buffer size (long)
* -t                profile different threads separately
* -s                simple class names instead of FQN
* -o fmt[,fmt...]   output format: summary|traces|flat|collapsed|svg|tree|jfr|html
* --width px        SVG width pixels (integer)
* --height px       SVG frame height pixels (integer)
* --minwidth px     skip frames smaller than px (double)
* --reverse         generate stack-reversed FlameGraph / Call tree
Example: If the Namenode HTTP address is localhost:9870 and the ResourceManager HTTP address is localhost:8088, the ProfileServlet running with async-profiler set up can be accessed at http://localhost:9870/prof and http://localhost:8088/prof for the Namenode and ResourceManager processes respectively.
Diving deep into some params:
* curl http://localhost:9870/prof (FlameGraph svg for Namenode)
* curl http://localhost:8088/prof (FlameGraph svg for ResourceManager)
* curl "http://localhost:9870/prof?pid=12345" (For instance, provide pid of Datanode here)
* curl "http://localhost:9870/prof?pid=12345&duration=30"
* curl "http://localhost:9870/prof?output=tree&duration=60"
* curl "http://localhost:9870/prof?event=alloc"
* curl "http://localhost:9870/prof?event=lock"

URLs that contain a query string are quoted so that the shell does not interpret the ‘&’ character.

The following event types are supported by async-profiler. Use the ‘event’ parameter to select one. The default is ‘cpu’. Not all operating systems support all types.
Perf events:
Java events:
The following output formats are supported. Use the ‘output’ parameter to select one. The default is ‘flamegraph’.
Output formats:
The ‘duration’ parameter specifies, in seconds, how long to collect trace data before generating output. The default is 10 seconds.
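The parameters described above can be combined in one request. The helper below is a small sketch for composing such a URL. The function name prof_url is hypothetical, it only uses the parameters shown in this document (pid, event, output, duration), and it does no URL encoding, so values are assumed to be plain tokens.

```shell
#!/usr/bin/env bash
# Sketch: compose a ProfileServlet URL from a base address and key=value pairs.
# Usage: prof_url BASE [key=value ...]
# Assumption: values need no URL encoding (plain tokens such as alloc, tree, 30).

prof_url() {
  local url="$1/prof"
  local sep='?'
  local kv
  shift
  for kv in "$@"; do
    url="${url}${sep}${kv}"   # first pair is joined with '?', later ones with '&'
    sep='&'
  done
  printf '%s\n' "$url"
}
```

For example, prof_url http://localhost:9870 event=alloc duration=30 prints http://localhost:9870/prof?event=alloc&duration=30, which can be passed to curl as curl "$(prof_url http://localhost:9870 event=alloc duration=30)". Quoting the command substitution keeps the ‘&’ from being interpreted by the shell.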