Profiling Go and Python Computational Nodes

VOR Stream profiling helps developers improve code efficiency by showing how Go and Python code consumes resources. That understanding can then be used to guide code optimization.

Code optimization typically iterates over a three-step process. First, enable Go and/or Python profiling when running a process. Then use tools to analyze the collected profiling data. Finally, optimize the code based on what the analysis shows.

Enabling Profiling

Go and Python computational node profiling is enabled with the --go-profile and --python-profile options of the vor run command. Each option requires an argument of either cpu or mem to select CPU or memory profiling. Profiling data is written to a profiledata folder in the run's output directory.

Depending on the flag values passed, the files will have the following form:

  • nodename.gocpuprof - Go CPU profile data
  • nodename.goheapprof - Go memory profile data
  • nodename.pyprof - Python CPU profile data
  • memray-nodename.bin - Python memory profile data
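As a rough illustration of these naming patterns, the following Python sketch groups a list of file names by profile kind. The function names and the labels (go-cpu, go-mem, and so on) are hypothetical and are not part of VOR Stream; only the suffixes and the memray- prefix come from the list above.

```python
def classify_profile(filename):
    """Return a short label for a profile file, based only on its name."""
    if filename.endswith(".gocpuprof"):
        return "go-cpu"
    if filename.endswith(".goheapprof"):
        return "go-mem"
    if filename.endswith(".pyprof"):
        return "python-cpu"
    # Memray files start with "memray-" and contain ".bin"
    # (a numeric suffix may follow, e.g. ".bin.100173").
    if filename.startswith("memray-") and ".bin" in filename:
        return "python-mem"
    return "unknown"


def group_profiles(filenames):
    """Group file names by the label that classify_profile assigns."""
    groups = {}
    for name in sorted(filenames):
        groups.setdefault(classify_profile(name), []).append(name)
    return groups


if __name__ == "__main__":
    # Example names only; real names depend on the nodes in the process.
    example = ["priceall.gocpuprof", "memray-passthrough.bin", "notes.txt"]
    for kind, names in group_profiles(example).items():
        print(kind, names)
```

In practice the list of names could come from os.listdir on the run's profiledata folder.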

Analyzing Go Profile Data

Both Go CPU and memory profiling data can be analyzed with the Go pprof tool.

Note

The pprof tool is installed as part of the Go toolchain and available on all VOR Stream compute nodes.

To view CPU profile data in a terminal, run the following command:

go tool pprof priceall.gocpuprof

The pprof tool can also display the profile data in a web browser. To run the pprof tool in web server mode, run the following command:

go tool pprof -http :4000 priceall.gocpuprof

The web server will be available at http://localhost:4000.

If you run the above command on a remote machine, the -http option should include that machine's host name or IP address, or simply 0.0.0.0 to listen on all interfaces.

go tool pprof -http 0.0.0.0:4000 priceall.gocpuprof

If the host running the command is 192.168.100.241, for example, the web server will be available at http://192.168.100.241:4000.
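The pprof invocations above can also be assembled from a script, for example when several profiles need to be inspected in turn. The sketch below only builds the command line and does not run it. The helper name pprof_command and the file name priceall.goheapprof are illustrative assumptions; -top and -sample_index are standard pprof options, with -top printing a text report instead of opening the interactive prompt.

```python
import shlex


def pprof_command(profile_path, http_addr=None, sample_index=None):
    """Build the argument list for `go tool pprof`; nothing is executed."""
    argv = ["go", "tool", "pprof"]
    if sample_index is not None:
        # For heap profiles, e.g. inuse_space or alloc_space.
        argv.append("-sample_index=" + sample_index)
    if http_addr is not None:
        # Web server mode, as in the commands above.
        argv += ["-http", http_addr]
    else:
        # Non-interactive text report of the heaviest functions.
        argv.append("-top")
    argv.append(profile_path)
    return argv


if __name__ == "__main__":
    print(shlex.join(pprof_command("priceall.gocpuprof")))
    print(shlex.join(pprof_command("priceall.goheapprof",
                                   http_addr="0.0.0.0:4000",
                                   sample_index="alloc_space")))
```

The resulting list can be passed to subprocess.run if the command should actually be executed.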

For tips on using the pprof tool, see the following resources:

Analyzing Python Computational Node Profile Data

Note

You can select either Python CPU profiling or memory profiling for a single process run, but not both.

Visualizing Python CPU Profile Data

Analyzing Python CPU profiling data is easily done using the Python gprof2dot tool.

Note

The gprof2dot tool is installed as part of the VOR Stream Python environment and available on all VOR Stream compute nodes.

The gprof2dot tool generates output in the DOT graph format, which can be converted to various image formats using the dot tool from the Graphviz package (also installed on all VOR Stream compute nodes).

To generate a visual representation of your Python profile data:

gprof2dot -f collapse pyspy-<name of python node>.collapse | dot -Tpng -o output.png

This will create a PNG image file named output.png that you can view using any image viewer.
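When a quick text summary is enough, the collapsed-stack file can also be read directly. The sketch below assumes the common collapsed format, in which each line holds a semicolon-separated stack followed by a space and a sample count. It sums the samples per innermost frame, which approximates self time. The function name and the sample data are made up for illustration.

```python
from collections import Counter


def top_leaf_frames(collapsed_text, limit=5):
    """Sum sample counts per leaf (innermost) frame of collapsed stacks."""
    totals = Counter()
    for line in collapsed_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The count is the last space-separated field; frames may contain spaces.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip lines that do not look like "stack count"
        leaf = stack.split(";")[-1]
        totals[leaf] += int(count)
    return totals.most_common(limit)


if __name__ == "__main__":
    # Made-up sample data in collapsed format.
    sample = (
        "main (node.py:10);load (node.py:20) 3\n"
        "main (node.py:10);price (node.py:30) 7\n"
        "main (node.py:10);price (node.py:30);discount (node.py:42) 4\n"
        "main (node.py:10);load (node.py:20) 2\n"
    )
    for frame, samples in top_leaf_frames(sample):
        print(samples, frame)
```

For a real run, the text would come from reading the .collapse file used in the gprof2dot command above.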

Note

The message "Error: No child process (os error 10)" may appear in the logs. It comes from py-spy and is safe to ignore.

Visualizing Python Memory Profile Data

Python memory profiling is done using the Memray tool.

Note

The Memray tool is installed as part of the VOR Stream Python environment and available on all VOR Stream compute nodes.

Danger

Since Python nodes are multi-threaded, a separate memory profile is collected for each thread. A memray-nodename.bin.NNNNNN file holds the memory profile data for the thread that ran under process ID NNNNNN. The file named memray-nodename.bin holds the data for the parent process of the Python node. Generally, you will want to use one of the memray-nodename.bin.NNNNNN files for your analysis.
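To pick out the per-process files for a node, a small helper along these lines may be useful. It is a sketch based only on the naming pattern described above. The function name is hypothetical, and the second process ID in the example is invented.

```python
import re


def per_process_profiles(filenames, node_name):
    """Return {pid: filename} for files named memray-<node>.bin.<pid>."""
    pattern = re.compile(r"^memray-" + re.escape(node_name) + r"\.bin\.(\d+)$")
    found = {}
    for name in filenames:
        match = pattern.match(name)
        if match:
            found[int(match.group(1))] = name
    return found


if __name__ == "__main__":
    names = [
        "memray-passthrough.bin",          # parent process, not matched
        "memray-passthrough.bin.100173",
        "memray-passthrough.bin.100180",   # invented example
    ]
    for pid, name in sorted(per_process_profiles(names, "passthrough").items()):
        print(pid, name)
```

The parent-process file is deliberately excluded, since it has no numeric suffix.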

Many options are available for visualizing the memory profile data. For example, the Memray tool can be used to generate a summary report in the terminal:

memray summary memray-passthrough.bin.100173

HTML flamegraphs can also be generated:

memray flamegraph memray-passthrough.bin.100173

This will create a memray-flamegraph-passthrough.bin.html file that you can view using any web browser.

For more information on the Memray tool and available reporters, see the Memray documentation.