Releases: ROCm/omnitrace

v1.2.0: Auto-generate configs, function args in Perfetto

30 Jun 02:24
f828453

Notable Changes

General

  • Rework submodule installation by @jrmadsen in #70
    • Ensures that all vendored third-party libraries are installed to an omnitrace subfolder of the lib directory, i.e. <prefix>/lib/omnitrace/

Bug Fixes

  • Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #51
  • Fixes OMNITRACE_SUPPRESS_CONFIG handling by @jrmadsen in #53
  • Fix attaching to running process, i.e. omnitrace -p by @jrmadsen in #60

Enhancements

  • omnitrace-avail generate config by @jrmadsen in #69
  • tracing NS + category region component + MPI args by @jrmadsen in #52
  • HIP API args in perfetto + new perfetto categories by @jrmadsen in #76
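
The omnitrace-avail config generation listed above can be scripted roughly as follows. This is a sketch, not the project's documented workflow: it assumes that -G <file> is the generate-config flag and that the OMNITRACE_CONFIG_FILE environment variable points omnitrace at the resulting file, so confirm both with omnitrace-avail --help. The function name and the config path are illustrative.

```shell
# Sketch: write a default config and point omnitrace at it.
# Assumptions: "-G <file>" generates a config; OMNITRACE_CONFIG_FILE selects it.
generate_omnitrace_config() {
    cfg="$1"
    if ! command -v omnitrace-avail >/dev/null 2>&1; then
        echo "omnitrace-avail not found on PATH" >&2
        return 1
    fi
    # Only export the variable if generation succeeded.
    omnitrace-avail -G "$cfg" && export OMNITRACE_CONFIG_FILE="$cfg"
}
```

A typical call would be generate_omnitrace_config "${HOME}/.omnitrace.cfg" after sourcing setup-env.sh.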

Deprecations

  • Rename OMNITRACE_ROCM_SMI_DEVICES to OMNITRACE_SAMPLING_GPUS by @jrmadsen in #58
  • Rename OMNITRACE_USE_THREAD_SAMPLING to OMNITRACE_USE_PROCESS_SAMPLING by @jrmadsen in #68
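
Job scripts that still export the old names can be bridged with a small shim like the one below. It is a sketch under the assumption that only the names changed and the accepted values did not; migrate_env is a hypothetical helper, not part of omnitrace.

```shell
# Copy a deprecated variable's value to its new name,
# unless the new name is already set.
migrate_env() {
    old="$1"
    new="$2"
    eval "old_val=\${$old:-}"
    eval "new_val=\${$new:-}"
    if [ -n "$old_val" ] && [ -z "$new_val" ]; then
        eval "export $new=\"\$old_val\""
        echo "note: $old is deprecated; using its value for $new" >&2
    fi
}

migrate_env OMNITRACE_ROCM_SMI_DEVICES OMNITRACE_SAMPLING_GPUS
migrate_env OMNITRACE_USE_THREAD_SAMPLING OMNITRACE_USE_PROCESS_SAMPLING
```

A value already set under the new name takes precedence, so the shim is safe to leave in place after scripts are updated.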

What's Changed

  • Fixes the configuration file example by @jrmadsen in #45
  • CI for OpenSUSE by @jrmadsen in #12
  • Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #50
  • Fixes excluded-instr output, fini functions, tweaks MPI by @jrmadsen in #51
  • Define new function attributes by @jrmadsen in #55
  • Inclusive range for OMNITRACE_SAMPLING_CPUS by @jrmadsen in #54
  • Fixes OMNITRACE_SUPPRESS_CONFIG handling by @jrmadsen in #53
  • Remove reliance on MPI_Comm_rank by @jrmadsen in #56
  • Fix find_path in omnitrace-dl by @jrmadsen in #59
  • Improved the determination of MPI rank by @jrmadsen in #61
  • Fix attaching to running process, i.e. omnitrace -p by @jrmadsen in #60
  • Rename OMNITRACE_ROCM_SMI_DEVICES to OMNITRACE_SAMPLING_GPUS by @jrmadsen in #58
  • Update PTL submodule by @jrmadsen in #63
  • libomnitrace uses common headers by @jrmadsen in #62
  • Update timemory submodule by @jrmadsen in #64
  • Update dyninst submodule by @jrmadsen in #65
  • adding perfetto-validation-script by @ratamima in #66
  • Rename OMNITRACE_USE_THREAD_SAMPLING to OMNITRACE_USE_PROCESS_SAMPLING by @jrmadsen in #68
  • tracing NS + category region component + MPI args by @jrmadsen in #52
  • Fix PID resolution + OMNITRACE_VERSION + fix various configs by @jrmadsen in #71
  • omnitrace-avail generate config by @jrmadsen in #69
  • Rework submodule installation by @jrmadsen in #70
  • Fix docs using -D instead of -G by @jrmadsen in #73
  • Adds test which validates errors for missing configs by @jrmadsen in #75
  • Use concurrency in GitHub Actions + remove cancelling by @jrmadsen in #77
  • HIP API args in perfetto + new perfetto categories by @jrmadsen in #76
  • Handle OMNITRACE_ENABLED + minor updates by @jrmadsen in #78

Full Changelog: v1.1.1...v1.2.0

Instructions for installing binary releases

See the documentation here to determine whether your OS supports installation via the pre-built installation scripts. It is possible to use these scripts on similar Linux flavors, e.g. the Ubuntu 20.04 (focal fossa) installer is compatible with Debian 11 (bullseye/sid).

  1. Download the binary installer for your OS with the desired ROCm compatibility (if any)
    • The supported Python versions for all installers in this release are 3.6, 3.7, 3.8, 3.9, and 3.10
  2. Install the dependencies as needed (if not already installed)
    • All binary installers require installing OpenMP, e.g., apt-get install libgomp1 on Ubuntu
    • Packages with ROCm in the name require installing ROCm: instructions can be found here
    • PAPI and OMPT support have no runtime dependencies and thus require no additional installations.
    • All installations have partial MPI support
    • If you do not have Python installed on your system and do not intend to use the Python capabilities, installing Python is not necessary.
  3. Create the installation directory for omnitrace, e.g. mkdir /opt/omnitrace
  4. Run the installer script (see example below)
    • Recommendation: use --exclude-subdir option
  5. Set up the environment via setup-env.sh or environment-modules
    a. Source the setup-env.sh script in <prefix>/share/omnitrace, e.g. source /opt/omnitrace/share/omnitrace/setup-env.sh
    b. Alternatively, run module use <prefix>/share/modulefiles followed by module load omnitrace/1.2.0
  6. Verify that which omnitrace and which omnitrace-avail return <prefix>/bin/omnitrace and <prefix>/bin/omnitrace-avail, respectively

Example for omnitrace-1.2.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh

$ mkdir /opt/omnitrace

$ ./omnitrace-1.2.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh --prefix=/opt/omnitrace --skip-license --exclude-subdir
omnitrace Installer Version: 1.2.0, Copyright (c) Advanced Micro Devices, Inc.
This is a self-extracting archive.
The archive will be extracted to: /opt/omnitrace

Using target directory: /opt/omnitrace
Extracting, please wait...

Unpacking finished successfully

$ source /opt/omnitrace/share/omnitrace/setup-env.sh

$ which omnitrace
/opt/omnitrace/bin/omnitrace

$ which omnitrace-avail
/opt/omnitrace/bin/omnitrace-avail
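
The check in step 6 can also be scripted, which is handy in provisioning scripts. The following is a minimal sketch; verify_omnitrace_install is a hypothetical helper name and is not shipped with omnitrace.

```shell
# Check that each tool resolves to <prefix>/bin/<tool> on the current PATH.
verify_omnitrace_install() {
    prefix="$1"
    for tool in omnitrace omnitrace-avail; do
        found="$(command -v "$tool" 2>/dev/null || true)"
        if [ "$found" != "$prefix/bin/$tool" ]; then
            echo "$tool resolves to '${found:-<not found>}', expected $prefix/bin/$tool" >&2
            return 1
        fi
    done
    echo "omnitrace install at $prefix looks correct"
}
```

For the example above, the call would be verify_omnitrace_install /opt/omnitrace after sourcing setup-env.sh.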

Enabling CPU Hardware Counters

To collect CPU hardware counters, the value of /proc/sys/kernel/perf_event_paranoid may need to be changed.
The default value is 2. To update /proc/sys/kernel/perf_event_paranoid, run:
echo <VALUE> | sudo tee /proc/sys/kernel/perf_event_paranoid

Value | CPU Hardware Counter Capabilities
-1 | Allow use of (almost) all events by all users. Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>=0 | Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN. Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>=1 | Disallow CPU event access by users without CAP_SYS_ADMIN
>=2 | Disallow kernel profiling by users without CAP_SYS_ADMIN
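
Before changing the setting, it can help to see what the current value restricts. The sketch below encodes the table above: the restrictions are cumulative, so a value of 2 implies the restrictions listed for 0 and 1 as well. paranoid_restrictions is a hypothetical helper, and the read of /proc is skipped on systems where the file is not readable.

```shell
# Print what users without CAP_SYS_ADMIN cannot use at a given level.
paranoid_restrictions() {
    level="$1"
    if [ "$level" -lt 0 ]; then
        echo "none: (almost) all events usable by all users"
        return 0
    fi
    echo "ftrace function tracepoints and raw tracepoint access"
    if [ "$level" -ge 1 ]; then echo "CPU event access"; fi
    if [ "$level" -ge 2 ]; then echo "kernel profiling"; fi
}

# Report the restrictions for the current setting, if it can be read.
paranoid_file=/proc/sys/kernel/perf_event_paranoid
if [ -r "$paranoid_file" ]; then
    current="$(cat "$paranoid_file")"
    echo "perf_event_paranoid=$current restricts:"
    paranoid_restrictions "$current"
fi
```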

v1.1.1: KokkosP and timemory fixes

14 Jun 17:21
a142b20

Full Changelog: v1.1.0...v1.1.1

v1.1.0: Bug Fixes, Strict Configs, Perfetto Label Overhaul

10 Jun 22:20
d4b8e25

What's Changed

  • Standalone build examples + testing workflow updates by @jrmadsen in #15
  • CMake updates/fixes + parallel-overhead updates by @jrmadsen in #16
  • New documentation page with youtube links to tutorials by @jrmadsen in #23
  • Update setup-env.sh and modulefile by @jrmadsen in #26
  • Fix perfetto_counter_track string lifetime by @jrmadsen in #28
  • Set OMNITRACE_USE_THREAD_SAMPLING=ON for several tests by @jrmadsen in #29
  • Fix category regex + new features by @jrmadsen in #25
  • Support strict settings option in timemory + expanded config syntax by @jrmadsen in #31
  • Rework sampling trace counter names + new trace counters by @jrmadsen in #30
  • Fix loop-level instrumentation + more by @jrmadsen in #32
  • Fix sampling counter time scales by @jrmadsen in #33
  • Implements --label option for python profiler by @jrmadsen in #34
  • export libomnitrace-dl.so to OMP_TOOL_LIBRARIES by @jrmadsen in #27

Full Changelog: v1.0.0...v1.1.0

v1.0.0: Initial Release of Omnitrace

31 May 03:31
ce29187

Overview

  • Please refer to the documentation about the capabilities of omnitrace.
  • The only external dependency of the binary installers attached to this release is the ROCm version noted in the script name; support for PAPI, OMPT, etc. is built in

Full Changelog: v0.0.1...v1.0.0

Instructions for installing binary releases

See the documentation here to determine whether your OS supports installation via the pre-built installation scripts. It is possible to use these scripts on similar Linux flavors, e.g. the Ubuntu 20.04 (focal fossa) installer is compatible with Debian 11 (bullseye/sid).

  1. Download the binary installer for your OS with the desired ROCm compatibility (if any)
    • The supported Python versions for all installers in this release are 3.6, 3.7, 3.8, 3.9, and 3.10
  2. Install the dependencies as needed (if not already installed)
    • All binary installers require installing OpenMP, e.g., apt-get install libgomp1 on Ubuntu
    • Packages with ROCm in the name require installing ROCm: instructions can be found here
    • PAPI and OMPT support have no runtime dependencies and thus require no additional installations.
    • All installations have partial MPI support
    • If you do not have Python installed on your system and do not intend to use the Python capabilities, installing Python is not necessary.
  3. Create the installation directory for omnitrace, e.g. mkdir /opt/omnitrace
  4. Run the installer script (see example below)
    • Recommendation: use --exclude-subdir option
  5. Set up the environment via setup-env.sh or environment-modules
    a. Source the setup-env.sh script in <prefix>/share/omnitrace, e.g. source /opt/omnitrace/share/omnitrace/setup-env.sh
    b. Alternatively, run module use <prefix>/share/modulefiles followed by module load omnitrace/1.0.0
  6. Verify that which omnitrace and which omnitrace-avail return <prefix>/bin/omnitrace and <prefix>/bin/omnitrace-avail, respectively

Example for omnitrace-1.0.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh

$ mkdir /opt/omnitrace

$ ./omnitrace-1.0.0-ubuntu-20.04-ROCm-50000-PAPI-OMPT-Python3.sh --prefix=/opt/omnitrace --skip-license --exclude-subdir
omnitrace Installer Version: 1.0.0, Copyright (c) Advanced Micro Devices, Inc.
This is a self-extracting archive.
The archive will be extracted to: /opt/omnitrace

Using target directory: /opt/omnitrace
Extracting, please wait...

Unpacking finished successfully

$ source /opt/omnitrace/share/omnitrace/setup-env.sh

$ which omnitrace
/opt/omnitrace/bin/omnitrace

$ which omnitrace-avail
/opt/omnitrace/bin/omnitrace-avail

Enabling CPU Hardware Counters

To collect CPU hardware counters, the value of /proc/sys/kernel/perf_event_paranoid may need to be changed.
The default value is 2. To update /proc/sys/kernel/perf_event_paranoid, run:
echo <VALUE> | sudo tee /proc/sys/kernel/perf_event_paranoid

Value | CPU Hardware Counter Capabilities
-1 | Allow use of (almost) all events by all users. Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK
>=0 | Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN. Disallow raw tracepoint access by users without CAP_SYS_ADMIN
>=1 | Disallow CPU event access by users without CAP_SYS_ADMIN
>=2 | Disallow kernel profiling by users without CAP_SYS_ADMIN