Performance Analysis
This page presents information on using performance analysis tools with PIConGPU.
Update early 2016 (Score-P 1.X): Michael Sippel's Gist
Update 07/2016 (tested with Score-P 2.X):
Score-P is a measurement infrastructure that combines several open-source performance analysis tools. It can trace and profile massively parallel applications, including hybrid MPI+CUDA programs.
PIConGPU has CMake support for Score-P. When building and installing Score-P, be sure to enable support for MPI and CUDA (and CUPTI).
<user>:<scorep-build-dir>$ ./configure ... --enable-mpi --enable-cuda
When configuring PIConGPU, use the Score-P wrapper scripts for the C++ and NVCC compilers. Example:
# Switch off instrumentation by setting SCOREP_WRAPPER=OFF
<user>:<pic-build-dir>$ SCOREP_WRAPPER=OFF $PICSRC/configure -a sm_35 \
-c "-DCMAKE_CXX_COMPILER=`which scorep-g++` \
-DCUDA_NVCC_EXECUTABLE=`which scorep-nvcc`" \
~/paramSets/case001
# Set instrumentation flags (--user if manual instrumentation is used)
<user>:<pic-build-dir>$ export SCOREP_WRAPPER_INSTRUMENTER_FLAGS="--cuda --mpp=mpi"
<user>:<pic-build-dir>$ make -j
<user>:<pic-build-dir>$ make install
On titan@ORNL, please use scorep-CC instead of scorep-g++.
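Before running configure, it can help to confirm that the wrapper scripts are actually on your PATH, since the backtick `which` calls in the example silently expand to an empty string otherwise. The following is a minimal sketch; the helper name `missing_tools` is hypothetical and the wrapper names are simply those used in the example above.

```shell
#!/bin/sh
# Report which of the given commands cannot be found on PATH.
# Prints one missing command per line; returns 0 only if all are found.
# (missing_tools is an illustrative helper, not part of Score-P or PIConGPU.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return $rc
}

# Wrapper names taken from the configure example; adjust them for your
# system (for example scorep-CC instead of scorep-g++).
if missing=$(missing_tools scorep-g++ scorep-nvcc); then
    echo "Score-P wrappers found"
else
    echo "Missing Score-P wrappers: $missing" >&2
fi
```

If a wrapper is reported missing, check that the bin directory of your Score-P installation is part of PATH in the shell you configure from.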
Tracing with OpenMP support:
- Opari backend: if the error
  error: ‘_ZTW9pomp_tpd_()’ is not a variable in clause ‘copyin’
  occurs, extend SCOREP_WRAPPER_INSTRUMENTER_FLAGS with
  --opari=--omp-tpd:--c++:--omp-tpd-mangling='gnu'
- If the error
  scorep_thread_create_wait_pthread.c:84: Fatal: Bug 'tpd == 0': Invalid Pthread thread specific data object. Please ensure that all pthread_create calls are instrumented.
  is triggered, try disabling Opari and use:
  export SCOREP_WRAPPER_INSTRUMENTER_FLAGS="--cuda --mpp=mpi --thread=omp:ancestry --nopomp"
Before executing PIConGPU, several Score-P environment variables must be set in your batch environment template script. Some template scripts already provide these environment variables, e.g. titan-ornl/batch_scorep_profile.tpl. For your own script, set at least the following (buffer sizes may vary):
export SCOREP_ENABLE_TRACING=yes
export SCOREP_CUDA_ENABLE=yes
export SCOREP_CUDA_BUFFER=200M
export SCOREP_TOTAL_MEMORY=1G
export SCOREP_FILTERING_FILE=!TBG_dstPath/tbg/scorep.filter
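SCOREP_FILTERING_FILE points to a plain-text filter file that tells Score-P which instrumented regions to leave out of the measurement. A minimal sketch of creating such a file is shown below. The SCOREP_REGION_NAMES_BEGIN/END block with EXCLUDE rules is the Score-P filter syntax, but the output path and the excluded patterns (`std::*`, `boost::*`) are placeholders for illustration, not the contents of PIConGPU's own filter.

```shell
#!/bin/sh
# Write a small Score-P filter file.
# FILTER_FILE and the excluded patterns are assumptions for this sketch;
# choose patterns based on what dominates your own profile.
filter_file="${FILTER_FILE:-scorep.filter}"

cat > "$filter_file" <<'EOF'
SCOREP_REGION_NAMES_BEGIN
  EXCLUDE std::* boost::*
SCOREP_REGION_NAMES_END
EOF

echo "Wrote filter to $filter_file"
```

Excluding short, frequently called functions usually reduces both the runtime overhead and the trace size, which in turn lowers the SCOREP_TOTAL_MEMORY needed per process.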
On success, a new directory named scorep-* is created, containing the trace file traces.otf2. The trace can then be visualized with Vampir.
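To check quickly whether a run produced a trace, you can list the OTF2 files inside the Score-P experiment directories. This is a minimal sketch: the helper name `find_traces` is hypothetical, and it assumes the scorep-* directories sit directly below the directory you pass in.

```shell
#!/bin/sh
# Print every *.otf2 file found in scorep-* directories directly below
# the given run directory (defaults to the current directory).
# (find_traces is an illustrative helper, not part of Score-P.)
find_traces() {
    run_dir="${1:-.}"
    for f in "$run_dir"/scorep-*/*.otf2; do
        # The pattern stays unexpanded when nothing matches; skip it.
        if [ -e "$f" ]; then
            echo "$f"
        fi
    done
    return 0
}

find_traces .

# To open the first trace found, assuming the vampir command is on PATH:
# vampir "$(find_traces . | head -n 1)"
```

If nothing is printed, check that SCOREP_ENABLE_TRACING=yes was exported in the batch script and that the job finished normally.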
Further information:
All wiki entries describe the dev branch. Features may be different in the current master branch.