PAPI Statistical Profiling
Statistical profiling involves periodically interrupting a running program and examining the program counter at the time of the interruption. If this is done for a reasonable number of interrupting intervals, the resulting program counter distribution will be statistically representative of the execution profile of the program with respect to the interrupting event. Performance tools like UNIX prof sample the program address with respect to time and hash the value into a histogram. At program completion, the histogram is analyzed and associated with symbolic information contained in the executable. GNU prof, in conjunction with the -p option of the GCC compiler, performs exactly this analysis using process time as the interrupting trigger. PAPI aims to generalize this functionality so that a histogram can be generated using any countable hardware event as the basis for the interrupt signal.
A PC histogram can be generated on any countable event by calling either of the following low-level functions:
void *buf;
unsigned bufsiz;
vptr_t offset;
unsigned scale;
int EventSet, EventCode, threshold, flags;
int retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags);
Arguments for PAPI_profil:
- buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'.
- bufsiz -- the size of the histogram buffer in bytes.
- offset -- the start address of the region to be profiled.
- scale -- a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled.
- EventSet -- the PAPI EventSet to profile when it is started.
- EventCode -- code of the Event in the EventSet to profile.
- threshold -- minimum number of events that must occur before the PC is sampled. If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached.
- flags -- bit pattern to control profiling behavior.
PAPI_sprofil_t prof;
int profcnt, EventSet, EventCode, threshold, flags;
int retval = PAPI_sprofil(&prof, profcnt, EventSet, EventCode, threshold, flags);
Arguments for PAPI_sprofil:
- prof -- pointer to an array of PAPI_sprofil_t structures. Each copy of the structure contains the following:
  - buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'. The size of the buckets is determined by values in the flags argument.
  - bufsiz -- the size of the histogram buffer in bytes. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed below.
  - offset -- the start address of the region to be profiled.
  - scale -- a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled.
- profcnt -- number of structures in the prof array for hardware profiling.
- EventSet -- the PAPI EventSet to profile when it is started.
- EventCode -- code of the Event in the EventSet to profile.
- threshold -- number of occurrences of the Event that triggers the profiling handler.
- flags -- bit pattern to control profiling behavior.
Note: The profiling routines have no Fortran interface.
The defined bit values for the flags variable are shown in the table below:
Defined bit | Description |
---|---|
PAPI_PROFIL_POSIX | Default type of profiling. |
PAPI_PROFIL_RANDOM | Drop a random 25% of the samples. |
PAPI_PROFIL_WEIGHTED | Weight the samples by their value. |
PAPI_PROFIL_COMPRESS | Ignore samples if hash buckets get big. |
PAPI_PROFIL_BUCKET_16 | Save samples in 16-bit hash buckets. |
PAPI_PROFIL_BUCKET_32 | Save samples in 32-bit hash buckets. |
PAPI_PROFIL_BUCKET_64 | Save samples in 64-bit hash buckets. |
PAPI_PROFIL_FORCE_SW | Force software overflow in profiling. |
PAPI_PROFIL_DATA_EAR | Use data address register profiling. |
PAPI_PROFIL_INST_EAR | Use instruction address register profiling. |
PAPI_PROFIL_BUCKETS | PAPI_PROFIL_BUCKET_16 or PAPI_PROFIL_BUCKET_32 or PAPI_PROFIL_BUCKET_64. |
PAPI_profil creates a histogram of overflow counts for a specified region of the application code. It uses its first four parameters to create the data structures needed by PAPI_sprofil and then calls PAPI_sprofil to do the work. PAPI_sprofil assumes a pre-initialized PAPI_sprofil_t structure and enables profiling for the EventSet based on its contents. Note that the EventSet must be in the stopped state for either call to succeed.
More than one hardware event can be profiled at the same time by making multiple independent calls to these functions for the same EventSet before calling PAPI_start. This can be useful for simultaneously generating profiles of two or more related events, for example L1 cache misses and L2 cache misses. Profiling can be turned off for a specific event by calling the function for that event with a threshold of zero.
On success, these functions return PAPI_OK; on error, they return a non-zero error code. For more code examples, see profile.c, profile_twoevents.c, or sprofile.c in the ctests directory of the PAPI source distribution.
For a more extensive description of the parameters in the PAPI_profil call, see the PAPI_profil man page.
In the following code example, PAPI_profil is used to generate a PC histogram:
#include <papi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void handle_error (int retval)
{
    printf("PAPI error %d: %s\n", retval, PAPI_strerror(retval));
    exit(1);
}
int main(void)
{
    int retval;
    void *buf;
    unsigned bufsiz;
    vptr_t offset;
    unsigned scale = 65536;
    int EventSet = PAPI_NULL, EventCode, threshold = 100000;
    int flags = PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16;
    const PAPI_exe_info_t *prginfo = NULL;
    long long values[2];
    /* Initialize the library */
    retval = PAPI_library_init(PAPI_VER_CURRENT);
    if (retval != PAPI_VER_CURRENT)
        handle_error(retval);
    /* Obtaining address space info */
    prginfo = PAPI_get_executable_info();
    if (prginfo == NULL) {
        handle_error(PAPI_EMISC);
    }
    /* Profile the whole text segment; cast to char * so that the
       pointer subtraction is valid standard C */
    offset = prginfo->address_info.text_start;
    bufsiz = (unsigned) ((char *) prginfo->address_info.text_end -
                         (char *) prginfo->address_info.text_start);
    buf = malloc(bufsiz);
    if (buf == NULL) {
        handle_error(PAPI_ENOMEM);
    }
    memset(buf, 0x00, bufsiz);
    /* Creating an EventSet */
    retval = PAPI_create_eventset(&EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Adding events to the EventSet */
    EventCode = PAPI_TOT_INS;
    retval = PAPI_add_event(EventSet, EventCode);
    if (retval != PAPI_OK)
        handle_error(retval);
    retval = PAPI_add_event(EventSet, PAPI_FP_OPS);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Enable the collection of profiling information */
    retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Start counting events in an EventSet */
    retval = PAPI_start(EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Code to monitor */
    /* Stop counting events in an EventSet */
    retval = PAPI_stop(EventSet, values);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Disable collection of profiling information by setting threshold to 0 */
    retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, 0, flags);
    if (retval != PAPI_OK)
        handle_error(retval);
    /* Formatting output */
    printf("------------------------------------------\n");
    printf("Test type : \tPAPI_PROFIL_POSIX\n");
    printf("------------------------------------------\n\n\n");
    printf("PAPI_profil() hash table.\n");
    printf("address\t\tflat \n");
    /* Output: PAPI_PROFIL_BUCKET_16 was requested, so every bucket is an
       unsigned short; print only the non-empty buckets */
    unsigned short *profbuf = (unsigned short *) buf;
    for (unsigned i = 0; i < bufsiz / sizeof(unsigned short); i++) {
        if (profbuf[i]) {
            printf("%#lx\t%u \n",
                   (unsigned long) offset + 2UL * i, (unsigned) profbuf[i]);
        }
    }
    /* Executes if all low-level PAPI
       function calls returned PAPI_OK */
    free(buf);
    PAPI_shutdown();
    printf("\033[0;32m\n\nPASSED\n\033[0m");
    exit(0);
}
------------------------------------------
Test type : PAPI_PROFIL_POSIX
------------------------------------------
PAPI_profil() hash table.
address flat
0x401402 257
0x401404 8296176
0x401408 1
0x40140a 8427584
0x40140c 8344544
.
.
0x40148c 4
0x40148e 3843
0x401496 1048672
PASSED
On success, all PAPI functions return PAPI_OK and the program prints output similar to the above; the exact addresses and counts depend on the platform and on the code being monitored. On error, a non-zero error code is returned.