PAPI Statistical Profiling

Treece-Burgess edited this page Jan 31, 2024 · 14 revisions

Statistical Profiling

Statistical profiling involves periodically interrupting a running program and examining the program counter (PC) at the time of each interruption. If this is done over a sufficiently large number of intervals, the resulting program counter distribution is statistically representative of the program's execution profile with respect to the interrupting event. Performance tools like UNIX prof sample the program address with respect to time and hash the value into a histogram. At program completion, the histogram is analyzed and associated with symbolic information contained in the executable. GNU prof, in conjunction with the -p option of the GCC compiler, performs exactly this analysis using process time as the interrupting trigger. PAPI aims to generalize this functionality so that a histogram can be generated using any countable hardware event as the basis for the interrupt signal.

Generating A PC Histogram

A PC histogram can be generated on any countable event by calling either of the following low-level functions:

C:

void *buf;
unsigned bufsiz;
vptr_t offset;
unsigned scale;
int EventSet, EventCode, threshold, flags;
int retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags);

Arguments for PAPI_profil:

  • buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'.
  • bufsiz -- the size of the histogram buffer in bytes.
  • offset -- the start address of the region to be profiled.
  • scale -- a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled.
  • EventSet -- the PAPI EventSet to profile when it is started.
  • EventCode -- code of the Event in the EventSet to profile.
  • threshold -- minimum number of events that must occur before the PC is sampled. If hardware overflow is supported for your component, this threshold will trigger an interrupt when reached.
  • flags -- bit pattern to control profiling behavior.

PAPI_sprofil_t prof;
int profcnt, EventSet, EventCode, threshold, flags;
int retval = PAPI_sprofil(&prof, profcnt, EventSet, EventCode, threshold, flags);

Arguments for PAPI_sprofil:

  • prof -- pointer to an array of PAPI_sprofil_t structures. Each copy of the structure contains the following:
    • buf -- pointer to a buffer of bufsiz bytes in which the histogram counts are stored in an array of unsigned short, unsigned int, or unsigned long long values, or 'buckets'. The size of the buckets is determined by values in the flags argument.
    • bufsiz -- the size of the histogram buffer in bytes. It is computed from the length of the code region to be profiled, the size of the buckets, and the scale factor as discussed below.
    • offset -- the start address of the region to be profiled.
    • scale -- a contraction factor that indicates how much smaller the histogram buffer is than the region to be profiled.
  • profcnt -- number of structures in the prof array for hardware profiling.
  • EventSet -- the PAPI EventSet to profile when it is started.
  • EventCode -- code of the Event in the EventSet to profile.
  • threshold -- number of occurrences of the Event that triggers the overflow handler.
  • flags -- bit pattern to control profiling behavior.

Note: The profiling routines have no Fortran interface.

The defined bit values for the flags variable are shown in the table below:

Defined bit             Description
PAPI_PROFIL_POSIX       Default type of profiling.
PAPI_PROFIL_RANDOM      Drop a random 25% of the samples.
PAPI_PROFIL_WEIGHTED    Weight the samples by their value.
PAPI_PROFIL_COMPRESS    Ignore samples if hash buckets get big.
PAPI_PROFIL_BUCKET_16   Save samples in 16-bit hash buckets.
PAPI_PROFIL_BUCKET_32   Save samples in 32-bit hash buckets.
PAPI_PROFIL_BUCKET_64   Save samples in 64-bit hash buckets.
PAPI_PROFIL_FORCE_SW    Force software overflow in profiling.
PAPI_PROFIL_DATA_EAR    Use data address register profiling.
PAPI_PROFIL_INST_EAR    Use instruction address register profiling.
PAPI_PROFIL_BUCKETS     PAPI_PROFIL_BUCKET_16 or PAPI_PROFIL_BUCKET_32 or PAPI_PROFIL_BUCKET_64.

PAPI_profil creates a histogram of overflow counts for a specified region of the application code. It uses its first four parameters to build the data structure needed by PAPI_sprofil and then calls PAPI_sprofil to do the work. PAPI_sprofil assumes a pre-initialized array of PAPI_sprofil_t structures and enables profiling for the EventSet based on their contents. Note that the EventSet must be in the stopped state for either call to succeed.

More than one hardware event can be profiled at the same time by making multiple independent calls to these functions for the same EventSet before calling PAPI_start. This is useful for generating simultaneous profiles of two or more related events, for example L1 cache misses and L2 cache misses. Profiling can be turned off for a specific event by calling the function for that event with a threshold of zero.

On success, these functions return PAPI_OK; on error, they return a non-zero error code. For more code examples, see profile.c, profile_twoevents.c, or sprofile.c in the ctests directory of the PAPI source distribution.

For a more extensive description of the parameters in the PAPI_profil call, see the PAPI_profil man page.

In the following code example, PAPI_profil is used to generate a PC histogram:

#include <papi.h>
#include <stdio.h>
#include <stdlib.h> 
#include <string.h>

void handle_error (int retval)
{
    printf("PAPI error %d: %s\n", retval, PAPI_strerror(retval));
    exit(1);
}

int main()
{
    int retval;
    void *buf;
    unsigned bufsiz;
    vptr_t offset;
    unsigned scale = 65536;
    int EventSet = PAPI_NULL, EventCode, threshold = 100000, flags = PAPI_PROFIL_POSIX | PAPI_PROFIL_BUCKET_16;
    const PAPI_exe_info_t *prginfo = NULL;
    long long values[2];

    /* Initialize the library */
    retval = PAPI_library_init(PAPI_VER_CURRENT);
    if (retval != PAPI_VER_CURRENT)
        handle_error(retval);
    
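    /* Obtaining address space info */
    prginfo = PAPI_get_executable_info();
    if (prginfo == NULL) {
        handle_error(PAPI_EMISC);
    }
    
    /* With 16-bit buckets and scale = 65536, the buffer needs one byte per
       byte of text. Cast to char * so the subtraction is standard C. */
    offset = prginfo->address_info.text_start;
    bufsiz = (unsigned) ((char *) prginfo->address_info.text_end - (char *) prginfo->address_info.text_start);

    buf = malloc(bufsiz);
    if (buf == NULL) {
        handle_error(PAPI_ENOMEM);
    }
    memset(buf, 0x00, bufsiz);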

    /* Creating an EventSet */
    retval = PAPI_create_eventset(&EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Adding events to the EventSet*/
    EventCode = PAPI_TOT_INS;
    retval = PAPI_add_event(EventSet, EventCode);
    if (retval != PAPI_OK)
        handle_error(retval);

    retval = PAPI_add_event(EventSet, PAPI_FP_OPS);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Enable the collection of profiling information */
    retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, threshold, flags);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Start counting events in an EventSet */
    retval = PAPI_start(EventSet);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Code to monitor */
    
    /* Stop counting events in an EventSet */
    retval = PAPI_stop(EventSet, values);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Disable collection of profiling information by setting threshold to 0 */
    retval = PAPI_profil(buf, bufsiz, offset, scale, EventSet, EventCode, 0, flags);
    if (retval != PAPI_OK)
        handle_error(retval);

    /* Formatting output */
    printf("------------------------------------------\n");
    printf("Test type   : \tPAPI_PROFIL_POSIX\n");
    printf("------------------------------------------\n\n\n");  
    printf("PAPI_profil() hash table.\n");
    printf("address\t\tflat   \n");

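    /* Output: read the buffer as 16-bit buckets to match PAPI_PROFIL_BUCKET_16;
       with scale = 65536 each bucket covers 2 bytes of text */
    unsigned short *buckets = (unsigned short *) buf;
    for (unsigned i = 0; i < bufsiz / sizeof(unsigned short); i++) {
        if (buckets[i]) {
            printf("%#lx\t%u \n",
                  (unsigned long) offset + 2UL * i, (unsigned) buckets[i]);
        }
    }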

    /* Executes if all low-level PAPI
    function calls returned PAPI_OK */
    PAPI_shutdown();
    printf("\033[0;32m\n\nPASSED\n\033[0m");
    exit(0); 
}

Possible Output

------------------------------------------
Test type   :   PAPI_PROFIL_POSIX
------------------------------------------


PAPI_profil() hash table.
address         flat   
0x401402        257 
0x401404        8296176 
0x401408        1 
0x40140a        8427584 
0x40140c        8344544
.
.
0x40148c        4 
0x40148e        3843 
0x401496        1048672 


PASSED

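If every PAPI call succeeds, the program prints output of the form shown above; the addresses and counts depend on the platform and on the code being monitored. Because the example uses PAPI_PROFIL_BUCKET_16, each count is stored in an unsigned short and cannot exceed 65535, so treat the values listed above as illustrative only. On error, the failing call returns a non-zero error code, which handle_error prints before exiting.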
