PAPI_ipc fails with Event does not exist on 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz #126

Open
milianw opened this issue Nov 26, 2023 · 11 comments

milianw commented Nov 26, 2023

A simple call to PAPI_ipc fails on my ThinkPad P1 Gen4 laptop, whereas the same call used to work fine on my older laptop and also works on my workstation with an AMD CPU. The CPU is not yet one of the hybrid Intel CPUs, and I have elevated my perf privileges (perf stat works just fine).

#include <papi.h>

#include <iostream>

struct Ipc
{
    static Ipc measure()
    {
        Ipc data;
        int ret = PAPI_ipc(&data.realTime, &data.processTime,
                           &data.instructions, &data.ipc);
        if (ret != 0) {
            std::cerr << "IPC measurement failed with code " << ret << ": "
                      << PAPI_strerror(ret) << std::endl;
        }

        return data;
    }

    void print(const char* label) const
    {
        std::cout << label
                  << "\n\trealtime elapsed: " << realTime
                  << ", process time elapsed: " << processTime
                  << "\n\tinstructions executed: " << instructions
                  << ", cycles: " << (instructions / ipc)
                  << ", IPC: " << ipc
                  << "\n";
    }

    float realTime = 0;
    float processTime = 0;
    long long instructions = 0;
    float ipc = 0;
};

int main()
{
    Ipc::measure().print("test");
    return 0;
}

Compiled with:

$ g++ -g -O2 test.cpp -lpapi -o test_papi
$ ./test_papi 
IPC measurement failed with code -7: Event does not exist
test
        realtime elapsed: 0, process time elapsed: 0
        instructions executed: 0, cycles: -nan, IPC: 0

With strace I can see:

perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_INSTRUCTIONS, sample_period=0, sample_type=0, read_format=0, precise_ip=0 /* arbitrary skid */, ...}, 0, -1, -1, 0) = 3
close(3)                                = 0
perf_event_open({type=PERF_TYPE_HARDWARE, size=0 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_INSTRUCTIONS, sample_period=0, sample_type=0, read_format=0, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 0, -1, -1, 0) = 3
close(3)                                = 0

When I instead run strace perf stat -e instructions on some binary I see:

perf_event_open({type=PERF_TYPE_HARDWARE, size=0x88 /* PERF_ATTR_SIZE_??? */, config=PERF_COUNT_HW_INSTRUCTIONS, sample_period=0, sample_type=PERF_SAMPLE_IDENTIFIER, read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING, disabled=1, inherit=1, enable_on_exec=1, precise_ip=0 /* arbitrary skid */, exclude_guest=1, ...}, 4762, -1, -1, PERF_FLAG_FD_CLOEXEC) = 3
...
read(3, "\202T \0\0\0\0\0\250\306\6\0\0\0\0\0\250\306\6\0\0\0\0\0", 24) = 24
close(3)                                = 0
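
For readers following the strace output: with read_format=PERF_FORMAT_TOTAL_TIME_ENABLED|PERF_FORMAT_TOTAL_TIME_RUNNING and no group format, the 24 bytes returned by read() are three 64-bit values (counter value, time enabled, time running, the times in nanoseconds). A minimal sketch of decoding such a buffer follows. The names CounterReading, load_le64 and decode_reading are made up for illustration, and the explicit little-endian decoding is an assumption that matches the x86_64 machine in this report.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration type: one counter read with
// PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING set.
struct CounterReading {
    std::uint64_t value = 0;       // raw event count
    std::uint64_t timeEnabled = 0; // nanoseconds the event was enabled
    std::uint64_t timeRunning = 0; // nanoseconds the event was on the PMU
};

// Decode 8 bytes as a little-endian unsigned 64-bit integer.
// Assumption: the buffer came from a little-endian host such as x86_64.
static std::uint64_t load_le64(const unsigned char* p)
{
    std::uint64_t v = 0;
    for (int i = 7; i >= 0; --i)
        v = (v << 8) | p[i];
    return v;
}

// Returns false if the buffer is too short to hold the three fields.
bool decode_reading(const unsigned char* buf, std::size_t len, CounterReading& out)
{
    if (len < 24)
        return false;
    out.value = load_le64(buf);
    out.timeEnabled = load_le64(buf + 8);
    out.timeRunning = load_le64(buf + 16);
    return true;
}
```

Applied to the bytes shown above, this yields a count of 2118786 instructions with 444072 ns enabled and 444072 ns running, i.e. the event was not multiplexed.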

My system:

inxi -GSC -xx
System:
  Host: agathemoarbauer Kernel: 6.6.2-arch1-1 arch: x86_64 bits: 64
    compiler: gcc v: 13.2.1 Desktop: KDE Plasma v: 5.27.9 tk: Qt v: 5.15.11
    wm: kwin_x11 dm: SDDM Distro: Arch Linux
CPU:
  Info: 8-core model: 11th Gen Intel Core i7-11850H bits: 64 type: MT MCP
    arch: Tiger Lake rev: 1 cache: L1: 640 KiB L2: 10 MiB L3: 24 MiB
  Speed (MHz): avg: 969 high: 3506 min/max: 800/4800 cores: 1: 800 2: 800
    3: 800 4: 800 5: 800 6: 800 7: 800 8: 3506 9: 800 10: 800 11: 800 12: 800
    13: 800 14: 800 15: 800 16: 800 bogomips: 79888
  Flags: avx avx2 ht lm nx pae sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx
Contributor

gcongiu commented Nov 27, 2023 via email

Contributor

gcongiu commented Nov 27, 2023 via email

Author

milianw commented Nov 29, 2023

Reverting the patch doesn't seem to help.

Here's the output before elevating my privileges:
papi_avail.txt
papi_component_avail.txt
papi_native_avail.txt

Here's the output after elevating my privileges:
papi_avail.2.txt
papi_component_avail.2.txt
papi_native_avail.2.txt

Author

milianw commented Nov 29, 2023

For good measure, I'm also attaching the output of perf list which shows a ton of stuff is actually available on my system (and it works when I use e.g. perf stat or perf record).
perf.list.txt

Contributor

gcongiu commented Nov 29, 2023

Your CPU's microarchitecture is Tiger Lake, which is unsupported by PAPI preset events. That is the reason your test is not working as expected. PAPI_ipc uses the PAPI_TOT_INS and PAPI_TOT_CYC preset events.

Contributor

adanalis commented Nov 29, 2023 via email

Author

milianw commented Nov 30, 2023

I am willing to contribute back if that helps the project.

But quite frankly I'm pretty surprised by all this - my assumption was that PAPI_ipc uses the same API as perf stat internally, and would thus simply request cycles and instructions - i.e. PERF_COUNT_HW_CPU_CYCLES and PERF_COUNT_HW_INSTRUCTIONS. Why is that not done?

Then, assuming there's a good reason for doing that for the existing covered platforms in PAPI - couldn't you add a generic fallback using these generic counters?
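
To make the suggestion concrete, here is a rough sketch of what such a generic fallback could look like when written directly against perf_event_open. This is not how PAPI implements PAPI_ipc. The names open_hw_counter, GenericIpc and measure_generic_ipc are hypothetical, and the choice to exclude kernel and hypervisor counts is an assumption made so the sketch works with lower perf privileges.

```cpp
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstring>

// Hypothetical helper: open one generic hardware counter for the calling
// thread on any CPU. Pass group_fd = -1 to create a group leader.
// Returns -1 on failure (errno is set by the kernel).
static int open_hw_counter(std::uint64_t config, int group_fd)
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = config;
    attr.disabled = (group_fd == -1) ? 1 : 0; // members follow the leader
    attr.exclude_kernel = 1;                  // assumption: user space only
    attr.exclude_hv = 1;
    // glibc provides no wrapper for perf_event_open, so use syscall().
    return static_cast<int>(
        syscall(SYS_perf_event_open, &attr, 0, -1, group_fd, 0UL));
}

struct GenericIpc {
    std::uint64_t instructions = 0;
    std::uint64_t cycles = 0;
    // Avoids the NaN seen in the report when nothing was counted.
    double ipc() const
    {
        return cycles ? static_cast<double>(instructions) / cycles : 0.0;
    }
};

// Runs work() with both counters enabled as one group. Returns false and
// leaves `out` untouched if the counters cannot be opened or read.
template <typename Work>
bool measure_generic_ipc(Work&& work, GenericIpc& out)
{
    const int cycles = open_hw_counter(PERF_COUNT_HW_CPU_CYCLES, -1);
    if (cycles < 0)
        return false;
    const int instructions = open_hw_counter(PERF_COUNT_HW_INSTRUCTIONS, cycles);
    if (instructions < 0) {
        close(cycles);
        return false;
    }

    ioctl(cycles, PERF_EVENT_IOC_RESET, PERF_IOC_FLAG_GROUP);
    ioctl(cycles, PERF_EVENT_IOC_ENABLE, PERF_IOC_FLAG_GROUP);
    work();
    ioctl(cycles, PERF_EVENT_IOC_DISABLE, PERF_IOC_FLAG_GROUP);

    // With read_format == 0, each read() returns a single 64-bit count.
    GenericIpc result;
    const ssize_t want = static_cast<ssize_t>(sizeof(std::uint64_t));
    const bool ok = read(cycles, &result.cycles, sizeof(result.cycles)) == want
        && read(instructions, &result.instructions, sizeof(result.instructions)) == want;

    close(instructions);
    close(cycles);
    if (ok)
        out = result;
    return ok;
}
```

A caller would pass the code under test as a lambda, for example measure_generic_ipc([&] { run_workload(); }, result), where run_workload is a placeholder, and then print result.instructions, result.cycles and result.ipc(). Grouping the two events means they are scheduled onto the PMU together, so the ratio is taken over the same interval.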

Contributor

adanalis commented Nov 30, 2023 via email

Author

milianw commented Nov 30, 2023

Can you please elaborate on that? For the sake of PAPI_ipc, what testing do you need, and what are you measuring if not the same thing as perf stat -e cycles,instructions? If that does something wrong, then it would be a kernel bug, no? Can you not rely on the kernel to give you correct data?

Contributor

adanalis commented Nov 30, 2023 via email

Author

milianw commented Nov 30, 2023

I have never come across such situations, short of platforms with broken PMUs (like iMX6). It would be really good if PAPI_ipc and similar "simple" APIs would work as long as we can find suitable events in e.g. /sys/bus/event_source/devices/cpu/events.

thanks
