
Adds topdown support for Skylake and Cascade Lake #629

Draft · wants to merge 8 commits into master
Conversation

ilumsden
Contributor

@ilumsden ilumsden commented Dec 2, 2024

This PR is a follow-up to discussions @jessdagostini and I had during SC.

This PR adds topdown support for Intel Skylake (and derivative architectures like Cascade Lake) processors. Support can be enabled by setting -DWITH_ARCH to one of the following:

  • "skylake"
  • "skylake_avx512"
  • "cascadelake"

As with the Sapphire Rapids support, users will not need to set this manually when using Spack.

Currently, all of the level 1 and level 2 metrics are available through the topdown.top and topdown.all runtime configurations. Similar to Sapphire Rapids, users can get the values for the underlying counters used to calculate the topdown metrics using the topdown-counters.top and topdown-counters.all configurations.
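For reference, the level-1 topdown metrics follow the standard Intel Top-down Microarchitecture Analysis (TMA) formulas, computed from the five hardware counters the topdown service programs through PAPI (the counter list appears in the CALI_PAPI_COUNTERS lines later in this thread). A minimal Python sketch of those formulas — the function name and counter values here are made up for illustration and are not Caliper's actual API:

```python
# Sketch of the standard Intel TMA level-1 formulas that the topdown
# service computes on Skylake-class cores. Counter values are invented
# for illustration; real values come from PAPI at runtime.

def topdown_level1(clk_thread, idq_uops_not_delivered, int_misc_recovery,
                   uops_issued, uops_retired):
    """Return the four level-1 topdown fractions."""
    slots = 4.0 * clk_thread  # 4 issue slots per core cycle on Skylake
    frontend_bound = idq_uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired
                       + 4.0 * int_misc_recovery) / slots
    retiring = uops_retired / slots
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "topdown.frontend_bound": frontend_bound,
        "topdown.bad_speculation": bad_speculation,
        "topdown.retiring": retiring,
        "topdown.backend_bound": backend_bound,
    }

metrics = topdown_level1(clk_thread=1_000_000,
                         idq_uops_not_delivered=700_000,
                         int_misc_recovery=25_000,
                         uops_issued=2_200_000,
                         uops_retired=2_000_000)
# The four fractions always sum to 1 by construction.
assert abs(sum(metrics.values()) - 1.0) < 1e-9
```

By construction the four fractions sum to 1, which is a handy sanity check on any topdown.toplevel output.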

@jessdagostini

jessdagostini commented Dec 4, 2024

First of all, thanks for addressing this functionality so quickly @ilumsden!

I have started some tests using Caliper on an Intel Xeon Platinum 8260, which is a Cascade Lake processor. I am trying to collect topdown results from a pangenome application (bioinformatics code).
I have instrumented my code with

int main() {
    read_inputs();

    cali_config_set("CALI_CALIPER_ATTRIBUTE_DEFAULT_SCOPE", "process");
    CALI_MARK_BEGIN("main");
    CALI_MARK_BEGIN("parallel");

    // OpenMP parallel for loop with dynamic scheduling

    CALI_MARK_END("parallel");

    write_extensions();

    CALI_MARK_END("main");
}

Running with CALI_CONFIG='runtime-report' ./exec </path/to/bin/file> </path/to/pangenome/file> <num_threads>, I got this runtime output:

Path       Min time/rank Max time/rank Avg time/rank Time %    
main           11.046940     11.046940     11.046940 13.153955 
  parallel     72.934898     72.934898     72.934898 86.845977

However, running with CALI_CONFIG='runtime-report(topdown.all)' ./exec </path/to/bin/file> </path/to/pangenome/file> <num_threads> takes more than 2 hours (it still hadn't finished while I was writing this message) for the same application and inputs. From what I can observe, even when set to run with 48 threads, the process seems to run sequentially. What intrigues me is that the sequential time for this application and input, without Caliper, is around ~18 s. Am I missing something?

UPDATE:
Even with a smaller input (which takes less than ~2 s to run without Caliper topdown), the analysis still takes a long time and has not finished yet (it has been running for 10+ minutes now).

@ilumsden
Contributor Author

ilumsden commented Dec 4, 2024

That's really weird. I definitely would not expect this to add that much time. There's one thing I can think of checking in the topdown service, but, in the meantime, can you try running with CALI_CONFIG='runtime-report(topdown-counters.all)', @jessdagostini? I'm wondering if this has something to do with PAPI.

@jessdagostini

@ilumsden Yes! I did that, it finished, and this is the output:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: CALI_CONFIG: error: Unknown option: topdown-counters.all
Finished mapping

I tried changing it to CALI_CONFIG='runtime-report(topdown.toplevel)' and this one runs, but it returned:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
Finished mapping
cali-query: TreeFormatter: Attribute "any#any#topdown.retiring" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.backend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.frontend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.bad_speculation" not found.
Path       Min time/rank Max time/rank Avg time/rank Time %   
main            0.000878      0.000878      0.000878 1.620702 
  parallel      0.001954      0.001954      0.001954 3.607063

@ilumsden
Contributor Author

ilumsden commented Dec 4, 2024

Now that's interesting. It looks like there's some issue with recognizing the Caliper option specs for topdown. Let me dig into this.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

@jessdagostini could you try rebuilding and re-running with runtime-report(topdown-counters.all)? When you re-run, please add the environment variable CALI_LOG_VERBOSITY=3. That should produce more debugging output and give me a better sense of what's going on.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Also, can you share the output of spack arch?

@jessdagostini

Ok! I re-installed Caliper using Spack. I am testing with the small test case just to check whether we get output.
It seems topdown-counters.all was not recognized again:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: CALI_CONFIG: error: Unknown option: topdown-counters.all
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: Releasing Caliper global data.
  Max active channels: 0
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 30 nodes
   Metadata memory pool: 1 MiB reserved, 18.166 KiB used
  Thread blackboard: max 0 entries (0% occupancy).
== CALIPER: Finished

If I run with topdown.toplevel, the output gets bigger:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing
== CALIPER: papi: Found 5 event codes for 1 PAPI component(s)
== CALIPER: papi: Creating eventset with 5 events for component 0 (perf_event)
== CALIPER: papi: Initializing multiplex support for component 0 (perf_event)
== CALIPER: runtime-report: Registered papi service
== CALIPER: runtime-report: Registered topdown service. Level: top.
== CALIPER: Configuration:
CALI_AGGREGATE_KEY=
CALI_CHANNEL_FLUSH_ON_EXIT=false
CALI_CHANNEL_CONFIG_CHECK=true
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_EVENT_EXCLUDE_REGIONS=
CALI_EVENT_INCLUDE_BRANCHES=
CALI_EVENT_INCLUDE_REGIONS=
CALI_EVENT_ENABLE_SNAPSHOT_INFO=false
CALI_EVENT_REGION_LEVEL=0
CALI_EVENT_TRIGGER=
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_MPIREPORT_WRITE_ON_FINALIZE=false
CALI_MPIREPORT_LOCAL_CONFIG= let sum#time.duration=scale(sum#time.duration.ns,1e-9),o_a_v.slot=first(aggregate.slot) select  sum(sum#time.duration),any(topdown.retiring) as "Retiring",any(topdown.backend_bound) as "Backend bound",any(topdown.frontend_bound) as "Frontend bound",any(topdown.bad_speculation) as "Bad speculation" group by path aggregate min(o_a_v.slot) order by min#o_a_v.slot
CALI_MPIREPORT_CONFIG= select  min(sum#sum#time.duration) as "Min time/rank",max(sum#sum#time.duration) as "Max time/rank",avg(sum#sum#time.duration) as "Avg time/rank",percent_total(sum#sum#time.duration)  as "Time %",any(any#topdown.retiring) as "Retiring",any(any#topdown.backend_bound) as "Backend bound",any(any#topdown.frontend_bound) as "Frontend bound",any(any#topdown.bad_speculation) as "Bad speculation" group by path aggregate min(min#o_a_v.slot) order by min#min#o_a_v.slot format tree()
CALI_MPIREPORT_APPEND=true
CALI_MPIREPORT_FILENAME=stderr
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=CPU_CLK_THREAD_UNHALTED:THREAD_P,IDQ_UOPS_NOT_DELIVERED:CORE,INT_MISC:RECOVERY_CYCLES,UOPS_ISSUED:ANY,UOPS_RETIRED:RETIRE_SLOTS
CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timer,topdown
CALI_TIMER_INCLUSIVE_DURATION=false
CALI_TOPDOWN_LEVEL=top
== CALIPER: runtime-report: event: Marked attribute mpi.function
== CALIPER: Registered builtin ConfigManager
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: builtin.configmgr: Flushing Caliper data
== CALIPER: runtime-report: Flushing Caliper data
== CALIPER: runtime-report: Aggregate: flushed 3 snapshots.
cali-query: TreeFormatter: Attribute "any#any#topdown.retiring" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.backend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.frontend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.bad_speculation" not found.
Path       Min time/rank Max time/rank Avg time/rank Time %   
main            0.001979      0.001979      0.001979 3.489282 
  parallel      0.001901      0.001901      0.001901 3.351880 
== CALIPER: CaliperMetadataDB: stored 86 nodes, 59 strings.
== CALIPER: Releasing channel builtin.configmgr
== CALIPER: builtin.configmgr: Finishing mpi service
== CALIPER: Releasing channel runtime-report
== CALIPER: runtime-report: Aggregate: Releasing aggregation DB.
  max hash len: 1, 4 entries, 24 kernels, 1.5 MiB reserved.
== CALIPER: runtime-report: Finishing mpi service
== CALIPER: runtime-report: papi: Finishing
== CALIPER: runtime-report: papi: Created 1 PAPI event set(s) on 1 thread(s).
== CALIPER: papi: Shutdown
== CALIPER: runtime-report: topdown: Computed topdown metrics for 0 records, skipped 3
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      0 computed, 3 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.
== CALIPER: builtin.configmgr channel blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: runtime-report channel blackboard: max 1 entries (0.0979432% occupancy).
== CALIPER: Releasing Caliper global data.
  Max active channels: 1
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 139 nodes
   Metadata memory pool: 1 MiB reserved, 20.1582 KiB used
  Thread blackboard: max 3 entries (0.29383% occupancy).
== CALIPER: Finished

The output of my spack arch is linux-ubuntu22.04-cascadelake

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

That helps a lot. It looks like there's an issue when accessing data from the PAPI service. Most likely, I have a typo in one or more of the counter names. I'll look into that.

I'm not sure why topdown-counters.all is not working though.

@jessdagostini

Ok, I have an update. I think my example was too small, so not enough data was collected.
If I run the full test case using topdown.toplevel, I get results:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing
== CALIPER: papi: Found 5 event codes for 1 PAPI component(s)
== CALIPER: papi: Creating eventset with 5 events for component 0 (perf_event)
== CALIPER: papi: Initializing multiplex support for component 0 (perf_event)
== CALIPER: runtime-report: Registered papi service
== CALIPER: runtime-report: Registered topdown service. Level: top.
== CALIPER: Configuration:
CALI_AGGREGATE_KEY=
CALI_CHANNEL_FLUSH_ON_EXIT=false
CALI_CHANNEL_CONFIG_CHECK=true
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_EVENT_EXCLUDE_REGIONS=
CALI_EVENT_INCLUDE_BRANCHES=
CALI_EVENT_INCLUDE_REGIONS=
CALI_EVENT_ENABLE_SNAPSHOT_INFO=false
CALI_EVENT_REGION_LEVEL=0
CALI_EVENT_TRIGGER=
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_MPIREPORT_WRITE_ON_FINALIZE=false
CALI_MPIREPORT_LOCAL_CONFIG= let sum#time.duration=scale(sum#time.duration.ns,1e-9),o_a_v.slot=first(aggregate.slot) select  sum(sum#time.duration),any(topdown.retiring) as "Retiring",any(topdown.backend_bound) as "Backend bound",any(topdown.frontend_bound) as "Frontend bound",any(topdown.bad_speculation) as "Bad speculation" group by path aggregate min(o_a_v.slot) order by min#o_a_v.slot
CALI_MPIREPORT_CONFIG= select  min(sum#sum#time.duration) as "Min time/rank",max(sum#sum#time.duration) as "Max time/rank",avg(sum#sum#time.duration) as "Avg time/rank",percent_total(sum#sum#time.duration)  as "Time %",any(any#topdown.retiring) as "Retiring",any(any#topdown.backend_bound) as "Backend bound",any(any#topdown.frontend_bound) as "Frontend bound",any(any#topdown.bad_speculation) as "Bad speculation" group by path aggregate min(min#o_a_v.slot) order by min#min#o_a_v.slot format tree()
CALI_MPIREPORT_APPEND=true
CALI_MPIREPORT_FILENAME=stderr
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=CPU_CLK_THREAD_UNHALTED:THREAD_P,IDQ_UOPS_NOT_DELIVERED:CORE,INT_MISC:RECOVERY_CYCLES,UOPS_ISSUED:ANY,UOPS_RETIRED:RETIRE_SLOTS
CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timer,topdown
CALI_TIMER_INCLUSIVE_DURATION=false
CALI_TOPDOWN_LEVEL=top
== CALIPER: runtime-report: event: Marked attribute mpi.function
== CALIPER: Registered builtin ConfigManager
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: builtin.configmgr: Flushing Caliper data
== CALIPER: runtime-report: Flushing Caliper data
== CALIPER: runtime-report: Aggregate: flushed 3 snapshots.
Path       Min time/rank Max time/rank Avg time/rank Time %    Retiring Backend bound Frontend bound Bad speculation 
main           11.536610     11.536610     11.536610 20.660444 0.540284      0.230441       0.179589        0.049686 
  parallel     44.253396     44.253396     44.253396 79.251601 0.424175      0.235877       0.251795        0.088152 
== CALIPER: CaliperMetadataDB: stored 107 nodes, 75 strings.
== CALIPER: Releasing channel builtin.configmgr
== CALIPER: builtin.configmgr: Finishing mpi service
== CALIPER: Releasing channel runtime-report
== CALIPER: runtime-report: Aggregate: Releasing aggregation DB.
  max hash len: 1, 4 entries, 24 kernels, 1.5 MiB reserved.
== CALIPER: runtime-report: Finishing mpi service
== CALIPER: runtime-report: papi: Finishing
== CALIPER: runtime-report: papi: Created 1 PAPI event set(s) on 1 thread(s).
== CALIPER: papi: Shutdown
== CALIPER: runtime-report: topdown: Computed topdown metrics for 2 records, skipped 1
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      2 computed, 1 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.
== CALIPER: builtin.configmgr channel blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: runtime-report channel blackboard: max 1 entries (0.0979432% occupancy).
== CALIPER: Releasing Caliper global data.
  Max active channels: 1
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 139 nodes
   Metadata memory pool: 1 MiB reserved, 20.1582 KiB used
  Thread blackboard: max 3 entries (0.29383% occupancy).
== CALIPER: Finished

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Interesting. There's still one thing in that output that concerns me.

== CALIPER: runtime-report: topdown: Computed topdown metrics for 2 records, skipped 1
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      2 computed, 1 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.

This output means that it tried to calculate three sets of topdown metrics. Usually, there would be one calculation per Caliper region. However, you only have two regions, so I'm not sure why it tried to do three calculations.

@daboehme, do you have any idea what could be going on with this and the topdown-counters thing?

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

@jessdagostini when you get the chance, could you try re-running with topdown.all and CALI_LOG_VERBOSITY=3? I'm still not entirely sure why you were getting such high runtimes with topdown.all, but the extra logging should help figure out the issue. If it's taking longer than ~5 minutes when you run it, just kill the program and share whatever you got from Caliper's logging.

@daboehme
Member

daboehme commented Dec 5, 2024

@daboehme, do you have any idea what could be going on with this and the topdown-counters thing?

So the skipped record is probably for the first region begin event in the program, which won't even show up in the profile output. You can safely ignore that.

I took out the topdown-counters option recently since we never used it. You can try it manually with a config like this, substituting in the list of counters:

CALI_SERVICES_ENABLE=event,trace,recorder,papi,timer
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=(the list of counters)
CALI_LOG_VERBOSITY=3

The program being stuck is strange indeed. With just two Caliper regions outside of the parallel loop, there should be essentially no overhead. I'm not sure if the PAPI multiplexing is doing something weird, though we were able to use it with reasonably long-running programs in the past.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Oh, I didn't realize that about topdown-counters. I can remove those.

As for the hanging/slowdown, PAPI multiplexing overheads were my first thought too. I'm curious about the logging output just to make sure it's nothing else.

@jessdagostini what version of PAPI are you using?

@jessdagostini

jessdagostini commented Dec 5, 2024

I started a topdown.all trial last night before leaving the lab. To my surprise, it was still running when I arrived today.

The Caliper logs for this run are

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing

I killed the process just now, and no other log lines were added. Indeed, the last thing here is PAPI multiplexing. The PAPI version that was installed as a Spack dependency of Caliper is papi@5.7. I am sharing the whole dependency stack of my installation here just in case. Everything is built with GCC 11.3.0.

@jessdagostini

jessdagostini commented Jan 10, 2025

Hi @ilumsden
Do you have any updates on this? Is there any other information I can share to help? :)

@ilumsden
Contributor Author

Hey @jessdagostini. Sorry for the delay. I had to shift my focus to other things for a bit (especially a time-sensitive project where we're ramping up to run on El Cap before it goes into SCF). I should have time to work on Cascade Lake support this week.

One quick thing that I just thought of that you could try is upgrading the version of PAPI. 5.7 is a pretty old version (current is in the 7.X series), so I wonder if there was some sort of bug that got fixed in newer versions.

@ilumsden
Contributor Author

@jessdagostini I've done a bit more work on this. I can get the current version to run on LLNL's Ruby system using PAPI 6.0.0.1. The values I'm getting are not correct right now, but I may not be running long enough to get valid data from PAPI. I'll need to look into that before I can determine whether the invalid values are due to the runtime being too short or due to bad logic.

@jessdagostini

No worries! I was just wondering if we could look at it again, since it would be nice to collect some data from my application using Caliper 😄
Got it! I will try to reproduce your execution using PAPI 6.0 and will share the results here! Thanks for the updates!
