
Adds topdown support for Skylake and Cascade Lake #629

Draft · wants to merge 8 commits into master
Conversation

ilumsden
Contributor

@ilumsden ilumsden commented Dec 2, 2024

This PR is a follow-up to discussions @jessdagostini and I had during SC.

This PR adds topdown support for Intel Skylake (and derivative architectures like Cascade Lake) processors. Support can be enabled by setting -DWITH_ARCH to one of the following:

  • "skylake"
  • "skylake_avx512"
  • "cascadelake"

As with the Sapphire Rapids support, users will not need to set this manually when using Spack.

Currently, all of the level 1 and level 2 metrics are available through the topdown.top and topdown.all runtime configurations. Similar to Sapphire Rapids, users can get the values for the underlying counters used to calculate the topdown metrics using the topdown-counters.top and topdown-counters.all configurations.
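For reference, the level-1 topdown metrics follow the standard Intel Top-down Microarchitecture Analysis (TMA) formulas, computed from the five hardware counters the topdown service programs through PAPI (the counter list appears in the CALI_PAPI_COUNTERS lines later in this thread). A minimal Python sketch of those formulas — the function name and counter values here are made up for illustration and are not Caliper's actual API:

```python
# Sketch of the standard Intel TMA level-1 formulas that the topdown
# service computes on Skylake-class cores. Counter values are invented
# for illustration; real values come from PAPI at runtime.

def topdown_level1(clk_thread, idq_uops_not_delivered, int_misc_recovery,
                   uops_issued, uops_retired):
    """Return the four level-1 topdown fractions."""
    slots = 4.0 * clk_thread  # 4 issue slots per core cycle on Skylake
    frontend_bound = idq_uops_not_delivered / slots
    bad_speculation = (uops_issued - uops_retired
                       + 4.0 * int_misc_recovery) / slots
    retiring = uops_retired / slots
    backend_bound = 1.0 - (frontend_bound + bad_speculation + retiring)
    return {
        "topdown.frontend_bound": frontend_bound,
        "topdown.bad_speculation": bad_speculation,
        "topdown.retiring": retiring,
        "topdown.backend_bound": backend_bound,
    }

metrics = topdown_level1(clk_thread=1_000_000,
                         idq_uops_not_delivered=700_000,
                         int_misc_recovery=25_000,
                         uops_issued=2_200_000,
                         uops_retired=2_000_000)
# The four fractions always sum to 1 by construction.
assert abs(sum(metrics.values()) - 1.0) < 1e-9
```

By construction the four fractions sum to 1, which is a handy sanity check on any topdown.toplevel output.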

@jessdagostini

jessdagostini commented Dec 4, 2024

First of all, thanks for addressing this functionality so quickly @ilumsden!

I have started some tests using Caliper on an Intel Xeon Platinum 8260, which is a Cascade Lake processor. I am trying to collect topdown results from a pangenome application (bioinformatics code).
I have instrumented my code with

int main() {
    read_inputs();

    cali_config_set("CALI_CALIPER_ATTRIBUTE_DEFAULT_SCOPE", "process");
    CALI_MARK_BEGIN("main");
    CALI_MARK_BEGIN("parallel");

    // OpenMP parallel for loop with dynamic scheduling

    CALI_MARK_END("parallel");

    write_extensions();

    CALI_MARK_END("main");
}

Running with CALI_CONFIG='runtime-report' ./exec </path/to/bin/file> </path/to/pangenome/file> <num_threads>, I got this runtime output:

Path       Min time/rank Max time/rank Avg time/rank Time %    
main           11.046940     11.046940     11.046940 13.153955 
  parallel     72.934898     72.934898     72.934898 86.845977

However, running with CALI_CONFIG='runtime-report(topdown.all)' ./exec </path/to/bin/file> </path/to/pangenome/file> <num_threads> takes more than 2 hours (it still hadn't finished while I was writing this message) for the same application and inputs. From what I can observe, even when set to run with 48 threads, the process seems to run sequentially. What intrigues me is that the sequential time for this application and input, without Caliper, is around ~18 s. Am I missing something?

UPDATE:
Even with a smaller input (which takes less than ~2 s to run without Caliper topdown), the analysis still takes a long time and has not finished yet (it has been running for 10+ minutes now).

@ilumsden
Contributor Author

ilumsden commented Dec 4, 2024

That's really weird. I definitely would not expect this to add that much time. There's one thing I can think of checking in the topdown service, but, in the meantime, can you try running with CALI_CONFIG='runtime-report(topdown-counters.all)', @jessdagostini? I'm wondering if this has something to do with PAPI.

@jessdagostini

@ilumsden Yes! I did that, it finished, and this is the output:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: CALI_CONFIG: error: Unknown option: topdown-counters.all
Finished mapping

I tried changing it to CALI_CONFIG='runtime-report(topdown.toplevel)' and this one runs, but it returned:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
Finished mapping
cali-query: TreeFormatter: Attribute "any#any#topdown.retiring" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.backend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.frontend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.bad_speculation" not found.
Path       Min time/rank Max time/rank Avg time/rank Time %   
main            0.000878      0.000878      0.000878 1.620702 
  parallel      0.001954      0.001954      0.001954 3.607063

@ilumsden
Contributor Author

ilumsden commented Dec 4, 2024

Now that's interesting. It looks like there's some issue with recognizing the Caliper option specs for topdown. Let me dig into this.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

@jessdagostini could you try rebuilding and re-running with runtime-report(topdown-counters.all)? When you re-run, please add the environment variable CALI_LOG_VERBOSITY=3. That should produce more debugging output and give me a better sense of what's going on.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Also, can you share the output of spack arch?

@jessdagostini

Ok! I re-installed Caliper using Spack. I am testing with the small test case just to check whether we get output.
It seems topdown-counters.all was not recognized again:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: CALI_CONFIG: error: Unknown option: topdown-counters.all
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: Releasing Caliper global data.
  Max active channels: 0
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 30 nodes
   Metadata memory pool: 1 MiB reserved, 18.166 KiB used
  Thread blackboard: max 0 entries (0% occupancy).
== CALIPER: Finished

If I run with topdown.toplevel, the output gets bigger:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing
== CALIPER: papi: Found 5 event codes for 1 PAPI component(s)
== CALIPER: papi: Creating eventset with 5 events for component 0 (perf_event)
== CALIPER: papi: Initializing multiplex support for component 0 (perf_event)
== CALIPER: runtime-report: Registered papi service
== CALIPER: runtime-report: Registered topdown service. Level: top.
== CALIPER: Configuration:
CALI_AGGREGATE_KEY=
CALI_CHANNEL_FLUSH_ON_EXIT=false
CALI_CHANNEL_CONFIG_CHECK=true
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_EVENT_EXCLUDE_REGIONS=
CALI_EVENT_INCLUDE_BRANCHES=
CALI_EVENT_INCLUDE_REGIONS=
CALI_EVENT_ENABLE_SNAPSHOT_INFO=false
CALI_EVENT_REGION_LEVEL=0
CALI_EVENT_TRIGGER=
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_MPIREPORT_WRITE_ON_FINALIZE=false
CALI_MPIREPORT_LOCAL_CONFIG= let sum#time.duration=scale(sum#time.duration.ns,1e-9),o_a_v.slot=first(aggregate.slot) select  sum(sum#time.duration),any(topdown.retiring) as "Retiring",any(topdown.backend_bound) as "Backend bound",any(topdown.frontend_bound) as "Frontend bound",any(topdown.bad_speculation) as "Bad speculation" group by path aggregate min(o_a_v.slot) order by min#o_a_v.slot
CALI_MPIREPORT_CONFIG= select  min(sum#sum#time.duration) as "Min time/rank",max(sum#sum#time.duration) as "Max time/rank",avg(sum#sum#time.duration) as "Avg time/rank",percent_total(sum#sum#time.duration)  as "Time %",any(any#topdown.retiring) as "Retiring",any(any#topdown.backend_bound) as "Backend bound",any(any#topdown.frontend_bound) as "Frontend bound",any(any#topdown.bad_speculation) as "Bad speculation" group by path aggregate min(min#o_a_v.slot) order by min#min#o_a_v.slot format tree()
CALI_MPIREPORT_APPEND=true
CALI_MPIREPORT_FILENAME=stderr
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=CPU_CLK_THREAD_UNHALTED:THREAD_P,IDQ_UOPS_NOT_DELIVERED:CORE,INT_MISC:RECOVERY_CYCLES,UOPS_ISSUED:ANY,UOPS_RETIRED:RETIRE_SLOTS
CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timer,topdown
CALI_TIMER_INCLUSIVE_DURATION=false
CALI_TOPDOWN_LEVEL=top
== CALIPER: runtime-report: event: Marked attribute mpi.function
== CALIPER: Registered builtin ConfigManager
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: builtin.configmgr: Flushing Caliper data
== CALIPER: runtime-report: Flushing Caliper data
== CALIPER: runtime-report: Aggregate: flushed 3 snapshots.
cali-query: TreeFormatter: Attribute "any#any#topdown.retiring" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.backend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.frontend_bound" not found.
cali-query: TreeFormatter: Attribute "any#any#topdown.bad_speculation" not found.
Path       Min time/rank Max time/rank Avg time/rank Time %   
main            0.001979      0.001979      0.001979 3.489282 
  parallel      0.001901      0.001901      0.001901 3.351880 
== CALIPER: CaliperMetadataDB: stored 86 nodes, 59 strings.
== CALIPER: Releasing channel builtin.configmgr
== CALIPER: builtin.configmgr: Finishing mpi service
== CALIPER: Releasing channel runtime-report
== CALIPER: runtime-report: Aggregate: Releasing aggregation DB.
  max hash len: 1, 4 entries, 24 kernels, 1.5 MiB reserved.
== CALIPER: runtime-report: Finishing mpi service
== CALIPER: runtime-report: papi: Finishing
== CALIPER: runtime-report: papi: Created 1 PAPI event set(s) on 1 thread(s).
== CALIPER: papi: Shutdown
== CALIPER: runtime-report: topdown: Computed topdown metrics for 0 records, skipped 3
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      0 computed, 3 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.
== CALIPER: builtin.configmgr channel blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: runtime-report channel blackboard: max 1 entries (0.0979432% occupancy).
== CALIPER: Releasing Caliper global data.
  Max active channels: 1
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 139 nodes
   Metadata memory pool: 1 MiB reserved, 20.1582 KiB used
  Thread blackboard: max 3 entries (0.29383% occupancy).
== CALIPER: Finished

The output of my spack arch is linux-ubuntu22.04-cascadelake

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

That helps a lot. It looks like there's an issue when accessing data from the PAPI service. Most likely, I have a typo in one or more of the counter names. I'll look into that.

I'm not sure why topdown-counters.all is not working though.

@jessdagostini

Ok, I have an update. I think my example was too small, so not enough data was collected.
If I run the full test case using topdown.toplevel, I get results:

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing
== CALIPER: papi: Found 5 event codes for 1 PAPI component(s)
== CALIPER: papi: Creating eventset with 5 events for component 0 (perf_event)
== CALIPER: papi: Initializing multiplex support for component 0 (perf_event)
== CALIPER: runtime-report: Registered papi service
== CALIPER: runtime-report: Registered topdown service. Level: top.
== CALIPER: Configuration:
CALI_AGGREGATE_KEY=
CALI_CHANNEL_FLUSH_ON_EXIT=false
CALI_CHANNEL_CONFIG_CHECK=true
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_EVENT_EXCLUDE_REGIONS=
CALI_EVENT_INCLUDE_BRANCHES=
CALI_EVENT_INCLUDE_REGIONS=
CALI_EVENT_ENABLE_SNAPSHOT_INFO=false
CALI_EVENT_REGION_LEVEL=0
CALI_EVENT_TRIGGER=
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_MPIREPORT_WRITE_ON_FINALIZE=false
CALI_MPIREPORT_LOCAL_CONFIG= let sum#time.duration=scale(sum#time.duration.ns,1e-9),o_a_v.slot=first(aggregate.slot) select  sum(sum#time.duration),any(topdown.retiring) as "Retiring",any(topdown.backend_bound) as "Backend bound",any(topdown.frontend_bound) as "Frontend bound",any(topdown.bad_speculation) as "Bad speculation" group by path aggregate min(o_a_v.slot) order by min#o_a_v.slot
CALI_MPIREPORT_CONFIG= select  min(sum#sum#time.duration) as "Min time/rank",max(sum#sum#time.duration) as "Max time/rank",avg(sum#sum#time.duration) as "Avg time/rank",percent_total(sum#sum#time.duration)  as "Time %",any(any#topdown.retiring) as "Retiring",any(any#topdown.backend_bound) as "Backend bound",any(any#topdown.frontend_bound) as "Frontend bound",any(any#topdown.bad_speculation) as "Bad speculation" group by path aggregate min(min#o_a_v.slot) order by min#min#o_a_v.slot format tree()
CALI_MPIREPORT_APPEND=true
CALI_MPIREPORT_FILENAME=stderr
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=CPU_CLK_THREAD_UNHALTED:THREAD_P,IDQ_UOPS_NOT_DELIVERED:CORE,INT_MISC:RECOVERY_CYCLES,UOPS_ISSUED:ANY,UOPS_RETIRED:RETIRE_SLOTS
CALI_SERVICES_ENABLE=aggregate,event,mpi,mpireport,timer,topdown
CALI_TIMER_INCLUSIVE_DURATION=false
CALI_TOPDOWN_LEVEL=top
== CALIPER: runtime-report: event: Marked attribute mpi.function
== CALIPER: Registered builtin ConfigManager
Finished mapping
== CALIPER: Finalizing ... 
== CALIPER: builtin.configmgr: Flushing Caliper data
== CALIPER: runtime-report: Flushing Caliper data
== CALIPER: runtime-report: Aggregate: flushed 3 snapshots.
Path       Min time/rank Max time/rank Avg time/rank Time %    Retiring Backend bound Frontend bound Bad speculation 
main           11.536610     11.536610     11.536610 20.660444 0.540284      0.230441       0.179589        0.049686 
  parallel     44.253396     44.253396     44.253396 79.251601 0.424175      0.235877       0.251795        0.088152 
== CALIPER: CaliperMetadataDB: stored 107 nodes, 75 strings.
== CALIPER: Releasing channel builtin.configmgr
== CALIPER: builtin.configmgr: Finishing mpi service
== CALIPER: Releasing channel runtime-report
== CALIPER: runtime-report: Aggregate: Releasing aggregation DB.
  max hash len: 1, 4 entries, 24 kernels, 1.5 MiB reserved.
== CALIPER: runtime-report: Finishing mpi service
== CALIPER: runtime-report: papi: Finishing
== CALIPER: runtime-report: papi: Created 1 PAPI event set(s) on 1 thread(s).
== CALIPER: papi: Shutdown
== CALIPER: runtime-report: topdown: Computed topdown metrics for 2 records, skipped 1
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      2 computed, 1 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.
== CALIPER: builtin.configmgr channel blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: runtime-report channel blackboard: max 1 entries (0.0979432% occupancy).
== CALIPER: Releasing Caliper global data.
  Max active channels: 1
== CALIPER: Process blackboard: max 2 entries (0.195886% occupancy).
== CALIPER: Releasing Caliper thread data: 
  Metadata tree: 1 blocks, 139 nodes
   Metadata memory pool: 1 MiB reserved, 20.1582 KiB used
  Thread blackboard: max 3 entries (0.29383% occupancy).
== CALIPER: Finished

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Interesting. There's still one thing in that output that concerns me.

== CALIPER: runtime-report: topdown: Computed topdown metrics for 2 records, skipped 1
== CALIPER: runtime-report: topdown: Records processed per topdown level: 
  top:      2 computed, 1 skipped,
  bad spec: 0 computed, 0 skipped,
  frontend: 0 computed, 0 skipped,
  backend:  0 computed, 0 skipped.

This output means that it tried to calculate three sets of topdown metrics. Usually, there would be one calculation per Caliper region. However, you only have two regions, so I'm not sure why it tried to do three calculations.

@daboehme, do you have any idea what could be going on with this and the topdown-counters thing?

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

@jessdagostini when you get the chance, could you try re-running with topdown.all and CALI_LOG_VERBOSITY=3? I'm still not entirely sure why you were getting such high runtimes with topdown.all, but the extra logging should help figure out the issue. If it's taking longer than ~5 minutes when you run it, just kill the program and share whatever you got from Caliper's logging.

@daboehme
Member

daboehme commented Dec 5, 2024

@daboehme, do you have any idea what could be going on with this and the topdown-counters thing?

So the skipped record is probably for the first region begin event in the program, which won't even show up in the profile output. You can safely ignore that.

I took out the topdown-counters option recently since we never used it. You can try it manually with a config like this, substituting in the list of counters:

CALI_SERVICES_ENABLE=event,trace,recorder,papi,timer
CALI_PAPI_ENABLE_MULTIPLEXING=true
CALI_PAPI_COUNTERS=(the list of counters)
CALI_LOG_VERBOSITY=3

The program being stuck is strange indeed. With just two Caliper regions outside of the parallel loop, there should be essentially no overhead. I'm not sure if the PAPI multiplexing is doing something weird, though we were able to use it with reasonably long-running programs in the past.

@ilumsden
Contributor Author

ilumsden commented Dec 5, 2024

Oh, I didn't realize that about topdown-counters. I can remove those.

As for the hanging/slowdown, PAPI multiplexing overheads were my first thought too. I'm curious about the logging output just to make sure it's nothing else.

@jessdagostini what version of PAPI are you using?

@jessdagostini

jessdagostini commented Dec 5, 2024

I started a topdown.all trial last night before leaving the lab. To my surprise, it was still running when I arrived today.

The Caliper logs for this run are

Reading seeds
Reading GBZ
Starting mapping with 512 batch size
48
== CALIPER: Available services: aggregate,alloc,async_event,cpuinfo,debug,env,event,io,loop_monitor,loop_statistics,memstat,mpi,mpiflush,mpireport,papi,pthread,recorder,region_monitor,report,sampler,statistics,sysalloc,textlog,timer,timeseries,timestamp,topdown,trace,validator
== CALIPER: Initialized
== CALIPER: No manual config specified, disabling default channel
== CALIPER: Creating channel builtin.configmgr
== CALIPER: builtin.configmgr: mpiwrap: Using GOTCHA wrappers.
== CALIPER: builtin.configmgr: Registered MPI service
== CALIPER: builtin.configmgr: Registered mpiflush service
== CALIPER: Configuration:
CALI_CHANNEL_FLUSH_ON_EXIT=true
CALI_CHANNEL_CONFIG_CHECK=false
CALI_CONFIG_FILE=caliper.config
CALI_CONFIG_PROFILE=default
CALI_MPI_MSG_PATTERN=false
CALI_MPI_MSG_TRACING=false
CALI_MPI_WHITELIST=
CALI_MPI_BLACKLIST=
CALI_SERVICES_ENABLE=mpi,mpiflush
== CALIPER: Creating channel runtime-report
== CALIPER: runtime-report: Registered aggregation service
== CALIPER: runtime-report: event: Using region level 0
== CALIPER: runtime-report: event: Marked attribute comm.region
== CALIPER: runtime-report: event: Marked attribute loop
== CALIPER: runtime-report: event: Marked attribute phase
== CALIPER: runtime-report: event: Marked attribute region
== CALIPER: runtime-report: Registered event trigger service
== CALIPER: runtime-report: mpiwrap: Using GOTCHA wrappers.
== CALIPER: runtime-report: Registered MPI service
== CALIPER: runtime-report: Registered mpireport service
== CALIPER: runtime-report: Registered timer service
== CALIPER: papi: Enabling multiplexing

I killed the process just now, and no other log lines were added. Indeed, the last thing here is PAPI multiplexing. The PAPI version that was installed as a Spack dependency of Caliper is papi@5.7. I am sharing the whole dependency stack of my installation here just in case. Everything is built with GCC 11.3.0.

@jessdagostini

jessdagostini commented Jan 10, 2025

Hi @ilumsden
Do you have any updates on this? Is there any other information I can share to help? :)

@ilumsden
Contributor Author

Hey @jessdagostini. Sorry for the delay. I had to shift my focus to other things for a bit (especially a time-sensitive project where we're ramping up to run on El Cap before it goes into SCF). I should have time to work on Cascade Lake support this week.

One quick thing that I just thought of that you could try is upgrading the version of PAPI. 5.7 is a pretty old version (current is in the 7.X series), so I wonder if there was some sort of bug that got fixed in newer versions.

@ilumsden
Contributor Author

@jessdagostini I've done a bit more work on this. I can get the current version to run on LLNL's Ruby system using PAPI 6.0.0.1. The values I'm getting are not correct right now, but I may not be running long enough to get valid data from PAPI. I'll need to look into that before I can determine whether the invalid values are due to the runtime being too short or due to bad logic.

@jessdagostini

No worries! I was just wondering if we could look at it again, since it would be nice to collect some data from my application using Caliper 😄
Got it! I will try to reproduce your execution using PAPI 6.0 and will share the results here! Thanks for the updates!
