Adds topdown support for Skylake and Cascade Lake #629
base: master
First of all, thanks for addressing this functionality so quickly @ilumsden! I have started some tests using Caliper on an Intel Xeon Platinum 8260, which is a Cascade Lake processor. I am trying to collect topdown results from a pangenome application (bioinformatics code).
Running with
However, when running with UPDATE: |
That's really weird. I definitely would not expect this to add that much time. There's one thing I can think of checking in the topdown service, but, in the meantime, can you try running with
@ilumsden Yes! Did that, it finished, and this is the output
I tried to change to
Now that's interesting. It looks like there's some issue with recognizing the Caliper option specs for topdown. Let me dig into this.
@jessdagostini could you try rebuilding and re-running with
Also, can you share the output of
Ok! I re-installed Caliper using Spack. I am testing with the small test case just to check if we get outputs.
If I run with
The output of my
That helps a lot. It looks like there's an issue when accessing data from the PAPI service. Most likely, I have a typo in one or more of the counter names. I'll look at that. I'm not sure why
Ok, I have an update. I think my example was too small and there were not enough samples collected.
Interesting. There's still one thing in that output that concerns me.
This output means that it tried to calculate three sets of topdown metrics. Usually, there would be one calculation per Caliper region. However, you only have two regions, so I'm not sure why it tried to do three calculations. @daboehme, do you have any idea what could be going on with this and the
@jessdagostini when you get the chance, could you try re-running with
So the skipped record is probably for the first region begin event in the program, which won't even show up in the profile output. You can safely ignore that. I took out the topdown-counters option recently since we never used it. You can try it manually with a config like so, substituting the list of counters:
The program being stuck is strange indeed. With just two Caliper regions outside of the parallel loop, there should be essentially no overhead. I'm not sure whether the PAPI multiplexing does something weird, though we were able to use it with reasonably long-running programs in the past.
Oh, I didn't realize that about As for the hanging/slowdown, PAPI multiplexing overheads were my first thought too. I'm curious about the logging output just to make sure it's nothing else. @jessdagostini what version of PAPI are you using?
I put a The Caliper logs for this run are
I killed the process just now, and no other logs were added. Indeed, the last thing here is PAPI multiplexing. The version that was installed with the Spack dependencies for Caliper is [email protected]. I am sharing the whole dependency stack of my installation here just in case. All of them are installed using GCC 11.3.0.
Hi @ilumsden
Hey @jessdagostini. Sorry for the delay. I had to shift my focus towards other things for a bit (especially a time-sensitive project where we're trying to ramp up to running on El Cap before it goes into SCF). I should have time to work on Cascade Lake support this week. One quick thing that I just thought of that you could try is upgrading the version of PAPI. 5.7 is a pretty old version (current is in the 7.X series), so I wonder if there was some sort of bug that got fixed in newer versions.
… out how to access UOPS_RETIRED:MACRO_FUSED
@jessdagostini I've done a bit more work on this. I can get the current version to run on LLNL's Ruby system using PAPI 6.0.0.1. The values I'm getting are not correct right now, but I may not be running long enough to get valid data from PAPI. I'll need to look into that before I can determine if the invalid values are due to the runtime being too short or due to bad logic.
No worries! I was just wondering if we could try to look over it again since it will be nice to collect some data from my application using Caliper 😄
This PR is a follow-up to discussions at SC between @jessdagostini and me.
This PR adds topdown support for Intel Skylake (and derivative architectures like Cascade Lake) processors. Support can be enabled by setting `-DWITH_ARCH` to one of the following:
- `"skylake"`
- `"skylake_avx512"`
- `"cascadelake"`

As with the Sapphire Rapids support, users will not need to set this manually when using Spack.
Currently, all of the level 1 and level 2 metrics are available through the `topdown.top` and `topdown.all` runtime configurations. Similar to Sapphire Rapids, users can get the values for the underlying counters used to calculate the topdown metrics using the `topdown-counters.top` and `topdown-counters.all` configurations.
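
For readers checking the raw counter values against the reported metrics, here is a minimal sketch of the level 1 arithmetic as Intel's published top-down method commonly states it for Skylake-family cores. It is not taken from this PR's implementation. The field names follow Intel's generic event names rather than the PAPI counter names Caliper uses, the `Counters` class and `topdown_level1` function are hypothetical helpers, and the counts in the example are made up.

```python
from dataclasses import dataclass

# Assumption: 4 issue slots per cycle, the usual figure for Skylake-family cores.
PIPELINE_WIDTH = 4


@dataclass
class Counters:
    """Raw event counts for one region (hypothetical container)."""
    cpu_clk_unhalted_thread: int
    idq_uops_not_delivered_core: int
    uops_issued_any: int
    uops_retired_retire_slots: int
    int_misc_recovery_cycles: int


def topdown_level1(c: Counters) -> dict:
    """Return the four level 1 fractions of total pipeline slots."""
    slots = PIPELINE_WIDTH * c.cpu_clk_unhalted_thread
    if slots <= 0:
        # A region that ran too briefly may have no usable cycle count.
        raise ValueError("no cycles counted for this region")

    retiring = c.uops_retired_retire_slots / slots
    frontend_bound = c.idq_uops_not_delivered_core / slots
    bad_speculation = (
        c.uops_issued_any
        - c.uops_retired_retire_slots
        + PIPELINE_WIDTH * c.int_misc_recovery_cycles
    ) / slots
    # Backend bound is whatever remains of the slots.
    backend_bound = 1.0 - (retiring + frontend_bound + bad_speculation)

    return {
        "retiring": retiring,
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "backend_bound": backend_bound,
    }


# Made-up counts, only to show the call shape.
example = Counters(
    cpu_clk_unhalted_thread=1000,
    idq_uops_not_delivered_core=400,
    uops_issued_any=2200,
    uops_retired_retire_slots=2000,
    int_misc_recovery_cycles=50,
)
for name, value in topdown_level1(example).items():
    print(f"{name}: {value:.3f}")
```

Because backend bound is computed as the remainder, the four fractions always sum to 1. Fractions that are negative or larger than 1 usually point to counters that were read over too short an interval or scaled poorly under multiplexing, rather than to the formula itself.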