[DYNAREC] Implement perf map #2212

Merged
merged 2 commits into from
Dec 26, 2024


xiangzhai
Contributor

@xiangzhai xiangzhai commented Dec 26, 2024

Hi,

To find the "hot" code emitted by the DynaRec JIT, I profile box64 with perf:

perf record -e cpu-cycles box64 scimark4-x64

scimark: https://math.nist.gov/scimark2/download_c.html

Before this patch there is no perf map, so no symbol names are shown:

Samples: 133K of event 'cpu-cycles:u', Event count (approx.): 66380844254
Overhead  Command       Shared Object          Symbol
   2.38%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c3888
   2.35%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c388c
   1.18%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c62ec
   1.14%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c62fc
   0.96%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c62ac
   0.94%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0c7c
   0.88%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0c6c
   0.82%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0b7c
   0.80%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0b74
   0.78%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0cdc
   0.76%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c60ec
   0.73%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0c74
   0.68%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0b78
   0.62%  scimark4-x64  perf-19878.map         [.] 0x000000fff42bcec0
   0.62%  scimark4-x64  perf-19878.map         [.] 0x000000fff42bceb0
   0.57%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0c78
   0.57%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0c80
   0.56%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c6114
   0.54%  scimark4-x64  perf-19878.map         [.] 0x000000fff42bcef0
   0.52%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c0bf4
   0.52%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c625c
   0.50%  scimark4-x64  perf-19878.map         [.] 0x000000fff42c65c4
...

After generating the perf map, with the x86 instruction name used as the symbol name:

Samples: 133K of event 'cpu-cycles:u', Event count (approx.): 66395424841
Overhead  Command       Shared Object          Symbol
   2.39%  scimark4-x64  perf-20589.map         [.] 0x000000fff306388c
   2.31%  scimark4-x64  perf-20589.map         [.] 0x000000fff3063888
   1.14%  scimark4-x64  perf-20589.map         [.] 0x000000fff30662ec
   1.13%  scimark4-x64  perf-20589.map         [.] 0x000000fff30662fc
   0.95%  scimark4-x64  perf-20589.map         [.] ADD Ed, Ib
   0.93%  scimark4-x64  perf-20589.map         [.] 0x000000fff3060c7c
   0.89%  scimark4-x64  perf-20589.map         [.] 0x000000fff3060c6c
   0.86%  scimark4-x64  perf-20589.map         [.] RET
   0.82%  scimark4-x64  perf-20589.map         [.] 0x000000fff3060b7c
   0.81%  scimark4-x64  perf-20589.map         [.] MOVSD Ex, Gx
   0.78%  scimark4-x64  perf-20589.map         [.] JZ ib
   0.75%  scimark4-x64  perf-20589.map         [.] 0x000000fff3060b74
   0.75%  scimark4-x64  perf-20589.map         [.] 0x000000fff3060c74
   0.67%  scimark4-x64  perf-20589.map         [.] ADDSD Gx, Ex
   0.62%  scimark4-x64  perf-20589.map         [.] SUBSD Gx, Ex
   0.61%  scimark4-x64  perf-20589.map         [.] UNPCKHPD Gx, Ex
   0.60%  scimark4-x64  perf-20589.map         [.] MULSD Gx, Ex
   0.57%  scimark4-x64  perf-20589.map         [.] JZ ib
   0.56%  scimark4-x64  perf-20589.map         [.] ADD Ed, Gd
   0.55%  scimark4-x64  perf-20589.map         [.] 0x000000fff306625c
   0.53%  scimark4-x64  perf-20589.map         [.] ADDSD Gx, Ex
...
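
For readers unfamiliar with the format: a perf map is a plain text file that perf looks up as /tmp/perf-<pid>.map, with one line per JIT region in the form "START SIZE symbolname", where START and SIZE are hex numbers without a 0x prefix. The sketch below only illustrates that layout in Python; it is not the code from this patch (box64 is written in C), and the helper names, the region sizes, and the pairing of address and instruction name are assumptions made for the example.

```python
import io


def format_perf_map_entry(start: int, size: int, name: str) -> str:
    """Format one perf map line: START SIZE symbolname."""
    # START and SIZE are written in hex without a 0x prefix;
    # everything after the second space is taken as the symbol name.
    return f"{start:x} {size:x} {name}\n"


def perf_map_path(pid: int) -> str:
    """Path where perf looks for the map of the process with this pid."""
    return f"/tmp/perf-{pid}.map"


def write_perf_map(entries, stream) -> None:
    """Write (start, size, name) tuples to an open text stream."""
    for start, size, name in entries:
        stream.write(format_perf_map_entry(start, size, name))


# Hypothetical entries: the sizes and the pairing of address and name
# are made up for illustration, not taken from a real box64 run.
entries = [
    (0xFFF3060C7C, 0x10, "ADD Ed, Ib"),
    (0xFFF3060C8C, 0x8, "RET"),
]
buffer = io.StringIO()
write_perf_map(entries, buffer)
print(perf_map_path(20589))
print(buffer.getvalue(), end="")
```

Because the symbol name is simply the rest of the line, names containing spaces such as "ADD Ed, Ib" still show up intact in the report above.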

Please review my patch and give me some suggestions.

Thanks,
Leslie Zhai

@ksco
Collaborator

ksco commented Dec 26, 2024

Thank you! While this is cool, I don't think showing the x86 instruction name is of much use to us. Perf maps are mainly used for profiling guest programs, to make hot code easier to find. Implemented in the JVM or the Python VM, a perf map helps users find hot code in their Java or Python programs, but it is not much help to the JVM developers themselves.

But in any case, I will wait for @ptitSeb to review this.

@ptitSeb
Owner

ptitSeb commented Dec 26, 2024

I would prefer the perf map to show the name of the x64 function (if any), or the x64 address for an anonymous mapping (which could later be named after the mapped file, once that is tracked properly), followed by a simple hex offset for each instruction, instead of the opcode name, which is rather anonymous at this level.
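
One way to read the naming scheme suggested above, as a hedged Python sketch: use the function name plus a hex offset when the x64 address falls inside a known function, and fall back to the raw x64 address otherwise. The function `block_symbol`, its parameters, and the "name+0xoffset" spelling are assumptions for illustration; the thread does not show the format the merged patch actually writes.

```python
from typing import Optional


def block_symbol(x64_addr: int,
                 func_name: Optional[str] = None,
                 func_start: Optional[int] = None) -> str:
    """Build a symbol name for a translated block (hypothetical helper)."""
    if func_name is not None and func_start is not None:
        # Known function: function name plus hex offset of the block inside it.
        return f"{func_name}+0x{x64_addr - func_start:x}"
    # Anonymous mapping: only the x64 address is available.
    return f"0x{x64_addr:x}"


# Hypothetical addresses, chosen only to show both branches.
print(block_symbol(0x401234, "SOR_execute", 0x401200))
print(block_symbol(0x7F0000001000))
```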

If you think you can easily make the change, I can wait. If you prefer this PR to be merged now and to look at a second step with function names later, I can merge it quickly (but indeed, I don't see this feature being really useful in its current state).

@xiangzhai
Contributor Author

> I would prefer the perf-map to represent the name of the x64 function

Yup, the JVM uses a function-like name (method or JVMCI) as well: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/code/codeCache.cpp#L1833
I will change the symbol to the function name.

Thanks,
Leslie Zhai

@xiangzhai
Contributor Author

Now using the function name as the symbol:

Samples: 133K of event 'cpu-cycles:u', Event count (approx.): 66363113293
Overhead  Command       Shared Object          Symbol
   2.42%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bdb88c
   2.27%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bdb888
   1.18%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bde2ec
   1.14%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bde2fc
   0.87%  scimark4-x64  perf-27184.map         [.] SparseCompRow_matmult
   0.84%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.83%  scimark4-x64  perf-27184.map         [.] Random_nextDouble
   0.81%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.80%  scimark4-x64  perf-27184.map         [.] SparseCompRow_matmult
   0.77%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.69%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.68%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8c6c
   0.65%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8cd8
   0.64%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8b74
   0.64%  scimark4-x64  perf-27184.map         [.] FFT_transform_internal
   0.62%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8c74
   0.62%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8c7c
   0.61%  scimark4-x64  perf-27184.map         [.] FFT_transform_internal
   0.61%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bd8b7c
   0.60%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.60%  scimark4-x64  perf-27184.map         [.] SparseCompRow_matmult
   0.57%  scimark4-x64  perf-27184.map         [.] FFT_transform_internal
   0.51%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.50%  scimark4-x64  perf-27184.map         [.] 0x000000fff4bde5e4
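
In the listing above the same function appears on several rows, so the per-function cost has to be summed by hand. A minimal Python sketch for doing that from the text of a report follows; it assumes the five-column layout shown above (overhead, command, shared object, "[.]", symbol), and `overhead_by_symbol` is a name invented for this example, not part of perf or box64.

```python
from collections import defaultdict

# A few rows copied from the listing above.
SAMPLE = """\
Overhead  Command       Shared Object          Symbol
   0.87%  scimark4-x64  perf-27184.map         [.] SparseCompRow_matmult
   0.84%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.83%  scimark4-x64  perf-27184.map         [.] Random_nextDouble
   0.81%  scimark4-x64  perf-27184.map         [.] SOR_execute
   0.80%  scimark4-x64  perf-27184.map         [.] SparseCompRow_matmult
"""


def overhead_by_symbol(report_text: str) -> dict:
    """Sum the overhead percentage of all rows that share a symbol name."""
    totals = defaultdict(float)
    for line in report_text.splitlines():
        # Split into at most 5 fields so symbol names with spaces stay whole.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].endswith("%"):
            continue  # header, summary, or blank line
        totals[fields[4].strip()] += float(fields[0].rstrip("%"))
    return dict(totals)


# Print symbols from the most to the least expensive.
for symbol, percent in sorted(overhead_by_symbol(SAMPLE).items(),
                              key=lambda item: item[1], reverse=True):
    print(f"{percent:5.2f}%  {symbol}")
```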

Please review my patch again.

Thanks,
Leslie Zhai

Collaborator

@ksco ksco left a comment

The PR looks good to me now.

@ksco ksco requested a review from ptitSeb December 26, 2024 13:42
@ptitSeb
Owner

ptitSeb commented Dec 26, 2024

LGTM too.

@ptitSeb ptitSeb merged commit 8971399 into ptitSeb:main Dec 26, 2024
27 checks passed
@xiangzhai
Contributor Author

Thanks 🍻

@xiangzhai xiangzhai deleted the dynarec_perf_map branch December 27, 2024 00:59
3 participants