Add a performance measuring top-level user guide page #10539
Conversation
Thanks a bunch for starting this. Several things:
`ghc-prof-options: -fprof-auto -fno-prof-count-entries -fprof-auto-calls`
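For context, a sketch of where such a field sits in a `.cabal` file; the executable name `my-app` and the module layout are placeholders, not taken from the PR:

```console
$ cat my-app.cabal
...
executable my-app
  main-is:          Main.hs
  build-depends:    base
  default-language: Haskell2010
  -- Extra GHC flags, applied only when profiling is enabled:
  ghc-prof-options: -fprof-auto -fno-prof-count-entries -fprof-auto-calls
```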
Thanks for the quick review @Kleidukos and @geekosaur. I've added all your review remarks. Yes, the JSON output can be loaded into speedscope directly. This https://github.com/mpickering/hs-speedscope only deals with eventlog files.
@malteneuss Much better, only two more changes to make and we should be good to go! :)
My understanding is that eventlog profiling is far superior to just `-pj`, and that (for recent GHCs) it is just a matter of passing `-l -p` instead (or `-l-agu -p` if a smaller eventlog is wanted). I have personally never used the JSON profiling report.
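For illustration, a sketch of that invocation, assuming an executable named `my-app` that was built with profiling enabled and linked with `-rtsopts` (so the RTS accepts these flags):

```console
$ cabal run --enable-profiling exe:my-app -- +RTS -l -p -RTS
$ # or, for a smaller eventlog:
$ cabal run --enable-profiling exe:my-app -- +RTS -l-agu -p -RTS
```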
@jasagredo Do you have a link to a post or blog that discusses the differences? I have never used it. Would you like to add a second section showing how to use it in another MR? I think we should add another section for profiling memory, where eventlog should be discussed after all.
Thank you for taking this on! It's amazing to see efforts in expanding the guide section! I have some critique below, but please realize that documentation is a matter where reasonable people can disagree. I can live with the current version, especially given that you already got the mandatory two approvals. IMO the text uses a…
I propose solving most (all) of these with the following reformatting.
I'm temporarily blocking it to make sure you have enough time to respond.
Re-reading the document again, I am unsure this describes Cabal's job in profiling. I think the section should be more about what options Cabal needs to enable profiling than about how to produce a profiling report, so I think the title is misleading and we are stepping into GHC's User Guide territory. My suggestion would be to leave most of the description and RTS options to the GHC User Guide and describe only the following in the text:
This way the choice of `-p` vs `-pj`, what files are produced, and where to load them is deferred to the User Guide, as that is GHC-specific business, and we don't mix "How to profile" with "How to configure for profiling". The latter is Cabal's job, but the former is GHC or general Haskell information which, I think, should not live in Cabal's docs. Whatever you can say there will always be either too specific or too general, and it might depend on GHC versions and so on.
OTOH, for some users a full example session of profiling would be immensely valuable and precisely what they are looking for when, in despair, they reach even for the cabal documentation. So maybe at least link to the original Discourse post, saying it contains practical examples, including how to tweak GHC to profile best, which is out of scope for the cabal guide?
I think this deserves a place in something like https://haskell.foundation/hs-opt-handbook.github.io, not in cabal's documentation. Cabal docs should talk about how to configure cabal for the different profiling options, but not about how to profile or interpret the results.
@jasagredo I disagree with part of what you said. The cabal manual should describe how to operate cabal. Interpretation of the results certainly deserves to be centralised in the optimisation handbook, but producing a profile is certainly well within what one could expect of the cabal manual.
I think there is a hope that Cabal's docs could tell you how to do everything, but IMHO that is not how it should be. For example, Rust has usually had much more comprehensible documentation than Haskell, and they usually made very good choices in my opinion. The equivalent to what we are discussing here is this page of "The Cargo Book", https://doc.rust-lang.org/cargo/reference/profiles.html, which describes the options that exist to customize optimization or debug information. It then defers most of the information to "The rustc book", which describes how each option works. How to do profiling is covered in "The Rust Performance Book", https://nnethercote.github.io/perf-book/profiling.html, which links to different tools for producing and analyzing profiles.

Why is this different in Haskell? Because Rust performance can be analyzed with standard tools, whereas in Haskell it is the GHC RTS that produces its own reports, and third-party tools that interpret those reports in different ways. GHC's User Guide already explains how to produce reports and gives a brief outline of how to analyze them, but it is mostly third-party tools that consume those reports (profiteur, eventlog2html, hp2pretty/hp2ps, speedscope, ...). It is therefore reasonable for each tool to explain its own business, and to have a central place that outlines the overall process, which I think is the Haskell Optimization Handbook.

In any case, documentation on both producing and analyzing profiles lives outside of "The Cargo Book" in the Rust ecosystem, which I think is the right choice, and as such I would argue we could do the same and leave this outside of the cabal documentation.
Maybe I should re-phrase my suggestion after my wall of text above (sorry for that):
Here's the second proposal, streamlined along the lines of @ulysses4ever's suggestions. I don't mention [...]. I'm with @Mikolaj and @tchoutri on having a full example session (if I need such a guide, I don't want to scramble snippets together from different places) and on Cabal being the main interface where we configure things here. However, I now emphasize that Cabal does only the configuring part and GHC the actual work, and I mention the optimization handbook.
Correct. An even simpler rule of thumb is: `cabal.project` holds options that are useful to store in Git, and `cabal.project.local` is for local experiments (like profiling) that, in general, shouldn't be committed to Git.
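A sketch of how that split can look in practice; the concrete values are only examples:

```console
$ cat cabal.project
-- committed to Git: shared, stable settings
packages: .

$ cat cabal.project.local
-- not committed: a local profiling experiment, e.g. written by hand
-- or by `cabal configure --enable-profiling`
profiling: True
profiling-detail: toplevel-functions
```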
As this is going forward anyway, let me add some suggestions to improve it. I won't "Request changes", so feel free to disregard my comments.
Finally, a profiling JSON report is written to a ``<app-name>.prof`` file,
i.e. ``my-app.prof``, in the current directory.
Load the profiling report file ``my-app.prof`` into a visualizer
and look for performance bottlenecks. One popular open-source
`flame graph <https://www.brendangregg.com/flamegraphs.html>`__
visualizer is
`Speedscope <https://speedscope.app>`__,
which runs in the browser and can open this JSON file directly.
I would mention something like:
- `-pj` produces JSON output; it can be visualized in speedscope.
- `-p` produces GHC's own `.prof` format; it can be visualized in profiteur or ghcprofview.
- `-l -p` produces an eventlog that can be visualized in speedscope by first converting it via hs-speedscope.
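Concretely, the three variants might be invoked roughly like this (a sketch; `my-app` is a placeholder, and the binary needs profiling enabled plus `-rtsopts` for the RTS flags to be accepted):

```console
$ cabal run --enable-profiling exe:my-app -- +RTS -pj -RTS    # writes my-app.prof (JSON), for speedscope
$ cabal run --enable-profiling exe:my-app -- +RTS -p -RTS     # writes my-app.prof (GHC format), for profiteur/ghcprofview
$ cabal run --enable-profiling exe:my-app -- +RTS -l -p -RTS  # writes my-app.eventlog (plus my-app.prof); convert via hs-speedscope
```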
Also, I'm not 100% sure of what I'm about to say, but my understanding is that `-p` and `-pj` show total time spent, and even in speedscope you see totals. One can see how much time a function took, but not when.

I think the eventlog path shows totals but also shows when things happened, i.e. the information will be interleaved with GCs or context switches, and split by capabilities. IMHO this is more useful and the thing I usually use.
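As a hedged sketch of that eventlog route (the hs-speedscope command line and output file name below are assumptions, so double-check against its README):

```console
$ cabal run --enable-profiling exe:my-app -- +RTS -l -p -RTS
$ hs-speedscope my-app.eventlog   # assumed CLI: converts the eventlog for speedscope
$ # then load the resulting JSON file into https://speedscope.app
```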
Added the three examples, although I think you might have meant `-l-p`, because `-l -p` (with a space in between) just produces two separate files, `.eventlog` and `.prof`.
Co-authored-by: Javier Sagredo <[email protected]>
I addressed the review remarks, so please have a look at it again.
Another step of the user guide improvement initiative #9214:
Add a simple profiling user-guide page based on https://discourse.haskell.org/t/ghc-profiling-a-cabal-project-with-an-interactive-application/10465/2?u=malteneuss, which the author allowed us to use.
Feel free to modify it yourself if that is faster.