
driver: Record execution time of every command #978

Merged (15 commits) into ocaml:master on Oct 2, 2023
Conversation

@Julow (Collaborator) commented Jun 26, 2023

Record the time taken by every command and save the result in current-bench's JSON format.

The running time of every sub-command is recorded (min, max, avg), along with the total number of times it is called.

Most of the diff is due to more recent dependencies; the code computing the metrics is at the bottom.

The benchmark results look like this (but all on one line):

{
    "name": "odoc",
    "results": [
        {
            "name": "driver.mld",
            "metrics": [
                {
                    "name": "total-compile",
                    "value": 523,
                    "description": "Number of time 'odoc compile' has run."
                },
                {
                    "name": "time-compile",
                    "value": {
                        "min": 0.00010799999999988596,
                        "max": 0.00363099999999994,
                        "avg": 0.0001435755258126284
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc compile'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-compile-deps",
                    "value": 492,
                    "description": "Number of time 'odoc compile-deps' has run."
                },
                {
                    "name": "time-compile-deps",
                    "value": {
                        "min": 0.00012300000000009526,
                        "max": 0.0014290000000003467,
                        "avg": 0.00019406300813008646
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc compile-deps'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-link",
                    "value": 524,
                    "description": "Number of time 'odoc link' has run."
                },
                {
                    "name": "time-link",
                    "value": {
                        "min": 0.00010399999999943788,
                        "max": 0.003881999999999941,
                        "avg": 0.0002171736641221395
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc link'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-html-generate",
                    "value": 524,
                    "description": "Number of time 'odoc html-generate' has run."
                },
                {
                    "name": "time-html-generate",
                    "value": {
                        "min": 0.00010300000000018628,
                        "max": 0.003909000000000162,
                        "avg": 0.00021438167938932483
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc html-generate'",
                    "trend": "lower-is-better"
                }
            ]
        }
    ]
}
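
For readers curious how such per-command metrics can be collected, here is a minimal OCaml sketch (not the PR's actual code; every name below is hypothetical): each sub-command invocation is timed with Unix.gettimeofday and folded into a running min/max/avg per command.

(* Sketch of per-command timing stats; all names are hypothetical. *)
type stats = { count : int; total : float; min : float; max : float }

let empty = { count = 0; total = 0.; min = infinity; max = neg_infinity }

let add t s =
  { count = s.count + 1;
    total = s.total +. t;
    min = Float.min t s.min;
    max = Float.max t s.max }

let avg s = if s.count = 0 then 0. else s.total /. float_of_int s.count

(* One stats entry per sub-command, e.g. "compile", "link". *)
let tbl : (string, stats) Hashtbl.t = Hashtbl.create 16

let timed cmd f =
  let t0 = Unix.gettimeofday () in
  let r = f () in
  let t = Unix.gettimeofday () -. t0 in
  Hashtbl.replace tbl cmd
    (add t (Option.value (Hashtbl.find_opt tbl cmd) ~default:empty));
  r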

Problems:

  • Are the times correct? Summing up all the numbers gives a total running time of 0.5s, while it takes 26s to run @docgen on my machine.
    I know Mdx is slow to run bytecode and to process the output of top-level blocks, but that's extreme.

  • dune build @bench fails on Dune 3.7.1 with this message, even though the documentation says that <file1> doesn't have to exist:

    Error: Unable to resolve symlink _build/default/driver-benchmarks.json

@jonludlam (Member)

The original reason for adding in the extra libraries for the benchmark was that core/core_kernel is considerably more complex than base, and therefore provided some more coverage of odoc's code. I'm a bit reluctant to remove it without a solid reason for doing so.

@jonludlam (Member)

There are a few other annoyances associated with the driver as-is, not things you're introducing in this commit, obviously:

  • we capture the details of the opam switch in the output. We could possibly do a substitution in the logs and put in $(opam config var lib) where we've got the opam lib dir (see the sketch below)? Though we'd need to make the paths absolute rather than relative for that to work.
  • we should pin the exact versions of the libraries being used for the driver, which would help make the output reproducible. This will be a bit weird since we're not making these constraints part of the opam file, but for keeping the docs site consistent it'd be useful.
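
A rough sketch of the substitution idea, purely illustrative (the helper, its name, and its behaviour are assumptions, not code from this PR):

(* Hypothetical sketch of the suggested log substitution: replace the
   absolute opam lib directory with a stable placeholder. Requires the
   str and unix libraries. *)
let scrub_opam_lib lines =
  let ic = Unix.open_process_in "opam config var lib" in
  let lib = input_line ic in
  ignore (Unix.close_process_in ic);
  List.map
    (fun l ->
      Str.global_replace (Str.regexp_string lib) "$(opam config var lib)" l)
    lines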

@Julow (Collaborator, Author) commented Jun 29, 2023

I don't believe that the quality would increase if it ran for longer. It would have a huge input and output; I'm worried that we would only be measuring the file system caches.
The main goal of this benchmark is to passively measure the evolution of performance over the long term. I'm worried about wasted resources.

To increase the quality of the results, we could perhaps do a warm-up run? That is not reasonable with a very long benchmark.

I don't think coverage is important when everything is averaged, especially if the atypical code is so much bigger. Perhaps we could make a separate benchmark for Core alone, so that we measure only the atypical code?

@jonludlam (Member)

It's not to do with having longer runs; it's to do with the complexity of core vs base. The point of the benchmark is precisely to measure the atypical code, since it's pretty easy for regressions to creep in there unnoticed. Some early versions of odoc ran quite quickly on base and took many minutes to run on core_kernel, and I do actually run this benchmark test whenever I'm working on the performance-sensitive bits of odoc.

@Julow (Collaborator, Author) commented Jul 3, 2023

I did not realize it was used; I don't want to remove it, then. Reverted.

@Julow (Collaborator, Author) commented Jul 3, 2023

we should pin the exact versions of the libraries being used for the driver, which would help make the output reproducible. This will be a bit weird since we're not making these constraints part of the opam file, but for keeping the docs site consistent it'd be useful.

To avoid the maintenance cost, I suggest removing the logs of commands.
They could be replaced by logging only the commands affecting Odoc's modules, with options such as the search path removed.

@jonludlam (Member)

We're using this driver for a few different use cases now: benchmarking, testing, and creating the odoc website. I quite like having the concrete commands recorded for the purposes of the website, but it definitely gets in the way of day-to-day testing. We've got the env var turning the extra dependencies on and off; we could perhaps use the same mechanism to turn the logging of the commands on and off. E.g. ODOC_DRIVER_MODE could be BENCH or TEST or DOC or something?
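
A minimal sketch of what that switch could look like, assuming the suggested ODOC_DRIVER_MODE variable and mode names (nothing below is in the PR as merged):

(* Hypothetical sketch of the suggested ODOC_DRIVER_MODE switch. *)
type mode = Bench | Test | Doc

let mode =
  match Sys.getenv_opt "ODOC_DRIVER_MODE" with
  | Some "BENCH" -> Bench
  | Some "DOC" -> Doc
  | _ -> Test

(* Only record the concrete shell invocations when building the website. *)
let log_command cmd = if mode = Doc then print_endline cmd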

@Julow (Collaborator, Author) commented Jul 3, 2023

I'm not a fan of making the output even more non-deterministic. The list of commands is huge. Do you expect anyone to use it?

Ideally, we'd have a different instance for each purpose: one for current-bench, one for building core, and one for the doc.
We could do that with Mdx file includes, which, independently of the pros and cons, would allow using the same functions from different md files.

@jonludlam (Member)

The list of commands is huge. Do you expect anyone to use it?

Well, yes. I used it myself a few weeks back when doing the voodoo prototype for source rendering. It's way easier to read the actual shell invocations of odoc than to simply read the function that assembles the command line.

We could do that with Mdx file includes, which, independently of the pros and cons, would allow using the same functions from different md files.

The thing that makes this a struggle is that the main use of this code is to be very clear about how to use odoc, so I'm not keen on splitting it up into multiple files; it makes for trickier reading. Adding some straightforward logic to log or not won't make the code any harder to read.

I'm not a fan of making the output even more non-deterministic.

Surely this makes it more deterministic!?

@Julow (Collaborator, Author) commented Jul 4, 2023

so I'm not keen on splitting it up into multiple files

Mdx has a way to include code from different parts of one .ml file (with delimiting comments). The documentation would stay the same.

Would you mind if I try?
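
For context, Mdx's part mechanism marks regions of the .ml file with special comments, roughly like this (a sketch from memory of Mdx's documentation; double-check the exact directive syntax):

(* $MDX part-begin=compile *)
let compile_all compile files = List.iter compile files
(* $MDX part-end *)

(* A code block in the document can then pull in just this region via a
   file/part directive, e.g. file=driver.ml,part=compile. *)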

Adding some straightforward logic to log or not won't make the code any harder to read.

I slightly disagree. Adding any amount of code makes it harder to read.

Surely this makes it more deterministic!?

Sure, but that doesn't solve any problem. There will always be a huge diff, and the exit status will always be an error.

@jonludlam (Member)

Sure, but that doesn't solve any problem. There will always be a huge diff, and the exit status will always be an error.

Ah, what I meant was that we would turn everything off for the common case of testing, and only turn the logging on when we're building in order to publish the docs. That way there would be no logs in the driver in the source tree, and no changes to promote when you build it locally.

@Julow (Collaborator, Author) commented Jul 5, 2023

and only turn the logging on when we're building in order to publish the docs.

I don't like that the publish case is the one with an error status, but this sounds like a good compromise.

@panglesd (Collaborator)

Would it be possible (and not too complicated) to extend the current PR to also record the size of the produced files?

I think it would be an interesting metric!

@Julow (Collaborator, Author) commented Sep 5, 2023

I removed the list of executed commands but kept the code for printing it; it can be printed back by un-commenting a line.
I implemented the file size metrics.
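
A minimal sketch of the kind of size measurement this involves, with hypothetical names (not the PR's exact code): stat each recorded output file and skip any that are missing.

(* Hypothetical sketch: collect the size of each recorded output file,
   silently skipping files that cannot be stat'ed. (A review comment
   below suggests at least reporting such errors.) *)
let sizes_of_outputs files =
  List.filter_map
    (fun path ->
      match Unix.stat path with
      | st -> Some (path, st.Unix.st_size)
      | exception Unix.Unix_error _ -> None)
    files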

@panglesd (Collaborator) left a review comment:

I think that current-bench requires a Makefile as it runs make bench to run the benchmark.
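
For reference, the target current-bench invokes could be sketched like this; the contents are an assumption based on the transcript later in this thread (Makefile lines 21-22 and the final cat), not a copy of the PR's Makefile:

# Sketch of a plausible `make bench` target (assumed, see above).
# current-bench runs `make bench` and reads the JSON from stdout.
# (Recipe lines must be tab-indented in a real Makefile.)
bench:
	-dune build @bench
	dune promote driver-benchmarks.json
	cat driver-benchmarks.json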

Review comment on dune (outdated):
(rule
(alias bench)
(action
(diff driver-benchmarks.json doc/driver-benchmarks.json)))

This creates an error when driver-benchmarks.json is not already present at the root of the project (which is the case in the repo)... maybe generate an empty one if there is none?

@panglesd (Collaborator) commented Sep 6, 2023

When running dune build @docgen, I randomly (and more often than not) get the following error:

-]}
+]}```mdx-error
+Exception:
+Failure
+ "run ['../src/odoc/bin/main.exe' 'compile' 'odoc.mld' '-I' '.' '-o'\n     './page-odoc.odoc' '--child' 'page-\"ocamldoc_differences\"' '--child'\n     'page-\"dune\"' '--child' 'page-\"odoc_for_authors\"' '--child'\n     'page-\"interface\"' '--child' 'page-\"features\"' '--child'\n     'page-\"parent_child_spec\"' '--child' 'page-\"driver\"' '--child'\n     'page-\"interface\"' '--child' 'page-\"odoc_examples\"' '--child'\n     'page-\"odoc_document\"' '--child' 'page-\"odoc_html\"' '--child'\n     'page-\"odoc_latex\"' '--child' 'page-\"odoc_loader\"' '--child'\n     'page-\"odoc_manpage\"' '--child' 'page-\"odoc_model\"' '--child'\n     'page-\"odoc_model_desc\"' '--child' 'page-\"odoc_html_support_files\"'\n     '--child' 'page-\"odoc_odoc\"' '--child' 'page-\"odoc_xref2\"' '--child'\n     'page-\"odoc_xref_test\"' '--child' 'page-deps' '--child' 'src-source']: No such file or directory".
+```
+

If I cd into _build/default/doc and run the command, it works well...

Any idea where this could come from? I would say some dune rule where something gets executed asynchronously due to a wrongly computed dependency, but I cannot find such a problem in your rules...
(the problem does not show up in master)

(Resolved review thread on doc/dune, outdated.)
@panglesd (Collaborator) left a review:
Looks really cool! Can't wait to see the graphs ^^

Almost ready to merge; it remains to:

  • add the dependency in the dune file for the bench rule
  • add make bench, and make it output the JSON on standard output!
  • use Unix.gettimeofday, or at least understand why there is such a difference... (see the sketch after this list)
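
On that last point, one plausible explanation (an assumption, not something established in this thread): a CPU-time clock such as Sys.time only counts processor time spent in the calling process, so wall-clock time spent waiting on child processes is invisible to it, while Unix.gettimeofday measures elapsed wall-clock time. A tiny sketch of the difference:

(* Sketch: CPU time vs wall-clock time around a child process. The sleep
   happens in a child, so Sys.time barely moves while Unix.gettimeofday
   sees the full second. *)
let () =
  let c0 = Sys.time () and w0 = Unix.gettimeofday () in
  ignore (Sys.command "sleep 1");
  Printf.printf "cpu: %.3fs  wall: %.3fs\n"
    (Sys.time () -. c0)
    (Unix.gettimeofday () -. w0)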

What would you think about plotting the compile and link values in the same graph? That is, one graph containing both the compile and link values, per {size; time; number of files}. It reduces the number of graphs and lets you compare compile and link more easily.

To plot multiple values in the same graph, you can use a common prefix followed by a slash. For instance, in the example linked above, the times taken for typing, parsing, etc. are shown in the same graph because of the ocaml/ prefix.

Also, what about plotting the total space/time used by compile and by link?


About the removal of the list of executed commands, I don't have a strong opinion. I find them sometimes useful when debugging, but it's very annoying that their output depends on the machine. It makes the file much bigger and generates huge diffs.
I'm sure there is a better solution than commenting/uncommenting a line (maybe output them to another file that is ignored by git?).
However, I think it is already an improvement, and I want the rest of the PR merged, so I'm fine with it!

$ '../src/odoc/bin/main.exe' 'html-generate' 'int_replace_polymorphic_compare.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
$ '../src/odoc/bin/main.exe' 'html-generate' 'inlining_transforms.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
$ '../src/odoc/bin/main.exe' 'html-generate' 'inlining_decision_intf.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
$ '../src/odoc/bin/main.exe' 'html-g
@panglesd (Collaborator):

Errors in the computation of the size are silently ignored. Maybe output them on the standard output, so that the user knows something has happened?

@panglesd (Collaborator):

Hmm, something weird happened with the locations of the comments. They seem right in the "files changed" panel but not in the "discussion" one. This one is about line 791 of the new file.

@Julow (Collaborator, Author):

That's why I've added an extra metric for the number of files measured, which a human can use to assess whether changes in the size metrics are meaningful.
I'm not sure we need to do much more here, as this error is totally unexpected. This filtering code is mostly intended to filter out commands that don't have a recorded output file.

(Two resolved review threads on doc/driver.mld, outdated.)
@Julow (Collaborator, Author) commented Sep 26, 2023

I think this is ready for another review. I've implemented make bench, fixed the other problems you found, rebased, and cleaned up the history.

@panglesd (Collaborator)

It seems that mdx is missing from some dependency list (at least, the ocaml-benchmarks CI fails due to mdx missing).

@panglesd (Collaborator) commented Sep 29, 2023

EDIT: This has to do with my switch being 4.04 (I was testing compat code).
In 5.0, it seems to work much better!

I don't know how we could improve the situation for old switches running make bench.

Old OCaml issue:

I still have some issues:

$ make bench    # on odoc's root
Error: No rule found for doc/driver-benchmarks.json
-> required by alias bench in dune:12
make: [Makefile:21: bench] Error 1 (ignored)
dune: FILE… arguments: no 'driver-benchmarks.json' file or directory
Usage: dune promote [OPTION]… [FILE]…
Try 'dune promote --help' or 'dune --help' for more information.
make: *** [Makefile:22: bench] Error 1
$ dune build @bench
Error: No rule found for doc/driver-benchmarks.json
-> required by alias bench in dune:12
$ touch doc/driver-benchmarks.json
$ make bench                      
dune: FILE… arguments: no 'driver-benchmarks.json' file or directory
Usage: dune promote [OPTION]… [FILE]…
Try 'dune promote --help' or 'dune --help' for more information.
make: *** [Makefile:22: bench] Error 1
$ touch driver-benchmarks.json
$ make bench                  
Warning: Nothing to promote for driver-benchmarks.json.
$ cat driver-benchmarks.json 
$

Edit: trying some blibli:

$ echo blibli > driver-benchmarks.json 
$ make bench                          
File "driver-benchmarks.json", line 1, characters 0-0:
diff --git a/_build/default/driver-benchmarks.json b/_build/default/doc/driver-benchmarks.json
index d34f12f9b..e69de29bb 100644
--- a/_build/default/driver-benchmarks.json
+++ b/_build/default/doc/driver-benchmarks.json
@@ -1 +0,0 @@
-blibli
make: [Makefile:21: bench] Error 1 (ignored)
Warning: Nothing to promote for driver-benchmarks.json.
blibli

@panglesd (Collaborator) left a review:

  • I still have the missing "driver-benchmarks.json" error. I think we should use cat doc/driver-benchmarks.json instead of diffing against a non-existent file (a sketch of such a rule follows this list).
  • Running make bench twice without modifications uses the cache and so produces the same result. This can be a problem if one wants to compute a mean, or rerun the benchmark with fewer programs running on their machine, ...
  • dune build @docgen is not deterministic anymore, creating an infinite loop of dune build @docgen, dune promote. This diff will be noisy when working on the driver.
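
A sketch of what the suggested bench rule could look like, using dune's cat action (this illustrates the suggestion above; it is not the rule from Julow#3):

; Hypothetical sketch: print the benchmark JSON instead of diffing it
; against a file that may not exist.
(rule
 (alias bench)
 (deps doc/driver-benchmarks.json)
 (action (cat doc/driver-benchmarks.json)))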

(Resolved review thread on doc/driver.mld, outdated.)
@panglesd (Collaborator)

I opened Julow#3 to fix the comments above.

@Julow (Collaborator, Author) commented Sep 29, 2023

Thanks for the help! The rule is much better like that.
I also added odoc-bench.opam, which is used by the benchmarks CI but doesn't describe an existing package.

(Resolved review thread on odoc-bench.opam, outdated.)
Julow and others added 14 commits on September 29, 2023 at 14:52. The commit messages:

  • The list is not deterministic. It can easily be shown again if needed by un-commenting a line.
  • This reflects the dependency change but also changes in the updated dependencies.
  • Apparently, the `sort` tool doesn't sort the same way on different platforms.
  • The result is written in current-bench's JSON format and promoted by `dune build @bench`. (Co-authored-by: panglesd <[email protected]>)
  • The output files of compile and link commands are recorded and their size is measured.
  • Which gives more reliable results. (Co-authored-by: panglesd <[email protected]>)
  • These commands can be reproduced with instrumentation enabled to debug performance problems. (Co-authored-by: Paul-Elliot <[email protected]>)
  • The results are stored in `_build/default/doc/landmarks` with unique names. (Co-authored-by: Paul-Elliot <[email protected]>)
  • This aborts some obviously broken metric collections. This situation happens when a dependency is missing or if only some blocks have failed.
  • This file has special meaning for current-bench. It does not define a package that can be installed or released.
@Julow (Collaborator, Author) commented Sep 29, 2023

I'm satisfied with the current state of this PR and would merge if someone else agrees.

@panglesd (Collaborator) commented Sep 29, 2023

I'm happy with the current state as well 🙂 @jonludlam do you want to have a look?

Do you know why ocaml-benchmarks says "No data for selected interval."?

EDIT: it now works!

Co-authored-by: panglesd <[email protected]>
@Julow merged commit da24899 into ocaml:master on Oct 2, 2023 (2 of 3 checks passed).
@Julow mentioned this pull request on Oct 31, 2023.