driver: Record execution time of every commands #978

Julow · 2023-06-26T17:33:27Z

Record the time taken by every commands and save the result in current-bench's JSON format.

The running time of every sub-commands are recorded (min, max, avg) and the total number of times they are called.

Most of the diffs are due to more recent dependencies, the code computing the metrics is at the bottom.

The benchmark results look like this: (but all on one line)

{
    "name": "odoc",
    "results": [
        {
            "name": "driver.mld",
            "metrics": [
                {
                    "name": "total-compile",
                    "value": 523,
                    "description": "Number of time 'odoc compile' has run."
                },
                {
                    "name": "time-compile",
                    "value": {
                        "min": 0.00010799999999988596,
                        "max": 0.00363099999999994,
                        "avg": 0.0001435755258126284
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc compile'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-compile-deps",
                    "value": 492,
                    "description": "Number of time 'odoc compile-deps' has run."
                },
                {
                    "name": "time-compile-deps",
                    "value": {
                        "min": 0.00012300000000009526,
                        "max": 0.0014290000000003467,
                        "avg": 0.00019406300813008646
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc compile-deps'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-link",
                    "value": 524,
                    "description": "Number of time 'odoc link' has run."
                },
                {
                    "name": "time-link",
                    "value": {
                        "min": 0.00010399999999943788,
                        "max": 0.003881999999999941,
                        "avg": 0.0002171736641221395
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc link'",
                    "trend": "lower-is-better"
                },
                {
                    "name": "total-html-generate",
                    "value": 524,
                    "description": "Number of time 'odoc html-generate' has run."
                },
                {
                    "name": "time-html-generate",
                    "value": {
                        "min": 0.00010300000000018628,
                        "max": 0.003909000000000162,
                        "avg": 0.00021438167938932483
                    },
                    "units": "s",
                    "description": "Time taken by 'odoc html-generate'",
                    "trend": "lower-is-better"
                }
            ]
        }
    ]
}

Problems:

Are the times correct ? Summing up all the numbers give a total running time of 0.5s while it takes 26s to run @docgen on my machine.
I know Mdx is slow to run bytecode and process the output of top-level blocks, but that's extreme.
dune build @bench fails on Dune 3.7.1 with this message, even though the documentation says that <file1> doesn't have to exist:

Error: Unable to resolve symlink _build/default/driver-benchmarks.json

jonludlam · 2023-06-28T17:40:35Z

The original reason for adding in the extra libraries for the benchmark was that core/core_kernel is considerably more complex than base, and therefore provided some more coverage of odoc's code. I'm a bit reluctant to remove it without a solid reason for doing so.

jonludlam · 2023-06-28T17:54:19Z

There are a few other annoyances associated with the driver as-is, not things you're introducing in this commit, obviously:

we capture the details of the opam switch in the output. We could possibly do a substitution in the logs and put in $(opam config var lib) where we've got the opam lib dir? Though we'd need to make the paths absolute rather than relative for that to work.
we should pin the exact versions of the libraries being used for the driver, which would help the output be reproducable. This will be a bit weird since we're not making these constraints part of the opam file, but for keeping the docs site consistent it'd be useful.

Julow · 2023-06-29T08:17:16Z

I don't believe that the quality would increase if it runs for longer. It would have a huge input and output, I'm worried that we are only measuring the file system caches.
The main goal of this benchmark is to passively measure the evolution in performances on the long term. I'm worried about wasted resources.

To increase the quality of the result, we could perhaps do a warmup run ? This is not reasonable with a very long benchmark.

I don't think coverage is important when everything is averaged, especially if the atypical code is so much bigger. Perhaps we would make a different benchmark for Core alone ? So that we measure only the atypical code.

jonludlam · 2023-06-30T14:44:41Z

It's not to do with having longer runs, it's to do with the complexity of core vs base. The point of the benchmark is precisely to measure the atypical code, since it's pretty easy for regressions to creep in there unnoticed. Some early versions of odoc ran quite quickly on base and took many minutes to run on core_kernel - and I do actually run this benchmark test whenever I'm working on the performance sensitive bits odoc.

Julow · 2023-07-03T15:37:07Z

I did not realize it was used, I don't want to remove it then. Reverted.

Julow · 2023-07-03T15:42:07Z

we should pin the exact versions of the libraries being used for the driver, which would help the output be reproducable. This will be a bit weird since we're not making these constraints part of the opam file, but for keeping the docs site consistent it'd be useful.

To avoid the maintenance cost, I suggest to remove the logs of commands.
It could be replaced by logging only commands affecting Odoc's modules, with options such as the search path removed.

jonludlam · 2023-07-03T15:50:44Z

We're using this driver for a few different use cases now - benchmarking / testing / creating the odoc website. I quite like having the concrete commands recorded for the purposes of the website, but it definitely gets in the way of day-to-day testing. We've got the env var turning on/off the extra dependencies - we could perhaps use the same mechanism to turn the logging of the commands on and off. E.g. ODOC_DRIVER_MODE could be BENCH or TEST or DOC or something?

Julow · 2023-07-03T16:47:11Z

I'm not a fan of making the output even more non deterministic. The list of commands is huge. Do you expect anyone to use it ?

Ideally, we'd have a different instance for different purposes. One for current-bench, one for building core and one for the doc.
We could do that with Mdx file includes, which independently from the pro and cons, would allow using the same functions from different md files.

jonludlam · 2023-07-04T08:18:31Z

The list of commands is huge. Do you expect anyone to use it ?

Well, yes. I used it myself it a few weeks back when doing the voodoo prototype for source rendering. It's way easier to read the actual shell invocations of odoc than to simply read the function that assembles the command line.

We could do that with Mdx file includes, which independently from the pro and cons, would allow using the same functions from different md files.

The thing that makes this a struggle is that the main use of this code is to be very clear about how to use odoc, so I'm not keen on splitting it up into multiple files - it makes for more tricky reading. Adding some straightforward logic to log or not won't make the code any harder to read.

I'm not a fan of making the output even more non deterministic.

Surely this makes is more deterministic!?

Julow · 2023-07-04T08:45:15Z

so I'm not keen on splitting it up into multiple files

Mdx have a way to include code from different parts of one .ml file (with delimiting comments). The documentation would stay the same.

Would you mind if I try ?

Adding some straightforward logic to log or not won't make the code any harder to read.

I slightly disagree. Adding any amount of code makes it harder to read.

Surely this makes is more deterministic!?

Sure but that doesn't solve any problem. There will always be a huge diff, the exit status will always be an error.

jonludlam · 2023-07-04T15:00:39Z

Sure but that doesn't solve any problem. There will always be a huge diff, the exit status will always be an error.

Ah, what I meant was that we would turn everything off for the common case of testing, and only turn the logging on for when we're building in order to publish the docs. That way there would be no logs in the driver in the source tree, and no changes to promote when you build it locally.

Julow · 2023-07-05T07:34:20Z

and only turn the logging on for when we're building in order to publish the docs.

I don't like that the publish case is the one with an error status but this sounds like a good compromise.

panglesd · 2023-07-17T08:55:59Z

Would it be possible (and not too complicated) to extend the current PR to also record the size of the produced files?

I think it would be an interesting metric!

Julow · 2023-09-05T15:41:59Z

I removed the list of executed command but kept the code for printing it. It can be printed back by un-commenting a line.
I implemented the file size metrics.

panglesd

I think that current-bench requires a Makefile as it runs make bench to run the benchmark.

panglesd · 2023-09-06T13:39:32Z

dune

+(rule
+ (alias bench)
+ (action
+  (diff driver-benchmarks.json doc/driver-benchmarks.json)))


This creates an error when driver-benchmarks.json is not already present at the root of the project (which is the case in the repo)... maybe generate an empty one if there is none?

panglesd · 2023-09-06T13:56:37Z

When running dune build @docgen, I randomly (and more often than not) get the following error:

-]}
+]}```mdx-error
+Exception:
+Failure
+ "run ['../src/odoc/bin/main.exe' 'compile' 'odoc.mld' '-I' '.' '-o'\n     './page-odoc.odoc' '--child' 'page-\"ocamldoc_differences\"' '--child'\n     'page-\"dune\"' '--child' 'page-\"odoc_for_authors\"' '--child'\n     'page-\"interface\"' '--child' 'page-\"features\"' '--child'\n     'page-\"parent_child_spec\"' '--child' 'page-\"driver\"' '--child'\n     'page-\"interface\"' '--child' 'page-\"odoc_examples\"' '--child'\n     'page-\"odoc_document\"' '--child' 'page-\"odoc_html\"' '--child'\n     'page-\"odoc_latex\"' '--child' 'page-\"odoc_loader\"' '--child'\n     'page-\"odoc_manpage\"' '--child' 'page-\"odoc_model\"' '--child'\n     'page-\"odoc_model_desc\"' '--child' 'page-\"odoc_html_support_files\"'\n     '--child' 'page-\"odoc_odoc\"' '--child' 'page-\"odoc_xref2\"' '--child'\n     'page-\"odoc_xref_test\"' '--child' 'page-deps' '--child' 'src-source']: No such file or directory".
+```
+

If I cd into _build/defualt/doc and run the command, it works well...

Any idea where this could come from? I would say some dune rules, where something gets executed asynchronously due to wrongly computed dependency, but I cannot find such problem in your rules...
(the problem does not show up in master)

doc/dune

panglesd

Looks really cool! Can't wait to see the graphs ^^

Almost ready to merge, remains to:

add the dependency in the dune file for the bench rule
add make bench, and make it output the json on standard output!
Use Unix.gettimeofday or at least understand why there is such a difference...

What would you think about plotting the compile and link values in the same graph? That is, one graph containing both compile and link values, per {size; time; number of files}. It reduces the number of graphs, and lets you compare compile and link more easily.

To plot multiple values in the same graph, you can use common prefix with a slash. For instance, in the example linked above the time taken for typing, parsing, etc., are shown in the same graph because of the ocaml/ prefix.

Also, what about plotting the total space/time used by compile and by link?

About the removal of the list of executed commands, I don't have a strong opinion. I find them sometimes useful when debugging, but it's very annoying that their output depends on the machine. It makes the file much bigger, and generate huge diffs.
I'm sure there is a better solution than a commenting/uncommenting a line (maybe output them in another file that is ignored by git?).
However, I think it is already an improvement, and I want the rest of the PR merged, so I'm fine with it!

panglesd · 2023-09-06T15:59:32Z

doc/driver.mld

-$ '../src/odoc/bin/main.exe' 'html-generate' 'int_replace_polymorphic_compare.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
-$ '../src/odoc/bin/main.exe' 'html-generate' 'inlining_transforms.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
-$ '../src/odoc/bin/main.exe' 'html-generate' 'inlining_decision_intf.odocl' '-o' 'html' '--theme-uri' 'odoc' '--support-uri' 'odoc'
-$ '../src/odoc/bin/main.exe' 'html-g


Errors in the computation of the size are silently ignored. Maybe output them on the standard output, so that the user knows something has happened?

Hmm something weird happened with the locations of the comments. They seem good in the "files changed" panel but not the "discussion" one. This one is about line 791 of the new file.

That's why I've added an extra metric for the number of file measured, that a human can use to asses that changes in the size metrics are meaningful.
I'm not sure we need to do much more here as this error is totally unexpected. This filtering code is mostly intended to filter-out commands that don't have a recorded output-file.

doc/driver.mld

Julow · 2023-09-26T15:09:00Z

I think this is ready for an other review. I've implemented make bench and the other problems you found, rebased and cleaned the history.

panglesd · 2023-09-26T15:35:29Z

It seems that mdx is missing from some dependency list (at least, ocaml-benchmarks ci fails due to mdx missing).

panglesd · 2023-09-29T08:17:30Z

EDIT: This has to do with my switch being 4.04 (I was testing compat code).
In 5.0, it seems to work much better!

I don't know how we could improve the situation for old switches running make bench?

Old OCaml issue:

I still have some issues:

$ make bench    # on odoc's root
Error: No rule found for doc/driver-benchmarks.json
-> required by alias bench in dune:12
make: [Makefile:21: bench] Error 1 (ignored)
dune: FILE… arguments: no 'driver-benchmarks.json' file or directory
Usage: dune promote [OPTION]… [FILE]…
Try 'dune promote --help' or 'dune --help' for more information.
make: *** [Makefile:22: bench] Error 1
$ dune build @bench
Error: No rule found for doc/driver-benchmarks.json
-> required by alias bench in dune:12
$ touch doc/driver-benchmarks.json
$ make bench                      
dune: FILE… arguments: no 'driver-benchmarks.json' file or directory
Usage: dune promote [OPTION]… [FILE]…
Try 'dune promote --help' or 'dune --help' for more information.
make: *** [Makefile:22: bench] Error 1
$ touch driver-benchmarks.json
$ make bench                  
Warning: Nothing to promote for driver-benchmarks.json.
$ cat driver-benchmarks.json 
$

Edit: trying some blibli:

$ echo blibli > driver-benchmarks.json 
$ make bench                          
File "driver-benchmarks.json", line 1, characters 0-0:
diff --git a/_build/default/driver-benchmarks.json b/_build/default/doc/driver-benchmarks.json
index d34f12f9b..e69de29bb 100644
--- a/_build/default/driver-benchmarks.json
+++ b/_build/default/doc/driver-benchmarks.json
@@ -1 +0,0 @@
-blibli
make: [Makefile:21: bench] Error 1 (ignored)
Warning: Nothing to promote for driver-benchmarks.json.
blibli

panglesd

I still have the missing "driver-benchmarks.json" error. I think we should use cat doc/driver-benchmark.json instead of diffing a file with a non-existent file
Running make bench twice without modifications use the cache and so produces the same result. Can be a problem if one wants to make some mean, or rerun the benchmark with less program running in his machine, ...
dune build @docgen is not deterministic anymore, making an infinite loop of dune build @docgen, dune promote. This diff will be noisy when working on the driver.

doc/driver.mld

panglesd · 2023-09-29T10:55:25Z

I opened Julow#3 to fix the comments above.

Julow · 2023-09-29T12:45:40Z

Thanks for the help! The rule is much better like that.
I also added odoc-bench.opam, which is used by the benchmarks CI but doesn't describe an existing package.

odoc-bench.opam

The list is not deterministic. It can easily be shown again if needed by un-commenting a line.

This reflects the dependency change but also changes in the updated dependencies. Apparently, the `sort` tool doesn't sort the same way on different platforms.

@bench

The result is written into a current-bench's JSON format and promoted by dune build @bench Co-authored-by: panglesd <[email protected]>

The output files of compile and link commands are recorded and their size is measured.

Which gives more reliable results. Co-authored-by: panglesd <[email protected]>

These commands can be reproduced with instrumentation enabled to debug performance problems. Co-authored-by: Paul-Elliot <[email protected]>

The results are stored in `_build/default/doc/landmarks` with unique names. Co-authored-by: Paul-Elliot <[email protected]>

This aborts some obviously broken metric collections. This situation happens when a dependency is missing or if only some blocks have failed.

Co-authored-by: Paul-Elliot <[email protected]>

Signed-off-by: Paul-Elliot <[email protected]>

This file has special meaning for current-bench. It does not define a package that can be installed or released.

Julow · 2023-09-29T12:55:26Z

I'm satisfied by the current state of this PR and I would merge if someone else agrees.

panglesd · 2023-09-29T13:04:25Z

I'm happy with the current state as well 🙂 @jonludlam do you want to have a look?

Do you know why ocaml-benchmarks says

No data for selected interval.

?

EDIT: it now works!

Co-authored-by: panglesd <[email protected]>

Julow force-pushed the oci-bench branch from 845c9d2 to 868abc9 Compare July 3, 2023 15:36

Julow force-pushed the oci-bench branch from 868abc9 to 88cbbc9 Compare September 5, 2023 15:40

panglesd reviewed Sep 6, 2023

View reviewed changes

doc/dune Outdated Show resolved Hide resolved

panglesd approved these changes Sep 6, 2023

View reviewed changes

panglesd reviewed Sep 6, 2023

View reviewed changes

doc/driver.mld Outdated Show resolved Hide resolved

Julow force-pushed the oci-bench branch from 6df2ac9 to 3c24db4 Compare September 12, 2023 13:01

Julow force-pushed the oci-bench branch from 3c24db4 to 12d9624 Compare September 26, 2023 15:08

panglesd reviewed Sep 29, 2023

View reviewed changes

doc/driver.mld Outdated Show resolved Hide resolved

Julow force-pushed the oci-bench branch from c5d4cf5 to b3035c4 Compare September 29, 2023 12:44

panglesd reviewed Sep 29, 2023

View reviewed changes

odoc-bench.opam Outdated Show resolved Hide resolved

Julow and others added 14 commits September 29, 2023 14:52

driver: Update dependencies

5ae451d

driver.mld: Hide the list of executed commands

920510f

The list is not deterministic. It can easily be shown again if needed by un-commenting a line.

Promote driver.mld

95681cf

This reflects the dependency change but also changes in the updated dependencies. Apparently, the `sort` tool doesn't sort the same way on different platforms.

driver: Record execution time of every commands

71c857d

The result is written into a current-bench's JSON format and promoted by dune build @bench Co-authored-by: panglesd <[email protected]>

driver.mld: Measure size of produced files

c2151b6

The output files of compile and link commands are recorded and their size is measured.

Use Unix.gettimeofday instead of Sys.time

bdd1e46

Which gives more reliable results. Co-authored-by: panglesd <[email protected]>

driver.mld: Print the slowest commands

1f138fd

These commands can be reproduced with instrumentation enabled to debug performance problems. Co-authored-by: Paul-Elliot <[email protected]>

driver.mld: Use a record for executed commands

9cd54f3

driver.mld: Increase #print_length

c9b2eee

driver.mld: Optionally instrument with landmarks

bcfd931

The results are stored in `_build/default/doc/landmarks` with unique names. Co-authored-by: Paul-Elliot <[email protected]>

driver.mld: Fail metrics collection if a command hasn't run

45e3d39

This aborts some obviously broken metric collections. This situation happens when a dependency is missing or if only some blocks have failed.

'make bench' target that prints result to stdout

1f0ee21

Co-authored-by: Paul-Elliot <[email protected]>

Make multiple runs of benchmark actually run.

efd93be

Signed-off-by: Paul-Elliot <[email protected]>

Define benchmark dependencies in odoc-bench.opam

3b167a0

This file has special meaning for current-bench. It does not define a package that can be installed or released.

Julow force-pushed the oci-bench branch from b3035c4 to 3b167a0 Compare September 29, 2023 12:52

Update odoc-bench.opam

b5ce93c

Co-authored-by: panglesd <[email protected]>

Julow merged commit da24899 into ocaml:master Oct 2, 2023
2 of 3 checks passed

Julow mentioned this pull request Oct 31, 2023

Improve benchmarks #1034

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

driver: Record execution time of every commands #978

driver: Record execution time of every commands #978

Julow commented Jun 26, 2023

jonludlam commented Jun 28, 2023

jonludlam commented Jun 28, 2023

Julow commented Jun 29, 2023

jonludlam commented Jun 30, 2023

Julow commented Jul 3, 2023

Julow commented Jul 3, 2023

jonludlam commented Jul 3, 2023

Julow commented Jul 3, 2023

jonludlam commented Jul 4, 2023

Julow commented Jul 4, 2023

jonludlam commented Jul 4, 2023

Julow commented Jul 5, 2023

panglesd commented Jul 17, 2023

Julow commented Sep 5, 2023

panglesd left a comment

panglesd Sep 6, 2023

panglesd commented Sep 6, 2023

panglesd left a comment •

edited

Loading

panglesd Sep 6, 2023

panglesd Sep 6, 2023

Julow Sep 12, 2023

Julow commented Sep 26, 2023

panglesd commented Sep 26, 2023

panglesd commented Sep 29, 2023 •

edited

Loading

panglesd left a comment

panglesd commented Sep 29, 2023

Julow commented Sep 29, 2023

Julow commented Sep 29, 2023

panglesd commented Sep 29, 2023 •

edited

Loading

driver: Record execution time of every commands #978

driver: Record execution time of every commands #978

Conversation

Julow commented Jun 26, 2023

jonludlam commented Jun 28, 2023

jonludlam commented Jun 28, 2023

Julow commented Jun 29, 2023

jonludlam commented Jun 30, 2023

Julow commented Jul 3, 2023

Julow commented Jul 3, 2023

jonludlam commented Jul 3, 2023

Julow commented Jul 3, 2023

jonludlam commented Jul 4, 2023

Julow commented Jul 4, 2023

jonludlam commented Jul 4, 2023

Julow commented Jul 5, 2023

panglesd commented Jul 17, 2023

Julow commented Sep 5, 2023

panglesd left a comment

Choose a reason for hiding this comment

panglesd Sep 6, 2023

Choose a reason for hiding this comment

panglesd commented Sep 6, 2023

panglesd left a comment • edited Loading

Choose a reason for hiding this comment

panglesd Sep 6, 2023

Choose a reason for hiding this comment

panglesd Sep 6, 2023

Choose a reason for hiding this comment

Julow Sep 12, 2023

Choose a reason for hiding this comment

Julow commented Sep 26, 2023

panglesd commented Sep 26, 2023

panglesd commented Sep 29, 2023 • edited Loading

panglesd left a comment

Choose a reason for hiding this comment

panglesd commented Sep 29, 2023

Julow commented Sep 29, 2023

Julow commented Sep 29, 2023

panglesd commented Sep 29, 2023 • edited Loading

panglesd left a comment •

edited

Loading

panglesd commented Sep 29, 2023 •

edited

Loading

panglesd commented Sep 29, 2023 •

edited

Loading