
🐛 delta is slow (noticeable lag) compared to default diff (11ms -> 76ms) in VSCode #1897

Open
certik opened this issue Nov 11, 2024 · 28 comments

Comments

@certik

certik commented Nov 11, 2024

I am using Ubuntu (WSL on Windows). The default git diff is immediate, while delta has a noticeable lag. As a rule of thumb, anything under 30ms feels immediate, but over 30ms you will notice a lag.

I don't know if the issue is with delta itself, or if this extra 65ms overhead is caused by launching the delta program from git. Unfortunately, that makes it too slow for me: I want these command line tools to feel immediate, and the extra lag distracts me.

In VSCode console, default git diff:

$ time git diff
diff --git a/src/lfortran/parser/preprocessor.re b/src/lfortran/parser/preprocessor.re
index 1f39b3c58..67643a220 100644
--- a/src/lfortran/parser/preprocessor.re
+++ b/src/lfortran/parser/preprocessor.re
@@ -433,7 +433,7 @@ std::string CPreprocessor::run(const std::string &input, LocationManager &lm,
                 interval_end_type_0(lm, output.size(), cur-string_start);
                 continue;
             }
-            "#" whitespace? "include" whitespace '"' @t1 [^"\x00]* @t2 '"' [^\n\x00]* newline {
+            "#" whitespace? "include" whitespace ('"' | '<') @t1 [^"\x00]* @t2 ('"' | '>') [^\n\x00]* newline {
                 if (!branch_enabled) continue;
                 std::string filename = token(t1, t2);
                 std::vector<std::filesystem::path> include_dirs;

real    0m0.011s
user    0m0.013s
sys     0m0.000s

And delta:

$ time git diff

Δ src/lfortran/parser/preprocessor.re
─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

─────────────────────────────────────────────────────────────────────────────────────┐
• 433: std::string CPreprocessor::run(const std::string &input, LocationManager &lm, │
─────────────────────────────────────────────────────────────────────────────────────┘
                interval_end_type_0(lm, output.size(), cur-string_start);
                continue;
            }
            "#" whitespace? "include" whitespace '"' @t1 [^"\x00]* @t2 '"' [^\n\x00]* newline {
            "#" whitespace? "include" whitespace ('"' | '<') @t1 [^"\x00]* @t2 ('"' | '>') [^\n\x00]* newline {
                if (!branch_enabled) continue;
                std::string filename = token(t1, t2);
                std::vector<std::filesystem::path> include_dirs;

real    0m0.076s
user    0m0.034s
sys     0m0.004s

I also tried the same thing directly in Terminal, and there I get 8ms with git diff and 27ms with delta, which is still a huge overhead, but fortunately it is under 30ms, and so it feels immediate.

So there are two separate issues:

  • large overhead of delta compared to native git diff (3x slower in Terminal, 6x slower in VSCode)
  • VSCode seems a tiny bit slower for git diff, but a lot slower for delta

If you have any tips that I could try, let me know, I am happy to help debug.

@dandavison
Owner

Hi @certik,

It could well be time to do some profiling and optimization. I haven't profiled properly but from very quick ad-hoc experimenting it looks like I'm getting results that are similar to yours in some ways:

This is on an M2 Mac running macOS, testing a one-line diff via git diff like you:

Terminal     Without Delta    With Delta
Alacritty    ~20ms            ~35-40ms
VSCode       ~20ms            ~65-70ms

So at minimum, it looks like we have one concrete question: why is delta slower in VSCode? Perhaps simply because delta emits more ANSI escape sequences and the VSCode terminal isn't performing as well on them as other terminal emulators?

@certik
Author

certik commented Nov 11, 2024

@dandavison awesome, I am glad you can reproduce it on macOS also.

why is delta slower in VSCode? Perhaps simply because delta emits more ANSI escape sequences and the VSCode terminal isn't performing as well on them as other terminal emulators?

Let's figure it out: we can test this hypothesis by making delta emit output without ANSI escape sequences, to either confirm or rule it out. This would be useful even for Alacritty / Terminal to see if they get faster.
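
One rough way to gauge how much of a stream is escape sequences is to strip them and compare byte counts. The sketch below is only an illustration: strip_ansi and ansi_overhead are made-up helper names, and the pattern only covers CSI sequences (ESC [ ... letter), not every escape sequence a program might emit.

```shell
# Remove CSI sequences such as colour codes (ESC [ ... final letter) from stdin.
strip_ansi() {
  esc=$(printf '\033')
  sed "s/${esc}\[[0-9;:]*[A-Za-z]//g"
}

# Print how many bytes of stdin are CSI escape sequences.
# Both counts include one trailing newline added by printf.
ansi_overhead() {
  input=$(cat)
  total=$(printf '%s\n' "$input" | wc -c)
  plain=$(printf '%s\n' "$input" | strip_ansi | wc -c)
  echo "total=$((total)) plain=$((plain)) escapes=$((total - plain))"
}
```

Something like `git diff | delta | ansi_overhead` would then report the share of escape bytes, assuming delta still emits colour when its output is a pipe. It says nothing about how quickly a given terminal renders them.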

Note: I installed delta using conda-forge, here is the build command: https://github.com/conda-forge/git-delta-feedstock/blob/565fea2fbdf4aea4f40a88d260445be75b3b7d62/recipe/build.sh#L13, I think this builds with Rust in Release mode (enables all optimizations)?

@dandavison
Owner

Here's another experiment. This experiment does not involve the terminal emulator rendering anything, and it doesn't involve git invoking delta; instead we invoke delta explicitly via a shell pipe. This seems to suggest that (a) delta's execution costs around 6ms and (b) much of the total execution time of delta is due to terminal emulator activity, especially in VSCode.

(You probably know this, but beware of things like git diff > /dev/null -- git will not invoke delta even if it's configured to do so.)

Alacritty

$ hyperfine --warmup 500 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):      14.2 ms ±   2.4 ms    [User: 9.5 ms, System: 7.7 ms]
  Range (min … max):    10.0 ms …  21.1 ms    117 runs

$ hyperfine --warmup 500 'git diff > /dev/null'
Benchmark 1: git diff > /dev/null
  Time (mean ± σ):       8.7 ms ±   2.0 ms    [User: 3.4 ms, System: 3.3 ms]
  Range (min … max):     5.6 ms …  14.9 ms    146 runs

VSCode

$ hyperfine --warmup 500 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):      14.9 ms ±   2.8 ms    [User: 10.3 ms, System: 7.5 ms]
  Range (min … max):    10.1 ms …  30.1 ms    109 runs
 

$ hyperfine --warmup 500 'git diff > /dev/null'
Benchmark 1: git diff > /dev/null
  Time (mean ± σ):       8.0 ms ±   3.2 ms    [User: 3.4 ms, System: 3.6 ms]
  Range (min … max):     2.8 ms …  21.6 ms    132 runs

@certik
Author

certik commented Nov 12, 2024

@dandavison thanks for the benchmarks, this looks very promising. It looks like in both VSCode and Alacritty git diff takes 8ms, and delta takes an additional 6ms or so, so 14ms total. I guess it depends on the diff size too, but this is completely usable.

Somehow there is a large penalty that happens after delta is done, but it's weird. Terminals in my experience are slow compared to how fast they could be, but they are not that slow: for small output like mine, rendering should not take 20-50ms.

In WSL + Terminal on another (slower) laptop, on a larger (several pages) diff:

$ hyperfine --warmup 50 'git diff > /dev/null'
Benchmark 1: git diff > /dev/null
  Time (mean ± σ):      16.4 ms ±   3.2 ms    [User: 8.1 ms, System: 10.4 ms]
  Range (min … max):    11.1 ms …  28.9 ms    165 runs

$ hyperfine --warmup 50 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):     200.7 ms ±  20.6 ms    [User: 57.1 ms, System: 28.6 ms]
  Range (min … max):   177.5 ms … 237.6 ms    16 runs

I then did the simplest one line diff:

$ hyperfine --warmup 50 'git diff > /dev/null'
Benchmark 1: git diff > /dev/null
  Time (mean ± σ):      12.9 ms ±   2.4 ms    [User: 5.7 ms, System: 9.2 ms]
  Range (min … max):     8.5 ms …  23.4 ms    168 runs

$ hyperfine --warmup 50 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):     180.6 ms ±  12.3 ms    [User: 25.7 ms, System: 21.6 ms]
  Range (min … max):   162.9 ms … 205.7 ms    17 runs

We can then time everything via:

$ time git diff
diff --git a/build1.sh b/build1.sh
index 65c7ea56c..5ae61dbd2 100755
--- a/build1.sh
+++ b/build1.sh
@@ -14,4 +14,4 @@ cmake \
     -DCMAKE_INSTALL_PREFIX=`pwd`/inst \
     -DCMAKE_INSTALL_LIBDIR=share/lfortran/lib \
     .
-cmake --build . -j16 --target install
+#cmake --build . -j16 --target install

real    0m0.015s
user    0m0.008s
sys     0m0.009s
$ time git diff | delta

build1.sh
──────────────────────────────────────────────────────────────────────────

────────────┐
14: cmake \ │
────────────┘
    -DCMAKE_INSTALL_PREFIX=`pwd`/inst \
    -DCMAKE_INSTALL_LIBDIR=share/lfortran/lib \
    .
cmake --build . -j16 --target install
#cmake --build . -j16 --target install

real    0m0.203s
user    0m0.034s
sys     0m0.024s

I ran it a couple of times and took the smallest number. It seems git diff takes 15ms, delta takes 160ms and the terminal takes 20ms, or so. I also tried VSCode, but I am getting similar numbers there.
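
The split into stages can be reproduced by subtracting the measurements from each other. This is a sketch with a made-up helper name (breakdown). It mixes a hyperfine mean with single `time` runs, so the result is only approximate and will differ slightly from the smallest-run figures quoted above.

```shell
# Split a wall-clock total into stages from three measurements (all in ms):
#   $1 = git diff alone
#   $2 = git diff | delta > /dev/null
#   $3 = git diff | delta, printed to the terminal
breakdown() {
  awk -v git="$1" -v pipe="$2" -v tty="$3" 'BEGIN {
    printf "git=%.1f delta=%.1f terminal=%.1f\n", git, pipe - git, tty - pipe
  }'
}

# Figures taken from the runs above: 15 ms, 180.6 ms (hyperfine mean), 203 ms.
breakdown 15 180.6 203
```

With these inputs, delta accounts for roughly 165 ms and the terminal for a bit over 20 ms.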

@dandavison
Owner

I then did the simplest one line diff:

Your results seem to be showing delta taking ~185ms to compute a multi-page diff and then ~170ms to compute a one-line diff. That can't be right? Or is that laptop just very slow to start delta?

seems git diff takes 15ms, delta takes 160ms

Can you post the diff that delta takes a long time on?

So, in summary do you think there's any delta development work indicated here? Or can we conclude that delta is fast enough, but some terminal emulators are slow at rendering its output?

@certik
Author

certik commented Nov 12, 2024

Let's dig deeper to answer your questions. I am using https://github.com/lfortran/lfortran/ in WSL on a Surface 5 laptop. Here is the git diff:

$ time git diff
diff --git a/README.md b/README.md
index 166c893ba..34f093214 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,7 @@

 LFortran is a modern open-source (BSD licensed) interactive Fortran compiler
 built on top of LLVM. It can execute user's code interactively to allow
+
 exploratory work (much like Python, MATLAB or Julia) as well as compile to
 binaries with the goal to run user's code on modern architectures such as
 multi-core CPUs and GPUs.

real    0m0.016s
user    0m0.004s
sys     0m0.013s

Here is delta with various options:

$ time git diff | delta

README.md
──────────────────────────────────────────────────────────────────────────

───┐
4: │
───┘

LFortran is a modern open-source (BSD licensed) interactive Fortran compiler
built on top of LLVM. It can execute user's code interactively to allow

exploratory work (much like Python, MATLAB or Julia) as well as compile to
binaries with the goal to run user's code on modern architectures such as
multi-core CPUs and GPUs.

real    0m0.247s
user    0m0.034s
sys     0m0.024s

I ran it a couple of times. I then ran:

$ time git diff | delta --color-only
diff --git a/README.md b/README.md
index 166c893ba..34f093214 100644
--- a/README.md
+++ b/README.md
@@ -4,6 +4,7 @@

 LFortran is a modern open-source (BSD licensed) interactive Fortran compiler
 built on top of LLVM. It can execute user's code interactively to allow
+
 exploratory work (much like Python, MATLAB or Julia) as well as compile to
 binaries with the goal to run user's code on modern architectures such as
 multi-core CPUs and GPUs.

real    0m0.045s
user    0m0.052s
sys     0m0.007s

That's much better!

I wonder if Windows takes forever to launch the program for some reason?

I ran it many times by hand, the fastest I was able to get delta is 39ms (git diff is 16ms, so delta takes 23ms). Unfortunately that is still too slow, it must get below 30ms.

To go from here, I could write a simple prototype that just colors the output a bit, and see if it can run faster. I would expect 6ms at most for simple diffs like that, not 23ms.
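
To make "just colors the output a bit" concrete, here is a minimal sketch of such a filter. It is not the prototype discussed in this thread and does none of what delta does (no syntax highlighting, no word-level diffs). The function name colorize_diff is made up, and the rules are naive: a removed line whose content starts with "-- " would be mistaken for a file header.

```shell
# Hypothetical minimal colouriser: green for added lines, red for removed
# lines, cyan for hunk headers, everything else passed through unchanged.
colorize_diff() {
  awk '
    /^(\+\+\+|---) / { print; next }                    # file headers: leave as-is
    /^\+/ { print "\033[32m" $0 "\033[0m"; next }       # added line
    /^-/  { print "\033[31m" $0 "\033[0m"; next }       # removed line
    /^@@/ { print "\033[36m" $0 "\033[0m"; next }       # hunk header
    { print }
  '
}
```

It would be used as `git diff | colorize_diff`. A filter like this gives a lower bound for what a diff post-processor costs, which is a fair reference point for startup overhead only, not for delta's feature set.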

@dandavison
Owner

dandavison commented Nov 12, 2024

I wonder if Windows takes forever to launch the program for some reason?

@th1000s may have thoughts for this thread.

I wonder whether it's some I/O that's being done at start up time. Can you try with --no-gitconfig? I believe that will prevent any attempt to read gitconfig files from disk.

Is it worth investigating whether not doing the calling process detection in https://github.com/dandavison/delta/blob/main/src/utils/process.rs#L67-L78 changes timings? @certik are you able to modify the Rust code and try things like that out?

@certik
Author

certik commented Nov 12, 2024

I haven't seen a difference with --no-gitconfig.

The first time it runs it's always ~200ms, then it runs faster, but only sometimes. For example hyperfine is still slow:

$ hyperfine --warmup 50 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):     211.2 ms ±  31.1 ms    [User: 30.4 ms, System: 34.5 ms]
  Range (min … max):   160.3 ms … 278.1 ms    15 runs

$ time git diff | delta > /dev/null

real    0m0.040s
user    0m0.043s
sys     0m0.019s

Who knows what's causing this. However, other programs run fast, such as:

$ hyperfine --warmup 50 'git diff | cat > /dev/null'
Benchmark 1: git diff | cat > /dev/null
  Time (mean ± σ):      14.4 ms ±   2.6 ms    [User: 8.8 ms, System: 10.7 ms]
  Range (min … max):     7.2 ms …  23.5 ms    179 runs

I'll try to create a minimal C++ or Rust program for diff processing and see if it runs fast, I bet it will.

Yes, I can try removing those lines in Rust, I'll do it later.

@dandavison
Owner

I haven't seen a difference with --no-gitconfig

Hm. A testament to libgit2's good engineering I guess.

hyperfine is still slow

Could this be your shell startup time? I'd try -N.

Yes, I can try removing those lines in Rust, I'll do it later.

Thanks! I am curious about that.

I'll try to create a minimal C++ or Rust program for diff processing and see if it runs fast, I bet it will.

If you're sure it's worth it! But I'd be inclined not to spend time on that if you don't think it will help solve the problem here.

@bash
Contributor

bash commented Nov 12, 2024

Automatic dark/light detection also contributes to startup times and depends a lot on the terminal emulator's speed in responding to queries. You can disable dark/light detection by passing either --dark or --light to delta.

@certik
Author

certik commented Nov 12, 2024

Now I am on a desktop with Windows, which is a lot faster than my laptop. Still, unfortunately delta is too slow.

The best way to know what the ideal speed of delta should be is to have a reference implementation. Here it is: https://gist.github.com/certik/8e2270033a0bedbc2daca9b0e5ffd375

Compile it as documented at the top of the file. Here are some benchmarks; I ran each command several times and took the fastest run, in WSL Ubuntu, in a Terminal. First, a single-line diff:

$ time git diff > /dev/null

real    0m0.006s
user    0m0.001s
sys     0m0.008s
$ time git diff | ./mydelta > /dev/null

real    0m0.008s
user    0m0.003s
sys     0m0.007s
$ time git diff | cat > /dev/null

real    0m0.008s
user    0m0.010s
sys     0m0.001s
$ time git diff | delta > /dev/null

real    0m0.015s
user    0m0.018s
sys     0m0.001s

So mydelta is as fast as cat, about 1-2ms. That's good, that's what I would expect and hope. delta is about 7ms.

Now let's try a larger diff:

$ time git diff > /dev/null

real    0m0.027s
user    0m0.016s
sys     0m0.008s
$ time git diff | ./mydelta > /dev/null

real    0m0.029s
user    0m0.027s
sys     0m0.010s
$ time git diff | cat > /dev/null

real    0m0.028s
user    0m0.017s
sys     0m0.015s
$ time git diff | delta > /dev/null

real    0m0.151s
user    0m0.140s
sys     0m0.043s

Here mydelta is still about 2ms, while delta is about 124ms.

Hyperfine seems to mirror the above timings:

$ hyperfine --warmup 50 'git diff > /dev/null'
Benchmark 1: git diff > /dev/null
  Time (mean ± σ):      36.6 ms ±  10.5 ms    [User: 26.0 ms, System: 11.2 ms]
  Range (min … max):    19.5 ms …  62.0 ms    138 runs
$ hyperfine --warmup 50 'git diff | ./mydelta > /dev/null'
Benchmark 1: git diff | ./mydelta > /dev/null
  Time (mean ± σ):      36.6 ms ±  10.0 ms    [User: 34.0 ms, System: 11.0 ms]
  Range (min … max):    20.8 ms …  59.2 ms    67 runs
$ hyperfine --warmup 50 'git diff | cat > /dev/null'
Benchmark 1: git diff | cat > /dev/null
  Time (mean ± σ):      38.9 ms ±  10.3 ms    [User: 30.0 ms, System: 12.3 ms]
  Range (min … max):    20.4 ms …  58.3 ms    63 runs
$ hyperfine --warmup 50 'git diff | delta > /dev/null'
Benchmark 1: git diff | delta > /dev/null
  Time (mean ± σ):     178.8 ms ±  10.8 ms    [User: 193.1 ms, System: 41.3 ms]
  Range (min … max):   160.3 ms … 207.5 ms    17 runs

Finally, I also tried VSCode, and I get similar timings, which is probably not a surprise since the output goes to /dev/null.

Now let's try an empty diff in a Terminal:

$ time git diff

real    0m0.012s
user    0m0.001s
sys     0m0.012s
$ time git diff | ./mydelta

real    0m0.013s
user    0m0.011s
sys     0m0.009s
$ time git diff | delta

real    0m0.023s
user    0m0.019s
sys     0m0.005s

and VSCode:

$ time git diff 

real    0m0.017s
user    0m0.022s
sys     0m0.000s
$ time git diff | ./mydelta 

real    0m0.016s
user    0m0.018s
sys     0m0.001s
$ time git diff | delta

real    0m0.067s
user    0m0.022s
sys     0m0.006s

Here delta is just slower even in a Terminal, and it is too slow in VSCode, for whatever reason. mydelta doesn't have any noticeable overhead, since the timings are a bit noisy.

All of the above is reproducible on my machine, I ran it many times.

From this, we can draw some conclusions:

  • It is possible to run a 3rd party program (mydelta) that has less than 2ms overhead even on large diffs, it prints colors and it works in a Terminal and VSCode. It is fast even when showing the diff to the terminal, both Terminal and VSCode.
  • The speed of mydelta is on the level of cat.
  • Both manual timing and hyperfine show similar results
  • delta is consistently slower: it takes more than 30ms in VSCode for any diff and in Terminal for larger diffs, so the lag is noticeable

What is causing it? I don't know. Let me ask some questions:

  • Is delta reading any files? For fastest performance it should have a mode that just reformats the diff, no reading of any files
  • Is it querying the terminal or system for some capabilities? I would turn it off

@dandavison
Owner

dandavison commented Nov 12, 2024

  • Delta queries the terminal for capabilities at startup; see @bash's post above. To disable it use dark = true or light = true.
  • Delta reads config files at startup; to disable this, use --no-gitconfig
  • Delta queries for calling processes in a child thread at startup, and then waits for the result later in certain code paths. The only way to disable this currently is to disable the code linked above: https://github.com/dandavison/delta/blob/main/src/utils/process.rs#L67-L78

I think that's all the I/O / potentially expensive syscalls done at startup -- @th1000s / @bash did I miss anything?
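
Put together, the first two opt-outs could be combined in a small wrapper for timing experiments. This is a sketch only: fast_delta is a made-up name, --dark assumes a dark terminal background (use --light otherwise), and --no-gitconfig also drops any delta settings kept in gitconfig. The calling-process query is not affected, since the thread says it has no switch yet.

```shell
# Hypothetical wrapper (the name is made up for this example).
# --dark skips automatic dark/light detection; --no-gitconfig skips reading
# gitconfig files, which also drops any delta settings stored there.
fast_delta() {
  delta --dark --no-gitconfig "$@"
}
```

Timing `git diff | fast_delta` against `git diff | delta` should show how much of the startup cost comes from these two steps on a given terminal.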

@jb55

jb55 commented Nov 18, 2024

I came here trying to figure out why any git action is taking like 1 second... this was why:

time delta

real	0m1.050s
user	0m0.029s
sys	0m0.027s

vs

time delta --dark

real	0m0.067s
user	0m0.038s
sys	0m0.044s

@dandavison
Owner

@jb55 what platform is that on?

@jb55

jb55 commented Nov 18, 2024

nixos

@dandavison
Owner

Thanks, and what's your terminal emulator (and delta --version)?

@jb55

jb55 commented Nov 18, 2024

delta 0.18.2
rxvt-unicode (urxvt) v9.31

@bash
Contributor

bash commented Nov 18, 2024

I came here trying to figure out why any git action is taking like 1 second... this was why:

The 1 second difference suggests to me that you're probably running into the timeout for dark/light detection. What terminal emulator are you running this on?

@th1000s
Collaborator

th1000s commented Nov 19, 2024

With #1910 it is now possible to measure which component exactly is slow (a process opt-out is also needed). But it seems terminals which are slow to respond are the main culprit (on all my systems it is plenty fast, however). Once delta runs into that more than once, the user could be notified, or delta could use a globally cached value.

Some example output:

$ git show
      delta timings (ms after start): tty setup: 2.3 ms, read configs: 6.0 ms, query processes: 26.1 ms, first paint: 10.1

$ git log -p
      delta timings (ms after start): tty setup: 3.7 ms, read configs: 7.9 ms, query processes: 23.2 ms, first paint: 11.2
      delta timings (ms after start): tty setup: 2.9 ms, read configs: 8.0 ms, query processes: 639.8 ms, first paint: 11.8
      # ^ parent process not requested until much later, this value is not when the query finishes


$ git blame
    delta timings (ms after start): tty setup: 2.9 ms, read configs: 8.3 ms, query processes: 13.2 ms, first paint: 12.6

@jb55

jb55 commented Nov 19, 2024

I'm using urxvtd (daemon) with urxvtc. I wonder if that has anything to do with it? I will try #1910

@jb55

jb55 commented Nov 19, 2024

delta timings (ms after start): tty setup: 1002.7 ms, read configs: 1009.4 ms, query processes: 0.0 ms, first paint: 0.0

@jb55

jb55 commented Nov 19, 2024

looks like this is a terminal_colorsaurus issue?

bumping colorsaurus to 0.4.5 didn't seem to change anything:

delta timings (ms after start): tty setup: 1005.8 ms, read configs: 1015.7 ms, query processes: 0.0 ms, first paint: 0.0

@bash
Contributor

bash commented Nov 19, 2024

rxvt-unicode (urxvt) v9.31

Aha :) terminal-colorsaurus has a quirk for urxvt because the currently released version doesn't properly terminate responses (http://cvs.schmorp.de/rxvt-unicode/src/command.C?revision=1.600&view=markup).

I use the TERM env var to detect urxvt. Do you overwrite the TERM env var by any chance? If so, running TERM=rxvt-unicode delta should also be considerably faster.
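
To illustrate why the value of TERM matters here, the check can be pictured as a match on the terminal name. This is a sketch of the idea only, not terminal-colorsaurus's actual code: is_urxvt is a made-up name, and the prefix match on "rxvt-unicode" is an assumption the thread does not confirm.

```shell
# Hypothetical check: treat any TERM starting with "rxvt-unicode" as urxvt.
# With no argument, the current $TERM is used.
is_urxvt() {
  case ${1:-${TERM:-}} in
    rxvt-unicode*) return 0 ;;
    *)             return 1 ;;
  esac
}
```

Under that assumption, TERM=rxvt-unicode and TERM=rxvt-unicode-256color would both be recognised, while a plain TERM=rxvt would not. The quirk would then be skipped and the query would run into the timeout.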

@jb55

jb55 commented Nov 19, 2024 via email

@bash
Contributor

bash commented Nov 19, 2024

Awesome! Strange that TERM was rxvt though—maybe I should add that to terminal-colorsaurus too 🤔

@jb55

jb55 commented Nov 19, 2024

looks like I had:

URxvt*termName: rxvt

in my ~/.Xresources

for whatever reason. Removing it defaults TERM to rxvt-unicode-256color

@bash
Contributor

bash commented Dec 27, 2024

The latest version of terminal-colorsaurus should now work in urxvt regardless of what TERM is set to.

@th1000s
Collaborator

th1000s commented Jan 12, 2025

I now had a chance to test vscode's xterm.js - and indeed, it is slow, taking 40ms to respond (often longer), vs. at worst 15ms for konsole connected to the same host via ssh.

Having a lag-free startup is important, so if the tty detection takes longer, the result should be cached at ~/.cache/delta/cache-$HOST.env (valid for ~2 weeks, or until a newer delta version is installed) - then print a notice once that a cache was created.

The less --version query -- unlikely to change much, and which can also take 20ms on a cold cache -- can also be moved there.
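
The caching idea can be sketched in shell to show the intended behaviour. This is an illustration under assumptions, not delta's implementation: cache_is_fresh and cached_probe are made-up names, the 14-day default stands in for "~2 weeks", and invalidation on a newer delta version is left out.

```shell
# Hypothetical helper: succeed if FILE exists and was modified less than
# MAX_DAYS days ago (default 14).
cache_is_fresh() {
  file=$1
  max_days=${2:-14}
  [ -f "$file" ] && [ -n "$(find "$file" -mtime "-$max_days" 2>/dev/null)" ]
}

# Hypothetical helper: print the cached value if it is fresh; otherwise run
# the slow probe command, store its output, and print a notice on stderr.
cached_probe() {
  cache=$1
  shift
  if cache_is_fresh "$cache"; then
    cat "$cache"
  else
    mkdir -p "$(dirname "$cache")"
    "$@" | tee "$cache"
    echo "notice: created cache $cache" >&2
  fi
}
```

A call such as `cached_probe "$HOME/.cache/delta/cache-$HOST.env" less --version` would pay the slow query once and read the file afterwards. The notice goes to stderr so it does not mix with the cached value.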
