ungron is slow and leaky #93

Open
csabahenk opened this issue Feb 2, 2022 · 7 comments

Comments

@csabahenk
Contributor

csabahenk commented Feb 2, 2022

I'm using the following Ruby script, named bintreefeed.rb, as a data generator:

#!/usr/bin/env ruby

# Emit a [path, value] line for each leaf of a complete
# binary tree of height n (gron's -j stream format).
Integer($*.shift).then { |n|
  (0...(1 << n)).each { |k|
    (0...n).map { |i|
      (k >> i) & 1     # bit i of k selects the branch at depth i
    }.then { |a|
      p([a, "x"])      # p renders this array as valid JSON
    }
  }
}

It takes a single numeric argument N and produces the gronned description of a complete binary tree of height N, in gron's JSON stream format (-j). Example:

$ ./bintreefeed.rb 3
[[0, 0, 0], "x"]
[[1, 0, 0], "x"]
[[0, 1, 0], "x"]
[[1, 1, 0], "x"]
[[0, 0, 1], "x"]
[[1, 0, 1], "x"]
[[0, 1, 1], "x"]
[[1, 1, 1], "x"]
$ ./bintreefeed.rb 3 | gron -u -j -m
[
  [
    [
      "x",
      "x"
    ],
    [
      "x",
      "x"
    ]
  ],
  [
    [
      "x",
      "x"
    ],
    [
      "x",
      "x"
    ]
  ]
]

Increasing N, we can observe that gron becomes extremely slow and bloated. E.g. the N=20 results on my machine: ./bintreefeed.rb 20 | gron -u -j -m > /dev/null takes 30 seconds and peaks at a 5 GB footprint. For comparison, I have a Ruby reimplementation of gron at https://github.com/csabahenk/gron_rb. Using its gron.rb script, ./bintreefeed.rb 20 | ./gron.rb -u > /dev/null takes 7 seconds and the footprint is around 300 MB. Checked also with N=21: gron runs 58 seconds with a 14 GB footprint, while gron.rb runs 22 seconds with a 480 MB footprint.
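
For context on why the footprint scales with the input: with -u, the whole tree has to be materialized in memory before anything can be printed. A minimal Go sketch of that shape (my illustration, not gron's actual code; it simplifies by treating array indices as object keys):

package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "os"
)

func main() {
    root := map[string]interface{}{}
    sc := bufio.NewScanner(os.Stdin)
    for sc.Scan() {
        // each -j input line is a [path, value] pair
        var stmt []interface{}
        if err := json.Unmarshal(sc.Bytes(), &stmt); err != nil || len(stmt) != 2 {
            continue
        }
        path, ok := stmt[0].([]interface{})
        if !ok || len(path) == 0 {
            continue
        }
        // walk/extend the tree along the path...
        cur := root
        for _, k := range path[:len(path)-1] {
            key := fmt.Sprint(k)
            next, ok := cur[key].(map[string]interface{})
            if !ok {
                next = map[string]interface{}{}
                cur[key] = next
            }
            cur = next
        }
        // ...and hang the value off the last path component
        cur[fmt.Sprint(path[len(path)-1])] = stmt[1]
    }
    // only here, after all input is consumed, can output begin
    out, _ := json.MarshalIndent(root, "", "  ")
    fmt.Println(string(out))
}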

Notes:

  • tested with current gron release (0.6.1) and master (6d4fe18)

  • a difference between gron and gron.rb: with -u, the former produces indented JSON, while the latter produces compact JSON. To get indented JSON with gron.rb, one can use the following one-liner (executed in the gron_rb repo):

    ruby -I. -rgron -rjson -e 'Gron::Ungron.new{|u| $<.each{|l| u << JSON.load(l)}}.tree.then{|t| JSON.pretty_generate t}.then{|s| puts s}'

    Using this, the runtime with N=20 increases to 10 seconds and the footprint to 380 MB, which is not a significant difference.

@rjp

rjp commented Feb 6, 2022

I'll have a look at this for my memory-efficient fork. It looks reasonably straightforward to fix (especially given the helpful gron.rb example, thanks!).

@rjp

rjp commented Feb 23, 2022

With N=21, I've got it down to 12s and 459M. Apart from frequent runtime.GC() calls (which obviously slow things down but do help: N=24 is 135s, 3.2G with a GC every 1M input lines; 116s, 3.6G without), I can't really see any obvious way to get the memory usage down any more.
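
For reference, the forced-GC pattern is roughly this (a simplified sketch, not the actual branch code; processLine is a hypothetical stand-in for the per-line ungron work):

package main

import (
    "bufio"
    "os"
    "runtime"
)

const gcEvery = 1_000_000 // force a collection every 1M input lines

// processLine stands in for parsing one gron statement and
// merging it into the in-memory tree.
func processLine(s string) { _ = s }

func main() {
    sc := bufio.NewScanner(os.Stdin)
    sc.Buffer(make([]byte, 0, 64*1024), 1024*1024) // tolerate long lines
    n := 0
    for sc.Scan() {
        processLine(sc.Text())
        n++
        if n%gcEvery == 0 {
            // trades CPU for peak footprint: reclaims parser garbage
            // before it piles up across millions of lines
            runtime.GC()
        }
    }
}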

@csabahenk
Contributor Author

csabahenk commented Mar 10, 2022

That sounds cool. Do you plan to merge these adjustments into the codebase? (If they're not yet suitable for prime time, then at least onto a topic branch?)

@rjp

rjp commented Mar 10, 2022

I'll try and get a branch up on my fork this week.

@rjp

rjp commented Mar 12, 2022

A couple of days' delay: I had an epiphany earlier which might bring the memory usage right down, if I can rearchitect the internals appropriately. It looks promising from today's quick testing...

@rjp

rjp commented Mar 22, 2022

Up on my fork, on the f/reduce-memory branch. Use gron -u -j -e to activate the low-memory variant.

N=21 is now at ~9s, 415M with a forced GC every 2M lines, or 405M with a GC every 1M lines (compared to ~11s, 360M for the Ruby).

N=24 comes in at ~94s, 3.3G with the 2M GC, or ~102s, 2.9G with the 1M GC (compared with ~110s, 1.7G for the Ruby).

Output was verified as correct by feeding each run through jq -S | openssl dgst -blake2b512 and comparing the hashes.

Falls off a CPU (but not memory) cliff with N=25. It's not the number of lines, because I can feed it 75M lines of a different file with no problem. I suspect the depth of the structure and the resultant cavalcade of pointers are causing the GC some internal discomfort. Investigations continue, but it's mostly good to go for normal workloads now.
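
One way to poke at that theory (a sketch under my assumptions, not a confirmed diagnosis): build a similarly pointer-dense tree and read the collector's counters afterwards. Running the real binary with GODEBUG=gctrace=1 gives the same information without any code.

package main

import (
    "fmt"
    "runtime"
    "time"
)

// node mirrors the map-of-interfaces shape an ungronned tree takes:
// every entry is a pointer the GC has to trace on every cycle.
type node map[string]interface{}

func build(depth int) interface{} {
    if depth == 0 {
        return "x"
    }
    return node{"0": build(depth - 1), "1": build(depth - 1)}
}

func main() {
    root := build(21) // 2^21 leaves, as in the N=21 runs above
    var m runtime.MemStats
    runtime.ReadMemStats(&m)
    fmt.Printf("heap=%d MB, gcCycles=%d, pauseTotal=%v\n",
        m.HeapAlloc>>20, m.NumGC, time.Duration(m.PauseTotalNs))
    runtime.KeepAlive(root)
}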

(All benchmarks run on a 2020 M1 Pro with arm64 binaries, Go 1.17)

@adamritter

Hi @csabahenk @rjp,

As an alternative, you can try my project, https://github.com/adamritter/fastgron (written in C++).

It doesn't support -j (the JSON-based output format), but I converted the output of ./bintreefeed.rb 21 to gron statement syntax (g.gson) and got these timings:

time gron -u g.gson > /dev/null
gron -u g.gson > /dev/null 51.35s user 46.38s system 157% cpu 1:01.91 total

time fastgron -u g.gson > /dev/null
fastgron -u g.gson > /dev/null 2.38s user 0.34s system 97% cpu 2.796 total

(for the gron direction, fastgron gets an even bigger speedup, as it uses a SIMD-optimized library for JSON parsing)
