JSON.parse(read(io, String)) is faster than JSON.parse(io) #339

Open
fonsp opened this issue Mar 8, 2022 · 1 comment

fonsp commented Mar 8, 2022

Based on my small test, it looks like JSON.parse(io) is nearly 2x slower (about 1.7x at the median) and makes about 1.7x as many allocations as first reading the IO into memory and then calling JSON.parse on the resulting String.

Benchmark

using BenchmarkTools
using JSON

# sample data
const sample = read(download("https://registry.npmjs.org/react"), String)

j1(io) = JSON.parse(read(io, String))
j2(io) = JSON.parse(io)

@benchmark let
	io = IOBuffer()
	write(io, $sample)
	seekstart(io)

	r = j1(io) # swap in j2 to benchmark JSON.parse(io)

	close(io)
	r
end

Results:

JSON.parse(read(io, String))

BenchmarkTools.Trial: 118 samples with 1 evaluation.
 Range (min … max):  38.245 ms … 60.927 ms  ┊ GC (min … max): 0.00% … 26.18%
 Time  (median):     39.169 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   42.542 ms ±  5.252 ms  ┊ GC (mean ± σ):  8.21% ±  9.87%

  ▅█▁                                                          
  ███▅▅▅▃▃▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▄▄▃▃▃▃▄▁▃▄▃▅▃▄▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▃▁▁▃ ▃
  38.2 ms         Histogram: frequency by time        56.1 ms <

 Memory estimate: 22.80 MiB, allocs estimate: 274058.

JSON.parse(io)

BenchmarkTools.Trial: 69 samples with 1 evaluation.
 Range (min … max):  66.057 ms … 85.004 ms  ┊ GC (min … max): 0.00% … 21.08%
 Time  (median):     67.862 ms              ┊ GC (median):    0.00%
 Time  (mean ± σ):   72.698 ms ±  6.664 ms  ┊ GC (mean ± σ):  8.17% ±  8.06%

  █▃▃                           ▁                              
  ████▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▄▆▄█▃▃▁▁▁▁▁▃▁▁▁▁▁▁▁▁▄▇▆▃▇▃▁▁▁▃▁▃ ▁
  66.1 ms         Histogram: frequency by time        84.5 ms <

 Memory estimate: 39.45 MiB, allocs estimate: 458500.

Based on this benchmark, it looks like an easy performance improvement to JSON.jl would be to change the parse(io::IO) method to read the stream into memory and use the String method, i.e.:

parse(io::IO; kwargs...) = parse(read(io, String); kwargs...)

quinnj commented Mar 10, 2022

One consideration is that we probably can't assume that calling read(io, String) is always safe; e.g. the input might be a multi-gigabyte file that would completely fill up memory. Or perhaps the io stream is only partially JSON and is followed by data in some other format, so it wouldn't make sense to read the entire IO and assume it is 100% JSON.
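
One way to get the faster String path while respecting the memory concern above is an opt-in wrapper on the caller's side that buffers at most a fixed number of bytes. The following is only a minimal sketch: `parse_buffered` and its `max_bytes` keyword are hypothetical names, not part of JSON.jl, and it assumes the caller already knows the stream contains nothing but a single JSON document.

```julia
using JSON  # JSON.jl, the package discussed in this issue

# Hypothetical helper, not part of JSON.jl: read the stream into memory with a
# hard cap, then hand the resulting String to JSON.parse.
function parse_buffered(io::IO; max_bytes::Integer = 64 * 1024^2, kwargs...)
    # read(io, n) returns at most n bytes, so memory use stays bounded even
    # if the stream turns out to be much larger than expected.
    bytes = read(io, max_bytes + 1)
    if length(bytes) > max_bytes
        throw(ArgumentError("input exceeds max_bytes = $max_bytes; use JSON.parse(io) instead"))
    end
    # Assumption: the whole stream is one JSON document. Any trailing non-JSON
    # data has already been consumed from io at this point.
    return JSON.parse(String(bytes); kwargs...)
end

# Usage: a small in-memory stream takes the buffered path.
result = parse_buffered(IOBuffer("{\"a\": [1, 2]}"))
```

Because the helper drains the stream up to the cap, it does not address the second concern: callers whose stream carries other data after the JSON value would still need the streaming JSON.parse(io) method.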
