Fast, error-aware, and thread-safe 1D/2D/3D histograms that are also compatible with StatsBase.Histogram
- 0.11
- Breaking change to the main constructor API, now the API is sanely divided into "constructor for empty histogram" and "make histogram given data". See documentation.
- 0.10
- due to huge latency issue, UnicodePlots.jl (for text based terminal display) has been dropped.
julia> using FHist
julia> a = randn(1000);
julia> h1 = Hist1D(a; binedges = -4:1.0:4)
edges: -4.0:1.0:4.0
bin counts: [0, 23, 123, 377, 320, 134, 21, 2]
total count: 1000
julia> h2 = Hist1D(; counttype = Int, binedges = -4.0:1.0:4.0);
julia> push!.(h2, a);
julia> Threads.@threads for v in a
atomic_push!(h2, v) # thread-safe
end
julia> h1+h1 == h2
true
Additionally, one can specify overflow=true
when creating a histogram to clamp out-of-bounds values into
the edge bins.
julia> Hist1D(rand(1000); binedges = 0:0.2:0.9, overflow=true)
edges: 0.0:0.2:0.8
bin counts: [218, 183, 198, 401]
total count: 1000
Single-threaded filling happens at ~1 GHz on modern CPU (Ryzen 7 7840HS)
julia> a = randn(10^6);
julia> @benchmark Hist1D(a; binedges = -3:0.01:3)
BenchmarkTools.Trial: 4624 samples with 1 evaluation.
Range (min … max): 1.049 ms … 2.035 ms ┊ GC (min … max): 0.00% … 0.00%
julia> using FHist, Statistics
julia> h1 = Hist1D(randn(10^4).+2; binedges = -5:0.5:5);
julia> h1 = (h1 + h1*2)
edges: -5.0:0.5:5.0
bin counts: [0, 0, 0, 0, 0, 3, 3, 39, 153, 540, 1356, 2640, 4551, 5757, 5727, 4494, 2652, 1326, 546, 168]
total count: 29955
julia> bincenters(h1)
-4.75:0.5:4.75
julia> println(bincounts(h1))
[0, 0, 0, 0, 0, 3, 3, 39, 153, 540, 1356, 2640, 4551, 5757, 5727, 4494, 2652, 1326, 546, 168]
julia> println(h1.sumw2);
[0.0, 0.0, 0.0, 0.0, 0.0, 5.0, 5.0, 65.0, 255.0, 900.0, 2260.0, 4400.0, 7585.0, 9595.0, 9545.0, 7490.0, 4420.0, 2210.0, 910.0, 280.0]
julia> nbins(h1), integral(h1)
(20, 29955)
julia> mean(h1), std(h1)
(1.993865798698047, 1.0126978885814524)
julia> median(h1), quantile(h1, 0.5)
(1.7445284002084418, 1.7445284002084418)
julia> lookup.(h1, [-1.5,0,2.5]) # find bin counts for given x-axis values
3-element Vector{Int64}:
39
1356
4494
julia> Hist1D(sample(h1; n=100))
edges: -1.0:1.0:5.0
bin counts: [4, 15, 41, 28, 10, 2]
total count: 100
julia> h1 |> normalize |> integral
1.0
julia> h2 = Hist2D((randn(10^4),randn(10^4)); binedges = (-5:5,-5:5))
edges: (-5:5, -5:5)
bin counts: [0 0 … 0 0; 0 0 … 0 0; … ; 0 0 … 0 0; 0 0 … 0 0]
total count: 10000
julia> nbins(h2)
(10, 10)
julia> project(h2, :x)
edges: -5:5
bin counts: [0, 14, 228, 1345, 3399, 3462, 1300, 237, 15, 0]
total count: 10000
julia> h2 |> profile(:x) # or pipe
edges: -5:5
bin counts: [0.0, -0.21428571428571427, 0.0, 0.03382899628252788, 0.0025007355104442485, 0.012709416522241479, 0.018461538461538463, 0.035864978902953586, 0.1, 0.0]
total count: -0.010920048606008592
julia> h2 |> rebin(10,5)
edges: (-5.0:10.0:5.0, -5.0:5.0:5.0)
bin counts: [4953 5047]
total count: 10000
julia> h3 = Hist3D((randn(10^4),randn(10^4),randn(10^4)); binedges = (-5:5,-5:5,-5:5))
Hist3D{Int64}, edges=(-5:5, -5:5, -5:5), integral=10000
julia> h3 |> project(:x) |> project(:x) |> std
1.051590599996025