Render on demand #2336
Compile Times benchmark
Note that these numbers may fluctuate on the CI servers, so take them with a grain of salt. All benchmark results are based on the mean time, and a negative percentage means faster than the base branch. Note that GLMakie and WGLMakie run on an emulated GPU, so the runtime benchmark is much slower. Results are from running:
using_time = @ctime using Backend
# Compile time
create_time = @ctime fig = scatter(1:4; color=1:4, colormap=:turbo, markersize=20, visible=true)
display_time = @ctime Makie.colorbuffer(display(fig))
# Runtime
create_time = @benchmark fig = scatter(1:4; color=1:4, colormap=:turbo, markersize=20, visible=true)
display_time = @benchmark Makie.colorbuffer(display(fig))
Cool, that might just work without too much effort :) Let me know if you have any questions, I could try to take a look as well...
Texture, TextureBuffer, GLBuffer and VertexArray now track when they are updated and forward that to RenderObject. Those are all the GPUArray subtypes other than GPUVector, which seems unused. I think with that the PR is basically done already. I also went through every plot primitive to check whether each uniform triggers updates. I skipped the ones which I believe to be constant, irrelevant or inaccessible.
List of ignored attributes
General
scatter
Lines, Linesegments
mesh
surface
text
heatmap/image
volume
meshscatter
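The forwarding described above, where buffers report their own updates and the render object collects them, could look roughly like the sketch below. It is plain Julia with no Makie or Observables dependency; `ToyBuffer`, `ToyRenderObject`, `connect!` and `toy_update!` are hypothetical names made up for illustration, not GLMakie types or functions.

```julia
# Hypothetical stand-in for a GPUArray subtype (Texture, GLBuffer, ...).
mutable struct ToyBuffer
    data::Vector{Float32}
    listeners::Vector{Function}   # called after every update
end
ToyBuffer(data) = ToyBuffer(convert(Vector{Float32}, data), Function[])

# Hypothetical stand-in for RenderObject with a plain Bool flag.
mutable struct ToyRenderObject
    buffers::Vector{ToyBuffer}
    requires_update::Bool
end

# Register a callback on each buffer that marks the render object as dirty.
function connect!(robj::ToyRenderObject)
    for buffer in robj.buffers
        push!(buffer.listeners, () -> (robj.requires_update = true))
    end
    return robj
end

# Write new data, then notify whoever is listening.
function toy_update!(buffer::ToyBuffer, new_data)
    buffer.data = convert(Vector{Float32}, new_data)
    foreach(f -> f(), buffer.listeners)
    return buffer
end

toy_buffer = ToyBuffer([1, 2, 3])
toy_robj = connect!(ToyRenderObject([toy_buffer], false))
toy_update!(toy_buffer, [4, 5, 6])   # sets toy_robj.requires_update
```

The buffer never needs to know which render object owns it; it only calls its listeners, which keeps the tracking optional.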
I guess another thing to think about with this is whether all the |
Should help with #678 (on any OS), maybe also #267 and #1683. Should also help with idle GPU utilization in #1427, but I don't think it'll help with #1412. Whether this fixes any of those issues is kind of subjective I guess. This PR should make raw rendering a little bit slower through the update tracking, but cuts idle GPU usage down to 0. So in an interactive or animated situation things shouldn't be any better, but otherwise they should be a lot better. For example
I did a bit of benchmarking with
using BenchmarkTools, GLMakie
scene = Scene()
for _ in 1:100
    scatter!(scene, rand(10), marker = Rect)
end
display(scene, render_on_demand = false)
# Touch several attributes of every plot so that each one triggers its update path.
function update!(scene)
    for p in scene.plots
        p.ssao[] = !p.ssao[]
        p.overdraw[] = !p.overdraw[]
        p.color[] = :red
        p.markersize[] = 10f0
        p.marker[] = Rect
    end
end
@benchmark update!($scene)
That looks to be 3-8% slower with this PR (min, mean, median time going from (1.38, 1.96, 1.94) ms to (1.43, 2.07, 2.03) ms).
I added some code to disable update tracking when the other render loops are used. (Those can't be changed dynamically, right?) I also noticed that I wasn't disabling |
# This is a lazy workaround for disabling updates of `requires_update` when
# not rendering on demand. A cleaner implementation should probably go
# through @gen_defaults! and adjust constructors instead.
track_updates = to_value(pop!(data, :track_updates, true))
@gen_defaults is pretty much deprecated (without a new implementation yet, but I think it's pretty clear that it's pretty redundant), so I guess that's fair ;)
The way it works atm is that GPUArrays always attach the update trackers when they get created. When the renderobject is created it either removes the tracking (i.e. observer functions) or connects the remaining uniforms based on what would usually be interpreted as a uniform. Seems quite hacky to me 🤷
A clean implementation wouldn't connect (and maybe also not define) the tracking observables in GPUArrays. I think with how things are organized atm @gen_defaults would need to be adjusted to forward that information. But I don't really want to mess with that macro. I'd probably just break it
I meant, @gen_defaults needs to be refactored / removed going forward, so whenever we do that, we can clean up the implementation of track_updates ;)
In other words: please don't waste a second on @gen_defaults ;)
Ok I think the pr is ready then
@@ -193,8 +193,10 @@ max_dim(t) = error("max_dim not implemented for: $(typeof(t)). This happen
function (::Type{T})(x::Observable; kw...) where T <: GPUArray
    gpu_mem = T(x[]; kw...)
    on(x-> update!(gpu_mem, x), x)
    gpu_mem
    # TODO merge these and handle update tracking during contruction
Why do we have this todo, and not just:
on(data) do new_data
    update!(gpu_mem, new_data)
    gpu_mem.requires_update[] = true
end
(I already have this locally, will push some changes like this later)
I separated them so the update tracking could be removed again when using fps_renderloop
@@ -289,11 +297,14 @@ mutable struct RenderObject{Pre}
    prerenderfunction::Pre
    postrenderfunction
    id::UInt32
    requires_update::Bool
I think we should consistently use the same type for requires_update, and I don't think we need an observable, so maybe use Base.RefValue consistently instead?
Ah, I see, some do need to be an observable... So maybe use Observable consistently?
For GPUArray we either need an observable to forward the information to the renderobject or we need to search through a bunch of stuff every potential frame. Or we need to make their updates aware of the renderobject, but that might be a lot of work?
Otherwise I went with Bool since the structs are already mutable. I think that's a bit faster than a RefValue in a mutable struct too...
end
on(_ -> robj.requires_update = true, vertexarray.requires_update)
else
# remove tracking from GPUArrays
not sure why that's needed, could you elaborate the comment?
ugh, classic case of not seeing the whole picture in the github diff... sorry for the noise
Hm, it feels cleaner to me, to just let the GPUArrays track their updates, which should be very cheap, but then not connect them further if we don't track updates...
on(_ -> robj.requires_update = true, vertexarray.requires_update)
This? If all the buffers are also in uniforms it shouldn't be necessary. I added that early on when I was searching for things that need to be tracked
# remove tracking from GPUArrays
The way I added tracking, it gets initialized for every renderloop. That can be a lot of extra useless callbacks and from my testing it can be noticeable (like 5-10% slower with fps_renderloop in what should be a bad case). I tried to restore performance again by removing tracking when it's not needed. That's what the code under the comment does.
I think I actually removed all the tracking in one go so I'm not sure how much the GPUArrays contribute. The test case I used also didn't trigger updates there, but in simple observables since that should have a higher ratio of tracking cost / total update cost and I wanted to know how bad it can get.
I was planning to test having GPUArray updates wait for frames in the future, which would probably require something like requires_update too. So maybe this will end up as always on/tracking later anyway...
using BenchmarkTools, GLMakie
scene = Scene()
for _ in 1:100
    heatmap!(scene, rand(16, 16))
end
display(GLMakie.Screen(render_on_demand = false), scene)
function update!(scene)
    for p in scene.plots
        notify(p[3])
    end
end
@benchmark update!($scene)
With cleanup
BenchmarkTools.Trial: 9326 samples with 1 evaluation.
Range (min … max): 381.502 μs … 32.987 ms ┊ GC (min … max): 0.00% … 98.11%
Time (median): 502.491 μs ┊ GC (median): 0.00%
Time (mean ± σ): 532.654 μs ± 812.194 μs ┊ GC (mean ± σ): 3.84% ± 2.49%
▁ ▃▄▇█▄▃▁▁▁▂▁▁
▁▁▅▄█▅▄▄▄▃▄▄▄▄▄▆█████████████▇▇▇▆▇▆▆▅▅▅▅▅▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▁▂▁▁▁ ▄
382 μs Histogram: frequency by time 694 μs <
Memory estimate: 121.88 KiB, allocs estimate: 900.
Without cleanup (the else branch commented out)
BenchmarkTools.Trial: 4238 samples with 1 evaluation.
Range (min … max): 803.314 μs … 34.882 ms ┊ GC (min … max): 0.00% … 96.12%
Time (median): 1.153 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.174 ms ± 745.947 μs ┊ GC (mean ± σ): 1.33% ± 2.08%
▂▃▅▅▅█▃▃▃▄▃▃▃▃▄▄▄▃▃▄▃▃▇▃▅▇▄▃▄▄▃▃▁▂
▁▄▄▆▄▅▅▃███████████████████████████████████▆▇▆▅▅▄▄▃▄▄▃▂▂▃▃▂▂▂ ▅
803 μs Histogram: frequency by time 1.6 ms <
Memory estimate: 128.12 KiB, allocs estimate: 1300.
Without cleanup & single observer function
BenchmarkTools.Trial: 4747 samples with 1 evaluation.
Range (min … max): 757.975 μs … 31.849 ms ┊ GC (min … max): 0.00% … 96.00%
Time (median): 1.006 ms ┊ GC (median): 0.00%
Time (mean ± σ): 1.048 ms ± 783.416 μs ┊ GC (mean ± σ): 1.84% ± 2.42%
▃▄▇▅▅█▅▅▅▂▃ ▃▁▁▁▁ ▁▁▂▁▂▁▁ ▁
▅▃▆▅▄▆▅▆████████████████████████████████▆▆▆▄▄▅▆▄▄▄▃▃▃▃▂▂▂▂▁▂▂ ▅
758 μs Histogram: frequency by time 1.43 ms <
Memory estimate: 128.12 KiB, allocs estimate: 1300.
That's much more than I thought it would be...
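The cleanup being benchmarked above amounts to attaching tracking callbacks first and detaching them again when they are not needed. A minimal sketch of that pattern follows, assuming a hand-rolled signal type rather than Observables.jl; `ToySignal`, `attach!`, `detach!` and `set!` are hypothetical names, and the `track_updates` variable only mirrors the keyword from the diff.

```julia
# Hypothetical minimal observable: a value plus a list of callbacks.
mutable struct ToySignal{T}
    value::T
    callbacks::Vector{Function}
end
ToySignal(value::T) where {T} = ToySignal{T}(value, Function[])

# Attach a callback and return it so it can be detached later.
function attach!(f::Function, signal::ToySignal)
    push!(signal.callbacks, f)
    return f
end

# Detach a previously attached callback (compared by identity).
detach!(signal::ToySignal, f::Function) = filter!(g -> g !== f, signal.callbacks)

# Store a new value and run every remaining callback.
function set!(signal::ToySignal, value)
    signal.value = value
    foreach(g -> g(value), signal.callbacks)
    return signal
end

toy_dirty = Ref(false)
toy_signal = ToySignal(1.0)
toy_tracker = attach!(_ -> (toy_dirty[] = true), toy_signal)

# With a fixed-fps loop every frame is rendered anyway, so the tracker is dropped.
track_updates = false
track_updates || detach!(toy_signal, toy_tracker)

set!(toy_signal, 2.0)   # no tracking callback runs any more
```

Detaching trades a one-time setup cost for fewer callbacks per update, which is the cost the benchmark numbers above are measuring.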
GLMakie/src/screen.jl
Outdated
function requires_update(screen::Screen)
    for (_, _, robj) in screen.renderlist
        visible = Bool(to_value(get(robj.uniforms, :visible, true)))
I think we may want to use this PR to change the visible field into a well typed, mandatory field for RenderObject.
Sounds good.
Not sure if I did what you had in mind but I moved it to a field.
I also had the logic for re-rendering with visible wrong. I tried to set it up in a way that invisible renderobjects can't trigger updates, but the way I did it caused p.visible[] = false to not trigger a re-render. I fixed that by moving the check to the renderobject updater functions.
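One way to read the fix described above: the visibility check lives in the per-attribute updaters, while a change of visibility itself always requests a frame. The sketch below illustrates that reading only; `ToyVisibleObject`, `set_visible!`, `mark_changed!` and `needs_frame` are hypothetical names, and the actual GLMakie code may be organized differently.

```julia
mutable struct ToyVisibleObject
    visible::Bool
    requires_update::Bool
end

# Changing visibility must always request a frame, in both directions,
# otherwise hiding a plot would leave its last image on screen.
function set_visible!(obj::ToyVisibleObject, visible::Bool)
    obj.visible = visible
    obj.requires_update = true
    return obj
end

# Any other attribute change only matters while the object is visible.
function mark_changed!(obj::ToyVisibleObject)
    obj.visible && (obj.requires_update = true)
    return obj
end

# The render loop only has to look at the flags.
needs_frame(objs) = any(o -> o.requires_update, objs)

toy_plot = ToyVisibleObject(true, false)
set_visible!(toy_plot, false)
hide_requests_frame = needs_frame([toy_plot])      # hiding asks for a re-render

toy_plot.requires_update = false                   # pretend the frame was drawn
mark_changed!(toy_plot)
hidden_change_requests_frame = needs_frame([toy_plot])
```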
I reran the benchmarks I posted before and now things are more equal. Maybe the things I have running in the background affect this, not sure. To summarize:
So in a workflow with lots of GPUArray updates, the PR now does slightly better than master with
Merged in #2397
I did some tests and plotting a 3D volume |
Description
This is an attempt at making a "render on demand" mode for GLMakie. The idea is to have all renderobject input observables update a flag requires_update which is checked in the renderloop. Rendering only takes place if this flag is true for at least one (visible) renderobject.
The basic idea already works, but it's currently not catching all updates. For example, changes to a mesh aren't tracked. I guess the main task will be to find all of those.
Type of change
Checklist
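The render-on-demand idea from the description, a loop that draws only when some visible object has set its flag, can be sketched in a few lines of plain Julia. `ToyDrawable`, `render_if_needed!` and `draw_frame` are hypothetical names, and the loop of five ticks stands in for a real render loop.

```julia
mutable struct ToyDrawable
    visible::Bool
    requires_update::Bool
end

# Draw only if at least one visible object asked for it.
# Returns true when a frame was actually rendered.
function render_if_needed!(draw::Function, objs::Vector{ToyDrawable})
    any(o -> o.visible && o.requires_update, objs) || return false
    draw()
    # Reset all flags once the frame is on screen.
    foreach(o -> (o.requires_update = false), objs)
    return true
end

toy_frames = Ref(0)
draw_frame() = (toy_frames[] += 1)
toy_objs = [ToyDrawable(true, false), ToyDrawable(false, false)]

for tick in 1:5
    # Only tick 3 changes something, so only one frame gets drawn.
    tick == 3 && (toy_objs[1].requires_update = true)
    render_if_needed!(draw_frame, toy_objs)
end
```

On idle ticks the loop does no drawing work at all, which is where the reduced idle GPU usage would come from.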