-
Notifications
You must be signed in to change notification settings - Fork 17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor grid to measure cache-awareness impact on grid tracing #251
Conversation
519c6a3
to
5bf6d77
Compare
5bf6d77
to
94e583e
Compare
How do you suggest we go about this @glpuga? Are cache friendly storage layouts pluggable like other features in the library? |
I think the changes are worth merging, if only because the relation between the likelihood sensor model and the grid is a bit more clearcut and low-level accesses to the grid were removed. Performance-wise, there's no change. Benchmarks seem to indicate that you'd only see meaningful differences with maps much larger than the one we currently use to measure performance.
They are. To switch between them it's possible to change the storage type parameter of the occupany grid storage mixin. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Damn, that was a long PR 😅 Awesome work @glpuga!
One thought: we are focusing on the effects of data fetching on CPU caches. I wonder what is going on with code fetching. I would expect near zero vtable lookups helping us there.
beluga/include/beluga/sensor/data/cache_friendly_grid_storage.hpp
Outdated
Show resolved
Hide resolved
beluga/include/beluga/sensor/data/cache_friendly_grid_storage.hpp
Outdated
Show resolved
Hide resolved
beluga/include/beluga/sensor/data/cache_friendly_grid_storage.hpp
Outdated
Show resolved
Hide resolved
beluga/include/beluga/sensor/data/cache_friendly_grid_storage.hpp
Outdated
Show resolved
Hide resolved
template <typename... Args> | ||
explicit ValueGrid2Mixin(std::vector<T> data, std::size_t width, double resolution, Args&&... args) | ||
: Mixin(std::forward<Args>(args)...), | ||
data_(std::move(data)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glpuga meta: shouldn't this grid take a storage as well?
const auto xi = static_cast<int>(raster_index % width); | ||
const auto yi = static_cast<int>(raster_index / width); | ||
const auto cell = Eigen::Vector2i(xi, yi); | ||
return std::make_tuple(cell, this->self().data_at(cell).value()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glpuga meta: this assumes the underlying grid is at least dense (by rasterizing w/o checking for data availability). FWIW one of the reasons why at the time I refrained from using mixins for the grid hierarchy are these dependencies. They aren't really independent (although I like that we split dense and regular, that coupling was unnecessary).
ROSOccupancyGrid::MapStorage grid_storage(ros_msg_ptr->info.width, ros_msg_ptr->info.height); | ||
for (std::size_t x = 0; x < ros_msg_ptr->info.width; ++x) { | ||
for (std::size_t y = 0; y < ros_msg_ptr->info.height; ++y) { | ||
const auto index = x + y * ros_msg_ptr->info.width; | ||
grid_storage.cell(static_cast<int>(x), static_cast<int>(y)) = ros_msg_ptr->data[index]; | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glpuga meta: one thing I don't love about this is that now we unconditionally copy the entire grid. That's OK if grids never update and performance improves. It's not that cool for large grids that change frequently or if there is no cache to leverage and memory is scarce (thinking of microcontrollers). Having a storage that maps an existing memory layout to our grids APIs would be nice. FWIW that's another reason why the grid hierarchy is mostly behavior with no state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see the point, and nothing prevents us from creating a storage that just maps accesses to an existing buffer like we did before if needed, except lack of generality in the code above to leave initialization out of the grid storage scope to isolate it from ROS types.
I'm not too concerned about that kind of systems constrains (low memory + lack of cache) in the beluga_amcl
code above, though; any system that would finds itself under those constraints is unlikely to run ROS in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any system that would finds itself under those constraints is unlikely to run ROS in the first place.
Definitely. As long as we don't propagate the constraint to the core library, I'm good :)
Thanks for the review @hidmic I'll try to address your comments soon. |
94e583e
to
dcceed6
Compare
…rithms Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
…ccupancyGrid Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
Signed-off-by: Gerardo Puga <[email protected]>
06cc1fd
to
b377baa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great! CI seems somewhat unhappy though.
// pre-initialize the grid with default values | ||
for (std::size_t index = 0; index < buffer_size_; ++index) { | ||
storage_[index] = T{}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glpuga nit: this shouldn't be necessary, see new expression for array types (in particular, the section on construction). If in doubt, we can always tuck a {}
at the end ie. new T[buffer_size_]{}
. Same elsewhere.
// G G G G H H H H I I ... | ||
// J J J J K K K K L L ... | ||
|
||
if constexpr (true || !is_a_square_power_of_two(LineLength / sizeof(T))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@glpuga hmm, that true ||
seems like a debugging leftover.
@glpuga why close this? It was a good addition. |
Maybe, but it was not adding new functional features, and it's been stalled for months. Almost every file in the MR was in conflict with the current main due to files moving to new locations and other changes, so the amount of refactor needed just to bring this back is considerable. I don't have the time to revive this right now, and having an 8 months old PR pinned in the repo is not really good optics so I'd rather close it. |
Proposed changes
Note: For review of the changes it's probably better to do a first walk-around commit by commit before looking at the changes as a whole.
Type of change
Checklist
Put an
x
in the boxes that apply. This is simply a reminder of what we will require before merging your code.Additional comments
The summary after the usefulness of cache friendly when raytracing is that it does a lot of difference, but only for long rays (over 100m). For shorter rays, the ratio between the linear storage and the best cache-aware storage inverts.
These are the measurements taken by: