-
Notifications
You must be signed in to change notification settings - Fork 612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cache-friendly prime table #31993
Cache-friendly prime table #31993
Conversation
Note: I'd avoid using double underscores as well since they are supposed to be reserved by compilers. |
@@ -4,33 +4,32 @@ | |||
|
|||
namespace { | |||
|
|||
static const unsigned long __stl_prime_list[] = | |||
static const size_t __chunk_size{8}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Further modernize:
- Remove static, anonymous namespace has similar effect.
- Rename to avoid leading underscore.
- Use constexpr
}; | ||
|
||
static const unsigned long __upper_bound_list[] = {1543ul, 393241ul, 100663319ul, 4294967291ul}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Avoid duplicating constants. I guess you can initialize by refering to last elements in stl_prime_list. Like {stl_prime_list[0][chunk_size - 1], ...}
size_t | ||
hashtable_base::getModuloStl(size_t size) noexcept | ||
{ | ||
return getModulo(size, __stl_prime_list, sizeof(__stl_prime_list)/sizeof(__stl_prime_list[0])); | ||
for(size_t i = 0; i < __upper_bound_list_size; ++i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Standard is to use space after for, if, etc. Like
for ()
If this is a critical piece of code, it's much faster to find the relevant part of the list straight from the order of the msb: https://quick-bench.com/q/9W-F5Q5sCBJr96PXjsPq2a9PFGU, and then comparing the only two relevant elements. I have no idea how to write C++, though, so don't look too hard at the details :p Also, this is obviously only true while the list has the property that the primes are between consecutive powers of two (except for the first one, which is handled with the |
I have never observed this as consuming time. Perhaps @blacktea can add some info on how this was observed ? |
@blacktea Do you intend to complete this one anytime soon ? |
Implemented msb index lookup in #32168, closing |
I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.
This PR optimizes the getting modulo method. Based on the benchmarks https://quick-bench.com/q/-bPdO1GtFx1DQ6Ve5dZhaah_eso, the performance increased by up to 4 times.
Method:
The original
__stl_prime_list
table is split into arrays of 8 elements. It allows the cache line to accommodate these arrays decreasing cache misses.The
__upper_bound_list_size
array is used to find a nested array where the givennewSize
is located.Testing:
The method is checked here: https://godbolt.org/z/bd7rM71f6