Glebashnik/fix tensor adress hash code #32943

glebashnik · 2024-11-25T10:59:59Z

I confirm that this contribution is made under the terms of the license found in the root directory of this repository's source tree and that I have the authority necessary to make this contribution on behalf of its copyright owner.

Replaced custom hash code implementation with the one that is similar to a Java standart library.
The order of labels is reversed when generating hash codes for multi-label addresses.
For some reason this increased performance 8x for matrices with string labels.
Benchmarked with TensorFunctionBenchmark.java.

vekterli · 2024-11-25T16:14:21Z

Looks like a lot of time in the benchmark is spent resolving hash collisions in the binary tree fallback path in HashMap.

My guess is that hashing order impacting performance is due to how it all boils down to Long.hashCode(numeric) in the label, which is the identity function for all numbers using less than 32 bits (it just XOR-folds the top 32 bits on top of the low 32 bits). With Arrays.hashCode() the contribution of the final element hash to the output is a simple addition, so returning the identity of strictly increasing internal label IDs will yield hashes with very little difference.

Example:

Arrays.hashCode(new long[]{-2, -3, -4, -5, -6}) = 29615266
Arrays.hashCode(new long[]{-2, -3, -4, -5, -7}) = 29615267
Arrays.hashCode(new long[]{-2, -3, -4, -5, -8}) = 29615268

Flip the ordering around and you get much more dispersion due to earlier multiplies:

Arrays.hashCode(new long[]{-6, -5, -4, -3, -2}) = 33368866
Arrays.hashCode(new long[]{-7, -5, -4, -3, -2}) = 34292387
Arrays.hashCode(new long[]{-8, -5, -4, -3, -2}) = 35215908

This is very data-dependent—e.g. in the benchmark case the matrix changes most rapidly in the last dimension, but it could just as well have been the other way around.

Here's if I run the benchmark on this branch locally:

Indexed unbound vectors,                     time per join:    0.279 ms
Indexed unbound matrix,                      time per join:    0.423 ms
Indexed bound vectors,                       time per join:    0.267 ms
Indexed bound matrix,                        time per join:    0.467 ms
Mapped vectors,                              time per join:   12.158 ms
Mapped matrix,                               time per join:   18.290 ms
Mapped vectors with string labels,           time per join:   12.222 ms
Mapped matrix with string labels,            time per join:   17.490 ms
Indexed vectors, x space                     time per join:   38.040 ms
Indexed matrix, x space                      time per join:   18.380 ms
Mapped vectors, x space with string labels   time per join:   54.260 ms
Mapped matrix, x space with string labels    time per join:   92.710 ms

Running with the right-to-left hashCode evaluation order "reverted" to left-to-right:

Indexed unbound vectors,                     time per join:    0.293 ms
Indexed unbound matrix,                      time per join:    0.972 ms
Indexed bound vectors,                       time per join:    0.288 ms
Indexed bound matrix,                        time per join:    0.976 ms
Mapped vectors,                              time per join:   11.998 ms
Mapped matrix,                               time per join:  134.430 ms
Mapped vectors with string labels,           time per join:   14.824 ms
Mapped matrix with string labels,            time per join:  121.720 ms
Indexed vectors, x space                     time per join:   40.900 ms
Indexed matrix, x space                      time per join:   20.440 ms
Mapped vectors, x space with string labels   time per join:   59.540 ms
Mapped matrix, x space with string labels    time per join:  459.680 ms

High 🐌 factor as observed.

Let's sneakily change LabelImpl.hashCode() to mix bits around in a blender...

@Override
public int hashCode() {
    // return Long.hashCode(numeric);
    long k = numeric;
    // 64-bit avalanche finalization mix from MurmurHash3, with final fold down to 32 bits
    k ^= k >>> 33;
    k *= 0xff51afd7ed558ccdL;
    k ^= k >>> 33;
    k *= 0xc4ceb9fe1a85ec53L;
    k ^= k >>> 33;
    return Long.hashCode(k);
}

With this change and still with left-to-right eval:

Indexed unbound vectors,                     time per join:    0.293 ms
Indexed unbound matrix,                      time per join:    0.966 ms
Indexed bound vectors,                       time per join:    0.293 ms
Indexed bound matrix,                        time per join:    0.972 ms
Mapped vectors,                              time per join:    7.126 ms
Mapped matrix,                               time per join:   25.250 ms
Mapped vectors with string labels,           time per join:    7.496 ms
Mapped matrix with string labels,            time per join:   20.450 ms
Indexed vectors, x space                     time per join:   39.400 ms
Indexed matrix, x space                      time per join:   18.600 ms
Mapped vectors, x space with string labels   time per join:   62.380 ms
Mapped matrix, x space with string labels    time per join:   95.650 ms

With this change and right-to-left eval (i.e. the only diff from the branch is LabelImpl.hashCode:

Indexed unbound vectors,                     time per join:    0.291 ms
Indexed unbound matrix,                      time per join:    0.985 ms
Indexed bound vectors,                       time per join:    0.291 ms
Indexed bound matrix,                        time per join:    0.973 ms
Mapped vectors,                              time per join:    7.292 ms
Mapped matrix,                               time per join:   21.660 ms
Mapped vectors with string labels,           time per join:    8.214 ms
Mapped matrix with string labels,            time per join:   23.760 ms
Indexed vectors, x space                     time per join:   40.780 ms
Indexed matrix, x space                      time per join:   18.440 ms
Mapped vectors, x space with string labels   time per join:   61.510 ms
Mapped matrix, x space with string labels    time per join:   98.160 ms

Benchmark timings fluctuate a bit between runs, so minor differences shouldn't be read too much into.

Not likely a silver bullet (as with all things), but could be worth looking into as a more "generalized" solution.

vekterli

👍

vespajlib/src/main/java/com/yahoo/tensor/impl/LabelImpl.java

vespajlib/src/test/java/com/yahoo/tensor/TensorAddressTestCase.java

glebashnik assigned vekterli and unassigned vekterli Nov 25, 2024

glebashnik requested a review from vekterli November 25, 2024 11:25

vekterli previously approved these changes Nov 26, 2024

View reviewed changes

vespajlib/src/main/java/com/yahoo/tensor/impl/LabelImpl.java Outdated Show resolved Hide resolved

vespajlib/src/test/java/com/yahoo/tensor/TensorAddressTestCase.java Show resolved Hide resolved

glebashnik dismissed vekterli’s stale review via 28b381a November 26, 2024 13:17

vekterli approved these changes Nov 26, 2024

View reviewed changes

glebashnik added 7 commits November 26, 2024 14:46

Replaced hashing with a a more standard approach.

e8a4924

Increased lable cache stripes count and smallindexsize

48c30b8

removed unused imports

711ad00

wip

0a950bc

Replaced Label hashcode with murmur

f07cbbd

wip

01a46fc

added/fixed comments

d0aca6d

glebashnik force-pushed the glebashnik/fix-tensor-adress-hash-code branch from b2230ba to d0aca6d Compare November 26, 2024 13:46

glebashnik merged commit 9536e24 into master Nov 26, 2024
3 checks passed

glebashnik deleted the glebashnik/fix-tensor-adress-hash-code branch November 26, 2024 14:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Glebashnik/fix tensor adress hash code #32943

Glebashnik/fix tensor adress hash code #32943

glebashnik commented Nov 25, 2024 •

edited

Loading

vekterli commented Nov 25, 2024

vekterli left a comment

Glebashnik/fix tensor adress hash code #32943

Glebashnik/fix tensor adress hash code #32943

Conversation

glebashnik commented Nov 25, 2024 • edited Loading

vekterli commented Nov 25, 2024

vekterli left a comment

Choose a reason for hiding this comment

glebashnik commented Nov 25, 2024 •

edited

Loading