Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent hashing question #25

Open
dliebner opened this issue Dec 1, 2021 · 5 comments
Open

Consistent hashing question #25

dliebner opened this issue Dec 1, 2021 · 5 comments

Comments

@dliebner
Copy link

dliebner commented Dec 1, 2021

This isn't an issue with the library, just a question about how consistent hashing works with weights. If I change the weight of a target by some small amount, let's say out of 10 targets each with a weight of 1, I change the weight of one target to 3. How significantly will the keys be redistributed?

@pda
Copy link
Owner

pda commented Dec 1, 2021

Here's the relevant code:

// hash the target into multiple positions
for ($i = 0; $i < round($this->replicas * $weight); ++$i) {
$position = $this->hasher->hash($target.$i);
$this->positionToTarget[$position] = $target; // lookup
$this->targetToPositions[$target] [] = $position; // target removal
}

Based on that, and some very hazy memory:

For a given weight w, that target is w times more likely to be picked compared to w = 1.
So in your example, the regular targets have 1 / 12 = 8.3% chance, and the boosted one has 3/12 = 25% chance.
(The denominator is 12 because 9 targets of weight=1 plus 1 target of weight=3).
You can use non-integer weight for more subtle distribution, like 1.5.

@dliebner
Copy link
Author

dliebner commented Dec 1, 2021

Thanks! So a higher weight => more distributions on the ring, as chosen by $target.$i, so the original placements will remain the same, only new ones will be added, "stealing" keys from other targets in a fairly distributed manner. Am I understanding that correctly?

@R-omk
Copy link

R-omk commented Dec 1, 2021

This library is not really about "Consistent hash", read more https://en.wikipedia.org/wiki/Rendezvous_hashing

@pda
Copy link
Owner

pda commented Dec 2, 2021

FYI in case this is helpful, this was the original inspiration / reference material for the library 14 years ago:

https://web.archive.org/web/20080331024845/http://www.spiteful.com/2008/03/17/programmers-toolbox-part-3-consistent-hashing/

This is the part that translated to $weight in flexihash:

You can tune the amount of load you send to each server based on that server’s capacity. Imagine this spatially – more points for a server means it covers more of the ring and is more likely to get more resources assigned to it.

Mostly I found the diagrams in that blog post really helpful in forming a mental model.

@dliebner
Copy link
Author

dliebner commented Dec 2, 2021

Thanks for that! I plan to implement a bounded-load layer on top of this once it becomes necessary. For anyone else out there, here are the resources I drew from when learning about consistent hashing:
https://www.toptal.com/big-data/consistent-hashing
https://medium.com/vimeo-engineering-blog/improving-load-balancing-with-a-new-consistent-hashing-algorithm-9f1bd75709ed
https://ai.googleblog.com/2017/04/consistent-hashing-with-bounded-loads.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants