-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix!: algorithm for oneOfWeighted has better distribution #47
base: master
Are you sure you want to change the base?
Conversation
oneOfWeighted.js
Outdated
if (flip(id, p)) { | ||
return resolve(id, sample[1]) | ||
var id = hash(input, 'oneOfWeighted') | ||
const prob = (id % 1_000_000) / 1_000_000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hash()
gives us back a 52-bit int, so I think we might be truncating the range with the / 1_000_000
, which could affect the quality of the results. Could we do id / Number.MAX_SAFE_INTEGER
here instead?
edit Turns out I was wrong here, reply below this one has more details.
const prob = (id % 1_000_000) / 1_000_000 | |
var prob = id / (2 ** 52) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out my suggestion fails the tests (the diff threshold is greater than 6.5%), whereas yours doesn't. I can't understand why. So lets go for your change - scratch my suggestion :) Can you help me understand why 1M works better?
Only suggested change then is to use es5 (context here).
const prob = (id % 1_000_000) / 1_000_000 | |
var prob = (id % 1000000) / 1000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried today again many different settings and don't understand it at all. I figured in the end it doesn't really matter because only the probabilities the user defined in the array are important. It's unlikely they would define values like 0.000004
and want to run the function more than a million times. Right now, somehow 1 million gave the best results with many different things I tried.
On the other hand: If we break backwards compatibility we should do it right the first time. So I'll take some more time to look into alternatives.
oneOfWeighted.js
Outdated
let cumulative = 0; | ||
for (const [probability, value] of samples) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When I created this project 5 years ago, I had the brainwave of writing it only in es5 to avoid a build step. I'm not sure if that decision was the right one, but unfortunately it means for now we need to write the code as es5 for linting to pass. I'm sorry about this, I realise it is absurd to ask someone to write es5 in 2025 :)
let cumulative = 0; | |
for (const [probability, value] of samples) { | |
var cumulative = 0; | |
for (var i = 0; i < samples.length; i++) { | |
var probability = samples[i][0]; | |
var value = samples[i][1]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alright. happy to comply.
tests/oneOfWeighted.test.js
Outdated
@@ -29,6 +30,128 @@ test(`averages to within ${ | |||
t.assert(diffBetween(sums.blue / n, 0.3) <= DIFF_THRESHOLD) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.assert(diffBetween(sums.blue / n, 0.3) <= DIFF_THRESHOLD) | |
t.assert(diffBetween(sums.green / n, 0.2) <= DIFF_THRESHOLD) | |
t.assert(diffBetween(sums.blue / n, 0.1) <= DIFF_THRESHOLD) | |
// maybe yellow and fuchsia here too? |
Looks great :) Just a few minor comments. |
@ctholho I should really add contribution guidelines and CI for this project. For now, can you do these to run checks for your changes and update the docs and tests?
|
Co-authored-by: Justin van der Merwe <[email protected]>
fixes #46