-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrong order after rdst #12
Comments
Thanks! I've grabbed the file and managed to replicate it. Still trying to work out the cause though. It takes a while, the issue appears right at the end of the dataset, and when taking a subset of the data it disappears 😅 It looks like it's possibly something to do with one of the layer skips in LsbSort, so unfortunately for data that big there isn't really a great replacement for that algorithm. You could limit the sort to just Regions sort and Ska sort for now, but it's a lot slower for that data. |
Ok thanks! BTW, any possibility of getting a stable version? 😏 |
Depends on if you mean a stable library or stable sort order 😅 . For the former, I've released 0.20.13 that includes a fix for the ordering issue here. It was an issue with the combination of two optimizations (level skip + alternating input vs. output arrays), where counting wasn't alternating input arrays. Counts were correct (same data), but the flag / check for if the data was "already sorted" wasn't. If you mean stable sort order, give
|
The second thing LOL. Thanks for the fix! However, in the end we realized unstable is fine. Any suggestion for optimizing the sort for a few billion pairs (usize, usize)? |
The most important thing would be to make the RadixKey impl. as fast as possible. If you can flip the order of the pair of usize, and if you're on exclusively little-endian machines, something like this is probably faster:
This function is used in some very hot loops, so a single conditional or divide can have a major impact. Beyond that, the standard tuner is still in the same ballpark as other options I tried when running it against that billion pairs dataset you provided. Occasionally a different tuning was a little faster / slower, but nothing consistently so. Depending on the machine you're running on, the included low_mem tuner could be faster though. Most of my runs were on a m1 macbook, which seems to have great memory performance (both bandwidth and latency). On a mid-range Ryzen it's a different story and memory can be a bit of a bottleneck, so the low_mem tuner wins out more often. If you test the standard tuner, low_mem tuner and also that MtLsb option in the comment above, then you'll have a pretty good snapshot of what options exist. More complex tunings probably won't have a great impact beyond those three options. |
Thanks! As is it it's two times faster than par_sort_unstable, so we're already very happy. We'll keep experimenting. Thanks for a great crate! |
We have a file containing a billion pairs of usize that rdst fails to sort properly. You can get the data at http://vigna.di.unimi.it/data.tsv.gz (you'll have to gunzip it), and the program to replicate the problem is
which prints
The text was updated successfully, but these errors were encountered: