Addresses #719
Brings the Itertools `combinations` method.

Also brings the `array_combinations` method. This one is not yet part of the public API of Itertools, but it is already available on their master branch. Since using arrays avoids unnecessary allocations, I found it convenient to add it now, given Rayon's focus on performance. It also allows code to be shared between both methods.

I have chosen not to implement the `tuple_combinations` method from Itertools at this time, since the `tuple_*` functions apparently exist to offer array-like functionality from before const generics were available. There are proposals to migrate to `array_*` methods.

While Itertools can encapsulate all the combinations logic within the `increment_indices` method, in Rayon this is a bit trickier, since a kind of random access is required. Thus, this implementation uses `unrank`, which computes the necessary indices for an arbitrary offset.

Itertools uses a buffered list to collect the previous iterator lazily, so that it can address the arbitrary indices needed to seed the different combinations. This has been mentioned in #719. I haven't done something similar here, as I don't see a way to collect a `ParallelIterator` lazily, so the proposed implementation just collects the previous iterator into an `Arc<[T]>`. This is still better than the partial solution found in the issue, since there is no need to store all possible combinations in memory, just the base iterator which seeds them. Also, the combinations themselves are computed in parallel.

Handling overflows
Due to the factorial nature of combinations, overflows can occur easily. Both Itertools and this PR use a checked version of the binomial function to detect such overflows. I have decided to panic in this case, since doing more than `usize::MAX` iterations on any 64-bit machine is probably just too much. Also, the code will panic if the length of the previous iterator is smaller than `k` (the length of each combination), as this is probably a bug in the caller, and I find returning a `Result` or an `Option` with the iterator too inconvenient. Feel free to comment on this.

Extensive testing
This PR includes an extensive set of tests, as I found it particularly challenging to get algorithms of this kind correct on the first try. Feel free to suggest pruning any unnecessary tests. The tests are relatively fast and do not significantly impact the overall test suite runtime though.
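
For reviewers who want a concrete picture of the random-access idea and the overflow check described above, here is a minimal standalone sketch. It is not the code in this PR: the signatures of `checked_binomial` and `unrank`, the helper `combination_at`, and the choice of lexicographic ordering are all assumptions made for illustration, and it uses only the standard library rather than Rayon.

```rust
use std::sync::Arc;

/// Binomial coefficient C(n, k), or `None` if an intermediate product
/// overflows `usize`. Illustrative only: this version is conservative and
/// may report overflow slightly before the final value stops fitting.
fn checked_binomial(n: usize, k: usize) -> Option<usize> {
    if k > n {
        return Some(0);
    }
    // C(n, k) == C(n, n - k); iterate over the smaller of the two.
    let k = k.min(n - k);
    let mut result: usize = 1;
    for i in 0..k {
        // Before this step result == C(n, i); afterwards result == C(n, i + 1).
        // The division is exact because C(n, i) * (n - i) == C(n, i + 1) * (i + 1).
        result = result.checked_mul(n - i)? / (i + 1);
    }
    Some(result)
}

/// Indices of the combination at position `rank`, assuming k-combinations
/// of `0..n` are enumerated in lexicographic order.
fn unrank(n: usize, k: usize, mut rank: usize) -> Vec<usize> {
    let total = checked_binomial(n, k).expect("binomial overflow");
    assert!(rank < total, "rank out of range");

    let mut indices = Vec::with_capacity(k);
    let mut candidate = 0;
    for remaining in (1..=k).rev() {
        loop {
            // Number of combinations that place `candidate` at this position.
            let count = checked_binomial(n - candidate - 1, remaining - 1)
                .expect("binomial overflow");
            if rank < count {
                break;
            }
            // Skip every combination that uses `candidate` here.
            rank -= count;
            candidate += 1;
        }
        indices.push(candidate);
        candidate += 1;
    }
    indices
}

/// Hypothetical helper: build the `rank`-th combination from a shared buffer,
/// which is what lets independent workers start at arbitrary offsets.
fn combination_at<T: Clone>(items: &Arc<[T]>, k: usize, rank: usize) -> Vec<T> {
    unrank(items.len(), k, rank)
        .into_iter()
        .map(|i| items[i].clone())
        .collect()
}
```

Because each call to `combination_at` depends only on the shared `Arc<[T]>` and a rank, a range of ranks can be split at any point and each half processed independently, which is the property a parallel producer needs.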