⚡️ Speed up _make_forest_dict() by 77% in scanpy/neighbors/__init__.py #2971
📄 _make_forest_dict() in scanpy/neighbors/__init__.py
📈 Performance improved by 77% (0.77x faster)
⏱️ Runtime went down from 5670.84μs to 3195.82μs
Explanation and details
I have used numpy.array and numpy.concatenate for your sizes and dat objects, which are much faster than numpy.fromiter and slice-by-slice assignment respectively, especially when dealing with a large dataset. The sizes of the arrays in your data_list are computed only once and reused where needed. This improves runtime compared to the previous code, where data sizes were computed multiple times in different parts of the code.
Correctness verification
The new optimized code was tested for correctness. The results are listed below.
✅ 8 Passed − 🌀 Generated Regression Tests
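For readers who want to see the shape of the change and of an equivalence check, here is a minimal sketch. It is not the diff from this PR and not one of the generated regression tests. The function names make_forest_dict_loop and make_forest_dict_concat, the fake_tree helper, and the attribute names on the stand-in tree objects are assumptions made for illustration. The sketch contrasts the numpy.fromiter plus slice-assignment pattern with the numpy.array plus numpy.concatenate pattern described under "Explanation and details", then confirms both produce the same result.

```python
from types import SimpleNamespace

import numpy as np

# Assumed per-tree array attributes (hypothetical, for illustration only).
PROPS = ("hyperplanes", "offsets", "children", "indices")


def make_forest_dict_loop(forest):
    """Loop-based variant: np.fromiter for sizes, slice assignment for data."""
    d = {}
    for prop in PROPS:
        sizes = np.fromiter((getattr(t, prop).shape[0] for t in forest), dtype=int)
        first = getattr(forest[0], prop)
        dims = sizes.sum() if first.ndim == 1 else (sizes.sum(), first.shape[1])
        dat = np.empty(dims, dtype=first.dtype)
        starts = np.zeros_like(sizes)
        start = 0
        for i, size in enumerate(sizes):
            starts[i] = start
            dat[start:start + size] = getattr(forest[i], prop)
            start += size
        d[prop] = {"start": starts, "data": dat}
    return d


def make_forest_dict_concat(forest):
    """Vectorized variant: gather arrays once, then np.array and np.concatenate."""
    d = {}
    for prop in PROPS:
        arrays = [getattr(t, prop) for t in forest]  # fetched once per property
        sizes = np.array([a.shape[0] for a in arrays], dtype=int)
        starts = np.zeros_like(sizes)
        starts[1:] = np.cumsum(sizes[:-1])  # start offset of each tree's block
        d[prop] = {"start": starts, "data": np.concatenate(arrays)}
    return d


# Stand-in forest: three "trees" with different numbers of rows.
rng = np.random.default_rng(0)


def fake_tree(n):
    return SimpleNamespace(
        hyperplanes=rng.random((n, 5), dtype=np.float32),
        offsets=rng.random(n, dtype=np.float32),
        children=rng.integers(0, 10, size=(n, 2), dtype=np.int32),
        indices=rng.integers(0, 10, size=(n, 4), dtype=np.int32),
    )


forest = [fake_tree(n) for n in (3, 7, 4)]
old, new = make_forest_dict_loop(forest), make_forest_dict_concat(forest)
for prop in PROPS:
    assert np.array_equal(old[prop]["start"], new[prop]["start"])
    assert np.array_equal(old[prop]["data"], new[prop]["data"])
```

Both variants assume a non-empty forest whose trees share dtypes and trailing dimensions. Any actual speedup depends on the number and size of the trees, so it is worth timing on representative data rather than relying on this toy input.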
This optimization was discovered by Codeflash AI ⚡️