Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve eachi for sorted_set #1456

Merged
merged 1 commit into from
Jan 10, 2025

Conversation

illusory0x0
Copy link
Contributor

@illusory0x0 illusory0x0 commented Jan 9, 2025

Benchmark Task [fast_eachi] Count = 10
----------------------------------------
Average Time: 0.04309862s/per test
Max Time Per Test: 0.0475998s/per test
Min Time Per Test: 0.0427086s/per test
----------------------------------------
Benchmark Task [eachi] Count = 10
----------------------------------------
Average Time: 0.0559544s/per test
Max Time Per Test: 0.0558127s/per test
Min Time Per Test: 0.0287929s/per test
----------------------------------------
Total tests: 31, passed: 31, failed: 0.
test "benchmark" {
  let criterion = @bm.Criterion::new()
  let test_data = new()
  let rand = @random.new()
  for i in 0..<100000 {
    test_data.add(rand.int())
  }
  criterion.add(
    @bm.Task::new("fast_eachi", fn() {
      let mut b = false
      for _ in 0..<20 {
        test_data.fast_eachi(fn(_, v) { b = b || v % 2 == 0 })
      }
    }),
  )
  criterion.add(
    @bm.Task::new("eachi", fn() {
      let mut b = false
      for _ in 0..<20 {
        test_data.eachi(fn(_, v) { b = b || v % 2 == 0 })
      }
    }),
  )
  let result = criterion.run()
  println(result["fast_eachi"].unwrap())
  println(result["eachi"].unwrap())
}

Copy link

peter-jerry-ye-code-review bot commented Jan 9, 2025

‼️ This code review is generated by a bot. Please verify the content before trusting it.

Here are a few observations and potential issues from the provided git diff output:

  1. Unused Variable s and p in eachi Function:

    • In the original code, the variables s (a stack) and p (a pointer to the current node) were used for an iterative traversal of the tree. However, in the modified code, these variables are no longer used because the traversal has been replaced with a recursive dfs function. This is not necessarily a bug, but it's worth noting that the unused variables could be removed to clean up the code.
  2. Potential Stack Overflow in Recursive dfs Function:

    • The modified code replaces the iterative traversal with a recursive dfs function. While this makes the code cleaner, it could lead to a stack overflow if the tree is very deep (e.g., a degenerate tree where each node has only one child). This is a potential issue, especially for large datasets. Consider using an iterative approach if the tree depth is a concern.
  3. Inefficient Index Increment in Recursive dfs Function:

    • The index i is incremented (i = i + 1) after processing the left subtree and before processing the right subtree. While this works, it might be more efficient to pass the index as a parameter to the dfs function instead of using a mutable variable. This would make the function more functional and avoid potential issues with mutable state.

These are the key observations from the diff. The changes seem to simplify the code, but they introduce potential risks related to recursion depth and mutable state.

@illusory0x0 illusory0x0 force-pushed the improve-eachi-sorted-set branch from 84ab135 to 5d98c8a Compare January 9, 2025 09:03
@peter-jerry-ye
Copy link
Collaborator

Potential Stack Overflow:

The modified code uses a recursive depth-first search (dfs) function to traverse the tree. While this approach is more concise, it could lead to a stack overflow for very large trees due to deep recursion. The original iterative approach using a stack (s) would be more memory-efficient and avoid this risk.

@hackwaly
Copy link
Contributor

hackwaly commented Jan 9, 2025

The depth never be too large since we are using balanced binary tree. I think dfs is acceptable.

Another question: Do we really need eachi as we have iter already.

@illusory0x0
Copy link
Contributor Author

self.size < 10000,use this impl, otherwise use original impl

The depth never be too large since we are using balanced binary tree. I think dfs is acceptable.

Another question: Do we really need eachi as we have iter already.

@coveralls
Copy link
Collaborator

coveralls commented Jan 9, 2025

Pull Request Test Coverage Report for Build 4688

Details

  • 3 of 3 (100.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.006%) to 82.991%

Totals Coverage Status
Change from base Build 4687: 0.006%
Covered Lines: 4762
Relevant Lines: 5738

💛 - Coveralls

@bobzhang bobzhang requested a review from hackwaly January 10, 2025 03:53
@bobzhang bobzhang enabled auto-merge (rebase) January 10, 2025 08:35
@bobzhang bobzhang force-pushed the improve-eachi-sorted-set branch from 5d98c8a to ac12c33 Compare January 10, 2025 08:35
@bobzhang bobzhang merged commit 96d1c11 into moonbitlang:main Jan 10, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants