Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DAPHNE-#77] Avoid stack overflow in long-running DaphneDSL loops. #820

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

pdamme
Copy link
Collaborator

@pdamme pdamme commented Sep 3, 2024

  • As described in DaphneDSL loops crash after a certain number of iterations #77, DaphneDSL loops with many iterations (e.g., >100k) crash DAPHNE.
  • The reason is that LowerToLLVMPass creates LLVM::AllocaOp in several places and for various purposes.
  • These AllocaOps could also be inside a loop, which means that the memory they allocate piles up until the stack is exhausted, because it is only freed at the end of the "scope" (function).
  • This commit solves the problem by ensuring that all AllocaOps created in LowerToLLVMPass are inserted at the beginning of the function surrounding the currently considered operation.
  • With that, there are no AllocaOps in loops anymore.
  • Note that the memory allocated by the AllocaOps can safely be reused by repeated kernel calls in different loop iterations.
  • Added script-level test cases employing loops with many iterations.
  • This bug has existed for a long time (partly because it wasn't a blocking issue until recently).
  • Closes DaphneDSL loops crash after a certain number of iterations #77.

I think, in the future, we could even reduce the number of AllocaOps by making different kernel calls use the same memory slots for their result pointers. However, the most important goal for now is to get this bug fixed on main.

- As described in #77, DaphneDSL loops with many iterations (e.g., >100k) crash DAPHNE.
- The reason is that LowerToLLVMPass creates LLVM::AllocaOp in several places and for various purposes.
- These AllocaOps could also be inside a loop, which means that the memory they allocate piles up until the stack is exhausted, because it is only freed at the end of the "scope" (function).
- This commit solves the problem by ensuring that all AllocaOps created in LowerToLLVMPass are inserted at the beginning of the function surrounding the currently considered operation.
- With that, there are no AllocaOps in loops anymore.
- Note that the memory allocated by the AllocaOps can safely be reused by repeated kernel calls in different loop iterations.
- Added script-level test cases employing loops with many iterations.
- This bug has existed for a long time (partly because it wasn't a blocking issue until recently).
  - Thanks to @corepointer and @philipportner for investigating the causes (this fix is based upon their insights).
  - Thanks to @J-Brest and @borkob for providing a work-around for DAPHNE users.
- Closes #77.
@pdamme pdamme added the bug A mistake in the code. label Sep 3, 2024
@corepointer
Copy link
Collaborator

Awesome @pdamme
Will test it right away 👍

@corepointer corepointer merged commit b294230 into main Sep 5, 2024
2 checks passed
@corepointer
Copy link
Collaborator

I've been playing around with various memory profilers to test the issue and the solution. Stack usage looks quite normal with your changes applied. Also the explain_llvm output shows the improvements nicely. Thx for fixing this long standing bug at last @pdamme

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A mistake in the code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DaphneDSL loops crash after a certain number of iterations
2 participants