[flang][OpenMP] Add support for multi-range do concurrent loops #89
840e2dd to 535d8d1
First thoughts without looking at the code:
Thanks @Meinersbur for taking a look.
Indeed that's only for the
Very insightful comment for me, thanks! However, the thing is that on the
What does the FIR look like for these cases? Is the
Ok, then please record an issue as a potential performance bug. I'd argue that the nesting should be reflected in FIR to make sure that the compiler sees the merged loop vs. a nested loop.
There are some changes that do belong to roc-trunk-dev which GitHub does not allow me to comment on.
`createMapInfoOp` and `calculateTripCount` seem to have been copied from FortranLower's `Utils.cpp`. It fails with a linker error. Please don't copy&paste from somewhere else but find a location where it can be used from both.

Why is there a need for `isOpUltimatelyConstant`? Isn't there already an MLIR utility that determines this, or a normalization pass that normalizes constant expressions to `ConstantOp`?

`DoConcurrentConversionPassOptions::mapTo` and `DoConcurrentConversionPassBase::mapTo` should be enums.
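
To make the enum suggestion concrete, here is a minimal sketch of what a typed option could look like. Everything in it is hypothetical: `MapTarget`, `parseMapTarget`, `describe`, and the `"host"`/`"device"` spellings are assumptions chosen for illustration and are not taken from the PR or from the generated pass options.

```cpp
#include <optional>
#include <string_view>

// Hypothetical names: neither MapTarget nor parseMapTarget comes from the PR.
enum class MapTarget { Host, Device };

// Convert a user-supplied option string into the enum once, at the boundary.
// Returns std::nullopt for anything that is not a recognized target.
inline std::optional<MapTarget> parseMapTarget(std::string_view text) {
  if (text == "host")
    return MapTarget::Host;
  if (text == "device")
    return MapTarget::Device;
  return std::nullopt;
}

// Downstream code switches on the enum instead of comparing raw values.
inline const char *describe(MapTarget target) {
  switch (target) {
  case MapTarget::Host:
    return "map to host";
  case MapTarget::Device:
    return "map to device";
  }
  return "unknown"; // Not reached for valid enum values.
}
```

With this shape, an unrecognized value is rejected in one place (`parseMapTarget("gpu")` yields an empty optional), and the rest of the pass only ever sees a valid `MapTarget`.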
Thanks for the detailed review, Michael. Handled some of your comments and looking into the rest ... 👀
> `createMapInfoOp` and `calculateTripCount` seem to have been copied from FortranLower's `Utils.cpp`. It fails with a linker error. Please don't copy&paste from somewhere else but find a location where it can be used from both.
I understand this is less than ideal. We agreed to do that for now and I will split these utils upstream (see #88 and the relevant Teams discussion).
> Why is there a need for `isOpUltimatelyConstant`?
This is a sort of guardrail that will be lifted quite soon. I added that check because I tested the code on a very simple loop to start with. But as you can see in this PR, the restrictions are being lifted one by one. So this check should be just temporary.
I did that in a separate PR, #96, since this was already in place and not directly related to the scope of this PR.
@Meinersbur I think all of your comments are now handled. Please have another look whenever you have a chance.
Ping 🔔! Please have a look when you have time!
Thanks for addressing my review. I am still uncomfortable with the implicit privatization heuristics. Maybe there is already a precedent that I don't know of?
Since it doesn't even build for me, I suggest giving the copy a different name or namespace (and a TODO comment), so at least they don't clash.
Extends `do concurrent` to OpenMP mapping by adding support for multi-range loops. The current implementation only works for perfectly nested loops. So taking this input:

```fortran
do concurrent(i=1:n, j=1:m)
  a(i,j,k) = i * j
end do
```

will behave in exactly the same way as this input:

```fortran
do concurrent(i=1:n)
  do concurrent(j=1:m)
    a(i,j,k) = i * j
  end do
end do
```
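
The equivalence claimed above can be sketched outside the compiler. The C++ below is only an illustration under stated assumptions: `Grid`, `fillMultiRange`, and `fillNested` are hypothetical names, the third subscript `k` from the Fortran snippet is dropped, and nothing here reflects how the pass itself is implemented. It treats the multi-range form as one flat iteration space of `n * m` points and checks that it writes the same values as the perfectly nested form.

```cpp
#include <cstddef>
#include <vector>

// Row-major n-by-m array with a 1-based accessor, standing in for a(i,j).
// Hypothetical helper type, not part of flang.
struct Grid {
  int n, m;
  std::vector<int> data;
  Grid(int n, int m) : n(n), m(m), data(static_cast<std::size_t>(n) * m, 0) {}
  int &at(int i, int j) {
    return data[static_cast<std::size_t>(i - 1) * m + (j - 1)];
  }
};

// Multi-range form: a single loop over n*m points, each decoded back to (i, j).
inline Grid fillMultiRange(int n, int m) {
  Grid a(n, m);
  const long total = static_cast<long>(n) * m;
  for (long idx = 0; idx < total; ++idx) {
    int i = static_cast<int>(idx / m) + 1;
    int j = static_cast<int>(idx % m) + 1;
    a.at(i, j) = i * j;
  }
  return a;
}

// Nested form: one loop per range, perfectly nested.
inline Grid fillNested(int n, int m) {
  Grid a(n, m);
  for (int i = 1; i <= n; ++i)
    for (int j = 1; j <= m; ++j)
      a.at(i, j) = i * j;
  return a;
}
```

Because each `(i, j)` point is written exactly once and iterations are independent, `fillMultiRange(n, m).data` and `fillNested(n, m).data` hold the same values for any non-negative `n` and `m`.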
I think all comments are handled and all the issues we discussed are captured in b447a85. Let me know if you suggest any other changes in this PR.
Ping Ping 🔔! Does anything else need to be done here? Please take a look 🙏!
I think this can go in. LGTM.