Additional/Reworked Codegen Passes #889
Merged
This PR majorly reworks codegen for `AllAgg*` and `EwOps`, and adds lowering for `TransposeOp` and `Row/ColAgg*`. All of these passes are added to the optional MLIR codegen pipeline, which can be enabled using the `--mlir-codegen` flag, and offer an alternative lowering of these operations to MLIR rather than calls to precompiled C++ kernels. Currently, they only support `DenseMatrix` with dimensions that are known at compile time and any value type (except booleans). Except for `IdxMin` and `IdxMax`, which are lowered directly to affine loops, and `TransposeOp`, which lowers to a named linalg op, all passes make use of linalg `GenericOp`s, which are then lowered to affine loops in a later pass of the codegen pipeline. The passes convert the input DenseMatrix to a `MemRef` and create a new `MemRef` for the output, which is converted back into a DenseMatrix.

Changes:

- Reworked/new codegen passes for `AllAgg*Op`, `Row/ColAgg*Op`, `Ew*Op`, and `TransposeOp` (see below for details)
- Adjusted the `fusion.mlir` test to lower Linalg to affine loops before applying the fusion pass
- A canonicalization for `floor`, `ceil`, and `round` that removes the respective ops when the input type is an integer (this also simplifies codegen; see the sketch after this list)
- Changes to `kernels.json`
- Changes to `ir/daphneir/Passes.h`
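As a rough illustration of the integer canonicalization, consider the following hand-written IR. The op and type spellings (`daphne.ewFloor`, `!daphne.Matrix<3x3xsi64>`) are assumptions for readability, not necessarily the exact DaphneIR syntax:

```mlir
// Hypothetical IR before canonicalization. Since the input matrix already
// holds integers, flooring each element is an identity operation.
func.func @floor_int(%X: !daphne.Matrix<3x3xsi64>) -> !daphne.Matrix<3x3xsi64> {
  %res = "daphne.ewFloor"(%X)
      : (!daphne.Matrix<3x3xsi64>) -> !daphne.Matrix<3x3xsi64>
  return %res : !daphne.Matrix<3x3xsi64>
}
// After canonicalization, the ewFloor op is removed and %X is returned
// directly, so codegen never has to lower a floor over integer inputs.
```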
Ops with new codegen:

- `AllAgg*Op`: `Sum`, `Min`, `Max`
- `Row/ColAgg*Op`: `Sum`, `Min`, `Max`, `IdxMin`, `IdxMax`
- `Ew*Op` (see the sketch after this list):
  - unary: `Abs`, `Sqrt`, `Exp`, `Ln`, `Sin`, `Cos`, `Floor`, `Ceil`, `Round`
  - binary: `Add`, `Sub`, `Mul`, `Div`, `Pow`, `Max`, `Min`
- `TransposeOp`
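To make the `Ew*Op` lowering concrete, here is a minimal hand-written sketch of a binary elementwise add expressed as a single linalg `GenericOp` over memrefs. The function name and shapes are illustrative; the actual pass emits this over the MemRefs converted from the DenseMatrix operands:

```mlir
// Elementwise add of two 2x2 f64 memrefs. All loops are parallel, so the op
// can later be lowered to a perfectly nested affine loop nest and is a
// candidate for loop fusion.
#map = affine_map<(i, j) -> (i, j)>
func.func @ew_add(%lhs: memref<2x2xf64>, %rhs: memref<2x2xf64>,
                  %res: memref<2x2xf64>) {
  linalg.generic
      {indexing_maps = [#map, #map, #map],
       iterator_types = ["parallel", "parallel"]}
      ins(%lhs, %rhs : memref<2x2xf64>, memref<2x2xf64>)
      outs(%res : memref<2x2xf64>) {
    ^bb0(%l: f64, %r: f64, %out: f64):
      %sum = arith.addf %l, %r : f64
      linalg.yield %sum : f64
  }
  return
}
```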
A small example of a lowered kernel: the input is converted to a `MemRef` and a result `MemRef` is allocated. The first linalg `GenericOp` initializes the result `MemRef` by copying the first row of the input, and the second `GenericOp` iterates over the remaining values and applies the aggregation operation (an addition in this case).
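A hand-written sketch of such a lowering for a column-wise sum over a 3x2 input follows; the names, shapes, and subview/layout details are illustrative, not the exact IR the pass emits:

```mlir
// DenseMatrix <-> MemRef conversions elided; %in is the converted input and
// %res the freshly allocated result (one row holding the column aggregates).
func.func @col_sum(%in: memref<3x2xf64>, %res: memref<2xf64>) {
  // First GenericOp: initialize the result by copying the first input row.
  %row0 = memref.subview %in[0, 0] [1, 2] [1, 1]
      : memref<3x2xf64> to memref<2xf64, strided<[1]>>
  linalg.generic
      {indexing_maps = [affine_map<(j) -> (j)>, affine_map<(j) -> (j)>],
       iterator_types = ["parallel"]}
      ins(%row0 : memref<2xf64, strided<[1]>>)
      outs(%res : memref<2xf64>) {
    ^bb0(%v: f64, %acc: f64):
      linalg.yield %v : f64
  }
  // Second GenericOp: fold the remaining rows into the result with the
  // aggregation function (an addition here).
  %rest = memref.subview %in[1, 0] [2, 2] [1, 1]
      : memref<3x2xf64> to memref<2x2xf64, strided<[2, 1], offset: 2>>
  linalg.generic
      {indexing_maps = [affine_map<(i, j) -> (i, j)>,
                        affine_map<(i, j) -> (j)>],
       iterator_types = ["reduction", "parallel"]}
      ins(%rest : memref<2x2xf64, strided<[2, 1], offset: 2>>)
      outs(%res : memref<2xf64>) {
    ^bb0(%v: f64, %acc: f64):
      %sum = arith.addf %v, %acc : f64
      linalg.yield %sum : f64
  }
  return
}
```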
Known Limitations:

- Moving the `LoopFusionPass` below the `LinalgToAffineLoopsPass` enables some loop fusions already, but it seems to cause issues with e.g. `TransposeOp`. A simple example of this is `X = [1,2,3](1,); print(t(X)); print(t(t(X)));`. Hence, loop fusion has not been moved down yet.
- `Ew*Op` broadcasting for singleton matrices currently has no canonicalizer pass that always moves the singleton matrix to the `rhs` operand. This should be handled separately, though, to take broadcasting for C++ kernels into account as well (see #803: Matrix broadcasting not working for 1x1 ∘ 1xn / nx1 matrices).
- `MemRefType` is currently handled during the conversion of the input DenseMatrix to a `MemRef`.
- `RewriteToCallKernelOpPass` currently fails if the IR contains `math.ipowi` or any trigonometric math op other than `sin` and `cos` (e.g., `no kernels registered for operation 'ipowi'`). Hence, the `ewBinaryPow` test currently fails; before merging, this should be fixed or commented out. The same issue persists for the currently commented-out lowering of the trigonometric math ops `tan`, `asin`, `acos`, `atan`, `sinh`, `cosh`, `tanh` in `EwOpsLowering.cpp`.