-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/slp vectorization #69
Conversation
# Conflicts: # compiler/src/driver/action/EmitObjectCode.cpp # compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.cpp # compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@csvtuda: Thanks for the contribution, the code mostly looks very good!
I've annotated a few nits directly in the code.
Some additional high-level points:
- Automated testing currently fails due to the time limit. Is it expected that the new Python-tests for SLP-vectorizer take multiple hours to complete? If so, I would prefer to stick to tests that complete within minutes.
- While the tests ensure that the compiler does not fail compilation and the computed result is correct, they do not test that vectorization actually happens. Maybe it would make sense to add a few tests that use
spnc-opt
to run the SLP vectorizer and check that the output actually contains vectorized code/MLIR. - What is the support status of Categorical and Histogram in the SLP vectorizer?
- The
LoSPNtoCPUStructureConversionPass
uses quite a lot of options now. Maybe we could leverage the MLIR built-in support for pass options (https://mlir.llvm.org/docs/PassManagement/#instance-specific-pass-options), which would probably make the registration inspnc-opt
much easier. I'm aware that the remaining passes also do not yet use this infrastructure, but maybe this pass is a good starting point to see if we can leverage this infrastructure.
mlir/include/Conversion/LoSPNtoCPU/Vectorization/SLP/SLPPatternMatch.h
Outdated
Show resolved
Hide resolved
mlir/lib/Conversion/LoSPNtoCPU/Vectorization/SLP/GraphConversion.cpp
Outdated
Show resolved
Hide resolved
Fixed in df06782. Running the test cases with real examples should only require up to ten minutes now instead of tens of hours. If desired, I can also use real examples that each take mere seconds to complete. |
Addressed in 42a2557. Using the pass-specific options allowed me to remove the option declarations in spnc-opt. Please note that a 'big' constructor is still required for accepting the driver's options though. |
They aren't supported yet (i.e. they won't be vectorized) since I've never encountered them during development. But using vectorized select operations (similar to this PR) might be possible, I think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the updating the PR!
Currently, the Python test scripts still seem to make unnecessary copies of the kernel file.
There are two other issues that I would still like to adress before merging:
- There should be one CMake-Option to turn off all the SLP-related debug/timing prints. You can define a CMakeOption (e.g.
SLP_DEBUG
) and usetarget_compile_definitions
to pass its value to the code and use it in the definition ofPRINT_SIZE
and similar. - There should be at least one or two
llvm-lit
based tests to check that the SLP-vectorizer actually performs vectorization. Those tests can be inspired bymlir/test/lowering/lospn-to-cpu/lower-to-cpu-structure-batch-vectorize.mlir
, where a small input graph undergoes the lowering pass andFileCheck
is used to check that vectorization has happened. The current Python-based tests only cover the basic functionality (e.g. compiler is not broken/compilation fails), but they do not check if vectorization is actually performed.
python-interface/test/examples/fashion-mnist/test_vector_fashion-mnist.py
Outdated
Show resolved
Hide resolved
I opened #70 as separate issue for supporting those. |
I added a test case in d16b1a1 which is based on a small graph consisting of gaussian inputs, constants and multiplications. It requires reordering of three vectors for 'perfect' SLP vectorization (i.e. operation-uniform vectors). The test verifies that such a reordering has taken place. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the remaining issues!
Merging this pull request would add SLP vectorization of SPNs to the project.