Feature/slp vectorization #69

csvtuda · 2021-11-12T16:56:25Z

Merging this pull request would add SLP vectorization of SPNs to the project.

# Conflicts:
#	compiler/src/driver/action/EmitObjectCode.cpp
#	compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.cpp
#	compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.h

sommerlukas

@csvtuda: Thanks for the contribution, the code mostly looks very good!

I've annotated a few nits directly in the code.

Some additional high-level points:

Automated testing currently fails due to the time limit. Is it expected that the new Python-tests for SLP-vectorizer take multiple hours to complete? If so, I would prefer to stick to tests that complete within minutes.
While the tests ensure that the compiler does not fail compilation and the computed result is correct, they do not test that vectorization actually happens. Maybe it would make sense to add a few tests that use spnc-opt to run the SLP vectorizer and check that the output actually contains vectorized code/MLIR.
What is the support status of Categorical and Histogram in the SLP vectorizer?
The LoSPNtoCPUStructureConversionPass uses quite a lot of options now. Maybe we could leverage the MLIR built-in support for pass options (https://mlir.llvm.org/docs/PassManagement/#instance-specific-pass-options), which would probably make the registration in spnc-opt much easier. I'm aware that the remaining passes also do not yet use this infrastructure, but maybe this pass is a good starting point to see if we can leverage this infrastructure.

compiler/src/option/GlobalOptions.cpp

mlir/include/Conversion/LoSPNtoCPU/Vectorization/SLP/GraphConversion.h

mlir/include/Conversion/LoSPNtoCPU/Vectorization/SLP/SLPPatternMatch.h

mlir/include/Conversion/LoSPNtoCPU/Vectorization/SLP/Util.h

mlir/lib/Conversion/LoSPNtoCPU/LoSPNtoCPUConversionPasses.cpp

mlir/lib/Conversion/LoSPNtoCPU/Target/TargetInformation.cpp

mlir/lib/Conversion/LoSPNtoCPU/Vectorization/SLP/GraphConversion.cpp

runtime/src/Executable.cpp

csvtuda · 2021-11-17T19:37:27Z

Automated testing currently fails due to the time limit. Is it expected that the new Python-tests for SLP-vectorizer take multiple hours to complete? If so, I would prefer to stick to tests that complete within minutes.

Fixed in df06782. Running the test cases with real examples should only require up to ten minutes now instead of tens of hours. If desired, I can also use real examples that each take mere seconds to complete.

csvtuda · 2021-11-18T03:22:04Z

The LoSPNtoCPUStructureConversionPass uses quite a lot of options now. Maybe we could leverage the MLIR built-in support for pass options (https://mlir.llvm.org/docs/PassManagement/#instance-specific-pass-options), which would probably make the registration in spnc-opt much easier.

Addressed in 42a2557. Using the pass-specific options allowed me to remove the option declarations in spnc-opt. Please note that a 'big' constructor is still required for accepting the driver's options though.


csvtuda · 2021-11-18T03:36:28Z

What is the support status of Categorical and Histogram in the SLP vectorizer?

They aren't supported yet (i.e. they won't be vectorized) since I've never encountered them during development. But using vectorized select operations (similar to this PR) might be possible, I think.
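To illustrate the select-based idea, here is a rough sketch in plain Python. Lists stand in for SIMD lanes, and the function names and the choice of 0.0 for out-of-range indices are assumptions made for this example; none of this is taken from the SLP vectorizer itself.

```python
from typing import List


def select(mask: List[bool], if_true: List[float], if_false: List[float]) -> List[float]:
    """Lane-wise select: take if_true[i] where mask[i] holds, else if_false[i]."""
    return [t if m else f for m, t, f in zip(mask, if_true, if_false)]


def categorical_via_select(indices: List[int], probabilities: List[float]) -> List[float]:
    """Evaluate a categorical leaf for a whole vector of inputs at once.

    One compare plus one select per category replaces the per-lane table lookup.
    Assumption: lanes whose index matches no category keep the initial 0.0.
    """
    lanes = len(indices)
    result = [0.0] * lanes
    for category, probability in enumerate(probabilities):
        # Compare every lane against the current category.
        mask = [index == category for index in indices]
        # Broadcast the category's probability and blend it into matching lanes.
        result = select(mask, [probability] * lanes, result)
    return result
```

For example, `categorical_via_select([2, 0, 1, 0], [0.2, 0.5, 0.3])` picks the probability of category 2 for the first lane, category 0 for the second, and so on. The cost grows with the number of categories, so this would likely only pay off for small leaves.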

sommerlukas

Thanks for updating the PR!

Currently, the Python test scripts still seem to make unnecessary copies of the kernel file.

There are two other issues that I would still like to address before merging:

There should be one CMake option to turn off all the SLP-related debug/timing prints. You can define a CMake option (e.g. SLP_DEBUG), pass its value to the code with target_compile_definitions, and use it in the definition of PRINT_SIZE and similar.
There should be at least one or two llvm-lit based tests to check that the SLP-vectorizer actually performs vectorization. Those tests can be inspired by mlir/test/lowering/lospn-to-cpu/lower-to-cpu-structure-batch-vectorize.mlir, where a small input graph undergoes the lowering pass and FileCheck is used to check that vectorization has happened. The current Python-based tests only cover the basic functionality (e.g. that the compiler is not broken and compilation does not fail), but they do not check whether vectorization is actually performed.
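As a rough illustration of what such a check amounts to, the following Python sketch looks for float vector types in textual IR and verifies that a list of patterns occurs in order, which is the core of what FileCheck does. The real tests would use llvm-lit and FileCheck on spnc-opt output; the helper names, the sample IR, and its `example.*` op names are made up for this sketch.

```python
import re
from typing import List

# Matches fixed-size float vector types as MLIR prints them, e.g. vector<4xf32>.
VECTOR_TYPE = re.compile(r"vector<(\d+)xf(?:32|64)>")


def vector_widths(ir_text: str) -> List[int]:
    """Return the lane count of every float vector type found in the IR text."""
    return [int(match.group(1)) for match in VECTOR_TYPE.finditer(ir_text)]


def patterns_in_order(ir_text: str, patterns: List[str]) -> bool:
    """FileCheck-like check: each regex must match after the previous match."""
    position = 0
    for pattern in patterns:
        match = re.compile(pattern).search(ir_text, position)
        if match is None:
            return False
        position = match.end()
    return True


# Hypothetical lowered output; the op names are illustrative only.
SAMPLE_IR = """
%0 = "example.load"(%arg0) : (memref<?xf32>) -> vector<4xf32>
%1 = "example.mul"(%0, %0) : (vector<4xf32>, vector<4xf32>) -> vector<4xf32>
"""
```

A test built on this would fail when `vector_widths` returns an empty list, i.e. when the pass ran but produced only scalar code, which is exactly the case the current Python tests cannot detect.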

python-interface/test/examples/fashion-mnist/test_vector_fashion-mnist.py

python-interface/test/examples/plants/test_vector_plants.py

sommerlukas · 2021-11-18T13:50:34Z

What is the support status of Categorical and Histogram in the SLP vectorizer?

They aren't supported yet (i.e. they won't be vectorized) since I've never encountered them during development. But using vectorized select operations (similar to this PR) might be possible, I think.

I opened #70 as a separate issue for supporting those.

csvtuda · 2021-11-19T02:48:35Z

There should be at least one or two llvm-lit based tests to check that the SLP-vectorizer actually performs vectorization.

I added a test case in d16b1a1 which is based on a small graph consisting of gaussian inputs, constants and multiplications. It requires reordering of three vectors for 'perfect' SLP vectorization (i.e. operation-uniform vectors). The test verifies that such a reordering has taken place.
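To show why reordering matters for operation-uniform vectors, here is a small Python sketch. Each lane is one scalar multiplication with two operands, and because multiplication is commutative the operands of a lane may be swapped so that each operand vector ends up holding a single kind of operation. The data layout and function names are assumptions for illustration, and the strategy shown (align every lane with lane 0) is a simplification, not the algorithm used in the vectorizer.

```python
from typing import List, Tuple

Operand = Tuple[str, str]       # (operation kind, value name), e.g. ("gaussian", "g0")
Lane = Tuple[Operand, Operand]  # the two operands of one scalar multiplication


def is_uniform(vector: List[Operand]) -> bool:
    """A vector is operation-uniform if all its elements share one kind."""
    return len({kind for kind, _ in vector}) <= 1


def operand_vectors(lanes: List[Lane]) -> Tuple[List[Operand], List[Operand]]:
    """Split the lanes into the left and the right operand vector."""
    return [left for left, _ in lanes], [right for _, right in lanes]


def reorder(lanes: List[Lane]) -> List[Lane]:
    """Swap the operands of a lane when that aligns its kinds with lane 0.

    Swapping is valid only because multiplication is commutative.
    """
    if not lanes:
        return []
    first_left_kind = lanes[0][0][0]
    reordered = []
    for left, right in lanes:
        if left[0] != first_left_kind and right[0] == first_left_kind:
            left, right = right, left
        reordered.append((left, right))
    return reordered
```

With lanes that alternate between `gaussian * constant` and `constant * gaussian`, neither operand vector is uniform before the call, and both are uniform afterwards, so each can be built from a single vector operation.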

sommerlukas

Thanks for addressing the remaining issues!

csvtuda added 30 commits May 4, 2021 18:45

Remove recursive postOrder, simplify node vector structure.

0b34497

Fix: postOrder computation containing duplicates.

1210a57

Clean SLP graph builder.

43f6b32

Add nlts & fashion-mnist tests.

814a222

Fix: SLP graph builder building SLP nodes more than once.

6efba13

Speed up postOrder computation.

b1ab2d4

Speed up postOrder computation even more.

b0e9833

Add basic cost model draft.

bdca6de

Rename SLP files.

4528de2

Separate postOrder from SLPNode.

ee2e686

Update tests, remove outdated .mlir files.

955235f

Prevent graph builder from including illegal vectors.

3726c1d

Remove const modifier from value getters.

9df0eb5

WIP optimal conversion insertion point computation.

dc1e0c3

Rework operation sorting prior to conversion.

f199f2d

Iterate through all vectors only once during conversion.

8b24a31

Improve non-uniform vector handling.

183758e

Fix: variable shadowing.

584fbbe

Add constant duplication detection.

c7cfd9a

Add PatternRewriter member to ConversionManager.

a21af1e

Simplify bookkeeping of created vectors.

490b3cc

Add timing and debug info.

399200f

Rename NodeVector to ValueVector.

e4695a6

Fix extremely slow map key comparisons during graph building.

67af15f

Replace escaping users with extracted value only instead of all users.

621ebea

Fix: extremely slow insertion point computation during graph conversion.

d46c158

Rename IO method, improve output during graph conversion.

a555524

Improve progress output.

838c278

Fix: sort escaping users correctly after reordering.

631e4fd

Add trivial case to later() check.

8828402

csvtuda added 2 commits November 12, 2021 17:38

Improve vectorization test error messages.

832d336

Merge branch 'develop' into feature/slp-vectorization

e5e421d

csvtuda requested a review from sommerlukas November 12, 2021 16:56

sommerlukas requested changes Nov 14, 2021


csvtuda added 6 commits November 15, 2021 12:33

Fix: use default keyword instead of explicit constructor.

e19e431

Fix: make cost model unique and provide raw pointers for access.

a98cc6a

Fix: includes of TargetInformation.cpp.

124a9e4

Fix: callback loop handling.

f087c57

Fix: change total number of vectorization attempts to 1.

3f65e78

Fix: replace test cases that took hours to complete with smaller ones.

df06782

Replace spnc-opt options with pass-specific options.

42a2557

csvtuda added 3 commits November 18, 2021 04:25

Fix: allow casting gaussian input vectors from integer to floating point.

c643e83

Add comments to utility methods.

3be1bbe

Clean up vectorization patterns by combining constant and lospn constant patterns.

7a88a02

sommerlukas requested changes Nov 18, 2021


python-interface/test/examples/fashion-mnist/test_vector_fashion-mnist.py (outdated, resolved)

python-interface/test/examples/plants/test_vector_plants.py (outdated, resolved)

csvtuda added 3 commits November 18, 2021 15:03

Fix: remove unnecessary kernel copies.

59632c8

Move SLP options from util file to individual classes.

5b6da11

Add SLP vectorization test.

d16b1a1

csvtuda and others added 5 commits November 19, 2021 12:17

Move liveness analysis & output to helper function.

a7943b5

Update python tests.

f9cb4dc

Add CMake option to control SLP vectorizer debug output;

f9a5a32

Fix a few clang-tidy warnings;

64e3bde

Remove verbose output from Python tests;

2828702

sommerlukas approved these changes Nov 19, 2021


sommerlukas merged commit a20bd3b into develop Nov 19, 2021
