Feature/slp vectorization #69

Merged
merged 395 commits into from
Nov 19, 2021

csvtuda
Collaborator

@csvtuda csvtuda commented Nov 12, 2021

Merging this pull request would add SLP vectorization of SPNs to the project.

csvtuda added 30 commits May 4, 2021 18:45
# Conflicts:
#	compiler/src/driver/action/EmitObjectCode.cpp
#	compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.cpp
#	compiler/src/pipeline/steps/mlir/conversion/LoSPNtoCPUConversion.h
Member

@sommerlukas sommerlukas left a comment

@csvtuda: Thanks for the contribution, the code mostly looks very good!

I've annotated a few nits directly in the code.

Some additional high-level points:

  • Automated testing currently fails due to the time limit. Is it expected that the new Python-tests for SLP-vectorizer take multiple hours to complete? If so, I would prefer to stick to tests that complete within minutes.
  • While the tests ensure that the compiler does not fail compilation and the computed result is correct, they do not test that vectorization actually happens. Maybe it would make sense to add a few tests that use spnc-opt to run the SLP vectorizer and check that the output actually contains vectorized code/MLIR.
  • What is the support status of Categorical and Histogram in the SLP vectorizer?
  • The LoSPNtoCPUStructureConversionPass uses quite a lot of options now. Maybe we could leverage the MLIR built-in support for pass options (https://mlir.llvm.org/docs/PassManagement/#instance-specific-pass-options), which would probably make the registration in spnc-opt much easier. I'm aware that the remaining passes also do not yet use this infrastructure, but maybe this pass is a good starting point to see if we can leverage this infrastructure.

@csvtuda
Collaborator Author

csvtuda commented Nov 17, 2021

Automated testing currently fails due to the time limit. Is it expected that the new Python-tests for SLP-vectorizer take multiple hours to complete? If so, I would prefer to stick to tests that complete within minutes.

Fixed in df06782. Running the test cases with real examples should only require up to ten minutes now instead of tens of hours. If desired, I can also use real examples that each take mere seconds to complete.

@csvtuda
Collaborator Author

csvtuda commented Nov 18, 2021

The LoSPNtoCPUStructureConversionPass uses quite a lot of options now. Maybe we could leverage the MLIR built-in support for pass options (https://mlir.llvm.org/docs/PassManagement/#instance-specific-pass-options), which would probably make the registration in spnc-opt much easier.

Addressed in 42a2557. Using the pass-specific options allowed me to remove the option declarations in spnc-opt. Please note that a 'big' constructor is still required for accepting the driver's options though.

@csvtuda
Collaborator Author

csvtuda commented Nov 18, 2021

What is the support status of Categorical and Histogram in the SLP vectorizer?

They aren't supported yet (i.e. they won't be vectorized) since I've never encountered them during development. But using vectorized select operations (similar to this PR) might be possible, I think.

Member

@sommerlukas sommerlukas left a comment

Thanks for updating the PR!

Currently, the Python test scripts still seem to make unnecessary copies of the kernel file.

There are two other issues that I would still like to address before merging:

  • There should be one CMake option to turn off all the SLP-related debug/timing prints. You can define a CMake option (e.g. SLP_DEBUG), use target_compile_definitions to pass its value to the code, and use it in the definition of PRINT_SIZE and similar.
  • There should be at least one or two llvm-lit based tests to check that the SLP-vectorizer actually performs vectorization. Those tests can be inspired by mlir/test/lowering/lospn-to-cpu/lower-to-cpu-structure-batch-vectorize.mlir, where a small input graph undergoes the lowering pass and FileCheck is used to check that vectorization has happened. The current Python-based tests only cover the basic functionality (e.g. compiler is not broken/compilation fails), but they do not check if vectorization is actually performed.
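
To illustrate the first point, here is a minimal sketch of how such a switch could look on the C++ side. It is an assumption-laden example, not the code from this PR: the actual signature of PRINT_SIZE is not shown in this conversation, so the two-argument form below, the helper describeVector and the placeholder target name in the CMake comment are all hypothetical.

```cpp
// Build-system side, shown as a comment (CMake; <target> is a placeholder):
//   option(SLP_DEBUG "Enable SLP debug/timing prints" OFF)
//   target_compile_definitions(<target> PRIVATE SLP_DEBUG=$<BOOL:${SLP_DEBUG}>)

#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Default to "off" when the build system does not define the flag.
#ifndef SLP_DEBUG
#define SLP_DEBUG 0
#endif

#if SLP_DEBUG
// Hypothetical form of PRINT_SIZE: prints the container's name and size.
#define PRINT_SIZE(os, container)                                        \
  do {                                                                   \
    (os) << #container << " size: " << (container).size() << '\n';      \
  } while (false)
#else
// With the flag off, the macro produces no output; the unevaluated sizeof
// expressions only keep the arguments "used" to avoid compiler warnings.
#define PRINT_SIZE(os, container)                                        \
  do {                                                                   \
    (void)sizeof(os);                                                    \
    (void)sizeof(container);                                             \
  } while (false)
#endif

// Hypothetical helper: returns whatever debug text PRINT_SIZE emitted.
std::string describeVector(const std::vector<int>& lanes) {
  std::ostringstream log;
  PRINT_SIZE(log, lanes);
  return log.str();
}
```

With the flag left at its default, describeVector returns an empty string, so all debug output disappears from a regular build without touching the call sites.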

@sommerlukas
Member

What is the support status of Categorical and Histogram in the SLP vectorizer?

They aren't supported yet (i.e. they won't be vectorized) since I've never encountered them during development. But using vectorized select operations (similar to this PR) might be possible, I think.

I opened #70 as separate issue for supporting those.

@csvtuda
Collaborator Author

csvtuda commented Nov 19, 2021

There should be at least one or two llvm-lit based tests to check that the SLP-vectorizer actually performs vectorization.

I added a test case in d16b1a1 which is based on a small graph consisting of Gaussian inputs, constants and multiplications. It requires reordering of three vectors for 'perfect' SLP vectorization (i.e. operation-uniform vectors). The test verifies that such a reordering has taken place.
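
For readers unfamiliar with the term, the following toy sketch shows what "operation-uniform" means and why reordering can be needed. It is not the algorithm implemented in this PR; it only assumes that multiplication is commutative, and every name in it (Kind, MulLane, reorderOperands and so on) is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical operand kinds, loosely mirroring the node types in the test graph.
enum class Kind { Gaussian, Constant };

// One scalar multiplication. Multiplication is commutative, so the two
// operands of a lane may be swapped without changing the result.
struct MulLane {
  Kind lhs;
  Kind rhs;
};

// True if all elements have the same kind, i.e. the vector is operation-uniform.
bool isUniform(const std::vector<Kind>& kinds) {
  if (kinds.empty()) return true;
  const Kind first = kinds.front();
  return std::all_of(kinds.begin(), kinds.end(),
                     [first](Kind k) { return k == first; });
}

// Collect the left-hand operand kinds of all lanes into one operand vector.
std::vector<Kind> lhsKinds(const std::vector<MulLane>& lanes) {
  std::vector<Kind> out;
  for (const MulLane& lane : lanes) out.push_back(lane.lhs);
  return out;
}

// Collect the right-hand operand kinds of all lanes into one operand vector.
std::vector<Kind> rhsKinds(const std::vector<MulLane>& lanes) {
  std::vector<Kind> out;
  for (const MulLane& lane : lanes) out.push_back(lane.rhs);
  return out;
}

// Swap the operands of every lane whose right-hand side matches the first
// lane's left-hand kind while its own left-hand side does not.
// Returns the number of lanes that were swapped.
std::size_t reorderOperands(std::vector<MulLane>& lanes) {
  if (lanes.empty()) return 0;
  const Kind reference = lanes.front().lhs;
  std::size_t swaps = 0;
  for (MulLane& lane : lanes) {
    if (lane.lhs != reference && lane.rhs == reference) {
      std::swap(lane.lhs, lane.rhs);
      ++swaps;
    }
  }
  return swaps;
}
```

Given four lanes that alternate between Gaussian * Constant and Constant * Gaussian, neither operand vector is uniform at first. After reorderOperands, the left-hand vector holds only Gaussian operands and the right-hand vector only constants, which is the kind of property a vectorization test can check for.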

Member

@sommerlukas sommerlukas left a comment

Thanks for addressing the remaining issues!

@sommerlukas sommerlukas merged commit a20bd3b into develop Nov 19, 2021