Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No more workarounds for string ops, enabled through guarded instantiation of unary/binary ops #795

Merged
merged 2 commits into from
Sep 2, 2024

Conversation

pdamme
Copy link
Collaborator

@pdamme pdamme commented Jul 28, 2024

This PR is a step towards extensible value types in DAPHNE. Ultimately, we want to support a rich set of custom value types and reuse/instantiate the existing kernels for them without extra implementation effort. Enabling this requires, besides other things, making some pieces of the code base more general.

This PR addresses one concrete example: elementwise unary and binary operations. Currently one quickly runs into C++ compilation problems when instantiating these kernels for non-numeric value types (e.g., string or any custom non-numeric value type in the future).

The PR contains two commits (see the commit messages for more details):

  • The first one introduces compile-time info that specifies which unary/binary op codes should work with which value types.
  • The second one removes the existing workarounds for string ops (StringEqOp, ConcatOp) and uses the originally intended elementwise binary ops (EwEqOp, EwConcatOp) instead. Doing that becomes possible through the first commit.

The commits in this PR are intended to be rebased and merged, to better reflect the individual changes.

As this PR makes some general (but simple) changes to the elementwise unary/binary kernels, I'd like to make everyone interested aware. Comments are welcome, but I'm not asking for a detailed review. This PR can be merged after the v0.3 release.

@pdamme pdamme changed the title Guarded instantiation of unary/binary ops and no more workarounds for string ops No more workarounds for string ops, enabled through guarded instantiation of unary/binary ops Jul 28, 2024
- Motivation:
  - DAPHNE supports various elementwise unary/binary operations as well as various value types.
  - Not all combinations of unary/binary operations and argument/result value types are meaningful.
    - For instance, one cannot calculate the square root of a string.
  - So far, this was not a severe problem, since we have mostly numeric value types and meaningful operations thereon.
  - However, we already had to introduce workarounds for equality and concat on strings.
    - More precisely, we use StringEqOp instead of EwEqOp and ConcatOp instead of EwConcatOp for strings, which is inconsistent with the rest of DaphneIR.
  - In the future, this problem will get amplified in the context of extensibility for value types, where we want to be able to add custom value types and allow them to reuse the existing elementwise unary/binary kernels, but only for some ops.
- Technical problem:
  - The functions getEwUnaryScaFuncPtr()/getEwBinaryScaFuncPtr() are used in the elementwise unary/binary kernels.
  - These functions get the op code as a runtime parameter, switch-case on it, and in each case, instantiate the kernel with the op code as a template parameter.
  - However, this implies that these functions can only be instantiated with value types that support *all* unary/binary operations.
- Solution:
  - Introduced C++ compile-time information specifying which unary/binary operations should be supported on which combinations of argument/result value types.
  - Technically, this is achieved through the boolean template variables supportsUnaryOp/supportBinaryOp, which reside in UnaryOpCode.h/BinaryOpCode.h.
  - These variables are used to guard the kernel instantiations in getEwUnaryScaFuncPtr()/getEwBinaryScaFuncPtr() through if-constexpr.
  - Note that this mechanism is orthogonal/complementary to the specifications in kernels.json.
- Additional contributions:
  - Restructuring and harmonization of UnaryOpCode.h and BinaryOpCode.h
    - for each of them we have (replace X by un/bin)
      - enum XaryOpCode
      - std::string_view Xary_op_codes[]
      - template<..., XaryOpCode op> static constexpr bool supportsXaryOp = false;
      - helper macros for concisely specifying which ops should be supported on which value types
  - Improved some related error messages.
- So far, this is only a refactoring, a follow-up commit will make use of the changes to get rid of the workarounds for StringEqOp and ConcatOp, as one concrete example of the usefulness.
- Motivation:
  - Originally, we intended to express string equality (like equality for any value type) through EwEqOp and string concatenation through EwConcatOp.
  - However, due to problems when instantiating elementwise binary kernels with the string value type, we introduced two new operations StringEqOp and ConcatOp as workarounds some while ago.
  - These workarounds are problematic in the long term because:
    - they are not consistent with the other elementwise unary/binary DaphneIR operations, and
    - in the future, we want to support more value types (through extensibility), which would require more workarounds of this kind
  - In fact, the underlying technical problem was fixed in a recent commit ("Guarded instantiation of unary/binary ops.").
- This commit removes the workarounds for string ops by enabling string equality through EwEqOp (as for any other value type) and string concatenation through EwConcatOp.
- Concrete changes:
  - DaphneDSL parser:
    - Creates EwEqOp and EwConcatOp instead of the workaround ops now.
  - DaphneIR:
    - Removed StringEqOp and ConcatOp, we use EwEqOp and EwConcatOp instead now.
  - DAPHNE compiler:
    - Removed constant folding of StringEqOp and ConcatOp, this functionality is now achieved through the constant folding of EwEqOp and EwConcatOp.
    - Removed the rewrite from EwEqOp to StringEqOp when the value type is string, we can use EwEqOp end-to-end now.
  - DAPHNE runtime:
    - Added a new binary op code CONCAT.
    - Specified that EQ and CONCAT should be supported on string value types.
    - Removed the stringEq and concat-kernels, their essential code was moved to the ewBinarySca-kernel.
    - Removed the instantiations of the stringEq and concat-kernels from kernels.json.
    - Added new instantiations of the ewBinary-kernels for string value type and the CONCAT op code to kernels.json.
  - test cases:
    - Removed the codegen test case "stringeq.mlir", since it became obsolete through the removal of StringEqOp.
@pdamme pdamme force-pushed the support-unary-binary branch from e4c41c5 to c8f910c Compare September 2, 2024 10:50
@pdamme pdamme merged commit a7226dc into main Sep 2, 2024
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant