-
If the type derivations are to live outside of a Substrait expression, then in the general case, I believe you would need to define a "simple" type derivation grammar for function signatures (for example, |
-
A similar (but maybe easier to handle) case is a kernel like Come to think of it, do we need to state the fact that all types are potentially nullable anywhere? (hopefully that isn't contentious) |
-
It may not add much, but for completeness' sake I'd argue there may be some cases where parameterization is required to properly define input types as well. For example:
One of the more bizarre kernels in Arrow at the moment is case_when which is roughly defined as...
...although I'm not sure why it can't be...
Even with this form though, parameterization is still arguably needed to ensure all of the |
-
Functions that work entirely with simple (non-compound) types typically have very simple semantics. For example: add(i32, i32) => i32. In function signatures, these kinds of functions are declared as having a "direct" output type. For compound types, derivation strategies can be more complex. A couple of examples:
A function such as
func(List<T>) => T
A function such as
func(decimal(p1, s1), decimal(p2, s2)) => decimal(max(p1+p2, 38), min(s1, s2))
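To make the two examples above concrete, here is a minimal Python sketch of what each derivation computes. The class and function names (`I32`, `ListType`, `DecimalType`, `derive_list_element`, `derive_decimal`) are made up for illustration and are not part of Substrait; the decimal formula is copied as written in the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class I32:
    """Simple (non-compound) 32-bit integer type."""


@dataclass(frozen=True)
class ListType:
    """Compound list type, parameterized by its element type T."""
    element: object


@dataclass(frozen=True)
class DecimalType:
    """Compound decimal type, parameterized by precision and scale."""
    precision: int
    scale: int


def derive_list_element(arg: ListType):
    # func(List<T>) => T: the output type is whatever T was bound to.
    return arg.element


def derive_decimal(a: DecimalType, b: DecimalType) -> DecimalType:
    # Formula taken verbatim from the post:
    # decimal(max(p1+p2, 38), min(s1, s2))
    return DecimalType(
        precision=max(a.precision + b.precision, 38),
        scale=min(a.scale, b.scale),
    )


print(derive_list_element(ListType(I32())))
print(derive_decimal(DecimalType(10, 2), DecimalType(12, 4)))
```

Note that, as written, `max(p1+p2, 38)` never yields a precision below 38; if the intent was to cap precision at 38, `min` would be needed instead. The sketch keeps the formula as posted.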
To correctly validate plans, the function signatures and the Substrait specification must support these arbitrary output type derivation strategies. As such, we need some way to define the arbitrary logic associated with type derivation so that any time a function is used in a plan, the output type is clearly specified and determinate. So the question is: what is the simplest way to support this that is still portable across systems?
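One possible shape for this, sketched here only to illustrate the question and not as a proposal from the thread, is to store each derivation as a small expression string in the function signature and evaluate it with a restricted interpreter. Everything below is hypothetical: the `DECIMAL_FUNC` layout, the `evaluate` and `derive` helpers, and the choice to allow only integer literals, parameter names, `+`, `-`, `min` and `max`. Python's `ast` module stands in for whatever parser a real system would use.

```python
import ast

# Hypothetical whitelist of functions a derivation expression may call.
ALLOWED_FUNCS = {"min": min, "max": max}


def evaluate(expr: str, params: dict) -> int:
    """Evaluate an integer derivation expression over bound parameters."""

    def walk(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.Name) and node.id in params:
            return params[node.id]
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Sub)):
            left, right = walk(node.left), walk(node.right)
            return left + right if isinstance(node.op, ast.Add) else left - right
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in ALLOWED_FUNCS
            and not node.keywords
        ):
            return ALLOWED_FUNCS[node.func.id](*(walk(a) for a in node.args))
        # Anything outside the tiny grammar is rejected rather than executed.
        raise ValueError(f"unsupported construct in derivation: {ast.dump(node)}")

    return walk(ast.parse(expr, mode="eval").body)


# Hypothetical declarative form of the decimal signature from the post.
DECIMAL_FUNC = {
    "args": [("p1", "s1"), ("p2", "s2")],
    "output": {"precision": "max(p1+p2,38)", "scale": "min(s1,s2)"},
}


def derive(signature: dict, arg_types: list) -> tuple:
    """Bind (precision, scale) pairs to parameter names, then evaluate the output."""
    params = {}
    for names, values in zip(signature["args"], arg_types):
        params.update(zip(names, values))
    out = signature["output"]
    return evaluate(out["precision"], params), evaluate(out["scale"], params)


print(derive(DECIMAL_FUNC, [(10, 2), (12, 4)]))
```

Keeping the grammar this small makes it easy to reimplement in another system, at the cost of not covering derivations that need anything beyond integer arithmetic.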