Scalar UDF support (with arrow support and function overloading) #407
Thanks for the PR! I need to fix CI for this repo but I'm a little swamped at the moment. I will get to it ASAP.
Appreciate it @samansmink, lmk if you have any questions.
@samansmink any chance you'll get to this soon? Thanks!
@samansmink just following up here :)
I was able to use this PR to write a scalar UDF that accepts an array type as its input. However, I ran into a limitation with lists (not related to this PR): the ListVector type doesn't expose the per-row entries (the offset/length values), so I couldn't work out how to split up the child vector.
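To illustrate what is missing here: as far as I understand the DuckDB C API, a list vector stores one `duckdb_list_entry` (an `offset` and a `length`, both `uint64_t`) per row, and those entries index into a single flat child vector. The sketch below is a minimal, self-contained model of that layout, not code from this PR. `ListEntry` and `split_child` are hypothetical names, and the sketch ignores the validity mask (NULL rows).

```rust
/// Hypothetical mirror of the C API's `duckdb_list_entry` layout
/// (an offset and a length into the child vector). Not a type from this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
pub struct ListEntry {
    pub offset: u64,
    pub length: u64,
}

/// Hypothetical helper: split a flat child buffer into one sub-slice per row,
/// using each row's (offset, length) entry.
///
/// Returns `None` if any entry points outside the child buffer.
/// Assumption: validity (NULL rows) is handled elsewhere.
pub fn split_child<'a, T>(entries: &[ListEntry], child: &'a [T]) -> Option<Vec<&'a [T]>> {
    entries
        .iter()
        .map(|entry| {
            let start = usize::try_from(entry.offset).ok()?;
            let len = usize::try_from(entry.length).ok()?;
            let end = start.checked_add(len)?;
            // `get` performs the bounds check and yields None when out of range.
            child.get(start..end)
        })
        .collect()
}
```

With entries `(0, 2)`, `(2, 0)` and `(2, 3)` over a five-element child buffer, this yields three row slices of lengths 2, 0 and 3. Without access to the entries, there is no way to recover those boundaries from the child vector alone.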
This PR adds scalar UDF support via the C API, for functions that want to work with either data chunks or Arrow types. It also includes support for function overloading via the function set API.
It would be ✨ lovely ✨ if the scalar (and aggregate) function API supported the zero-copy Arrow FFI interface the way query results do. Right now we have to allocate record batches before we can use them.
It would also be nice to provide a safer way to get values out of a vector based on the DuckDB vector's data type. That is doable without changes to the C API, but this PR was already getting a little too meaty, so I might follow up with that improvement in a separate PR.
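As a rough idea of what "safer" could look like, here is a minimal, self-contained sketch, not the planned implementation: the vector carries a type tag, and a typed slice is only handed out when the requested Rust type matches that tag. Everything here (`ColumnType`, `VectorElement`, `RawVector`, `TypeMismatch`) is a hypothetical name made up for illustration and is not part of this PR or the C API.

```rust
use std::marker::PhantomData;

/// Hypothetical type tag standing in for the vector's logical type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ColumnType {
    Integer,
    Double,
}

/// Maps a Rust element type to its tag.
/// Unsafe to implement: the tag must uniquely identify the in-memory type.
pub unsafe trait VectorElement: Copy {
    const COLUMN_TYPE: ColumnType;
}
unsafe impl VectorElement for i32 {
    const COLUMN_TYPE: ColumnType = ColumnType::Integer;
}
unsafe impl VectorElement for f64 {
    const COLUMN_TYPE: ColumnType = ColumnType::Double;
}

/// Error returned when the requested type does not match the vector's tag.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TypeMismatch {
    pub expected: ColumnType,
    pub actual: ColumnType,
}

/// Hypothetical stand-in for a flat vector: an untyped pointer plus a type tag.
pub struct RawVector<'a> {
    column_type: ColumnType,
    data: *const u8,
    len: usize,
    _borrow: PhantomData<&'a [u8]>,
}

impl<'a> RawVector<'a> {
    /// Builds a vector that borrows `values`, recording the element type tag.
    pub fn from_slice<T: VectorElement>(values: &'a [T]) -> Self {
        RawVector {
            column_type: T::COLUMN_TYPE,
            data: values.as_ptr().cast(),
            len: values.len(),
            _borrow: PhantomData,
        }
    }

    /// Returns a typed view only if `T` matches the recorded tag.
    pub fn try_as_slice<T: VectorElement>(&self) -> Result<&'a [T], TypeMismatch> {
        if self.column_type != T::COLUMN_TYPE {
            return Err(TypeMismatch {
                expected: T::COLUMN_TYPE,
                actual: self.column_type,
            });
        }
        // SAFETY: the tag matches, so `data` was created from a `&'a [T]`
        // of exactly `len` elements in `from_slice`.
        Ok(unsafe { std::slice::from_raw_parts(self.data.cast::<T>(), self.len) })
    }
}
```

The design choice is to pay one tag comparison per vector (not per value) and return an error on mismatch, so a UDF that asks for `f64` values from an integer vector fails cleanly instead of reinterpreting memory.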