Unable to save Duration tensors to Parquet #4

Open
sydduckworth opened this issue Jun 26, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@sydduckworth
Collaborator

arrow-rs doesn't support converting Duration arrays to/from Parquet format. To get around this, Synapse uses a hack: Duration arrays are cast to int64 before writing to disk, and when reading from disk a Cast node is inserted into the execution plan to convert them back to Duration.

This works for scalar Duration arrays, but not for Duration tensors, which are represented as fixed-size lists, because the arrow-rs cast method doesn't support fixed-size lists.

This should be fixed upstream in the near future since there is significant work being done in datafusion to better support fixed-size lists:
apache/datafusion#6560

In the meantime only scalar Duration fields are supported.
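
For anyone trying to picture the problem, here is a rough plain-Rust model of it. It is only a sketch and does not use arrow-rs or Synapse code: `DurationMs`, `FixedSizeList`, `map_values`, `to_storage`, and `from_storage` are all hypothetical names made up for illustration. It assumes a Duration is physically an i64 count of some time unit and that a tensor column is a flat child buffer plus a fixed row width. Under those assumptions, a cast that only touches the child values and leaves the list shape alone would round-trip tensors the same way the scalar hack does.

```rust
// Illustrative model only: none of these types or functions exist in
// arrow-rs or Synapse. They are assumptions made for this sketch.

/// A duration stored as a signed 64-bit count of milliseconds (assumed unit).
#[derive(Debug, Clone, Copy, PartialEq)]
struct DurationMs(i64);

/// A tensor column modeled like a fixed-size list: one flat child buffer
/// plus the number of child values in every row.
#[derive(Debug, Clone, PartialEq)]
struct FixedSizeList<T> {
    values: Vec<T>,
    list_size: usize,
}

impl<T> FixedSizeList<T> {
    fn new(values: Vec<T>, list_size: usize) -> Self {
        // Every row must have exactly `list_size` child values.
        assert!(
            list_size > 0 && values.len() % list_size == 0,
            "child buffer length must be a multiple of list_size"
        );
        Self { values, list_size }
    }

    /// Apply a scalar conversion to each child value, keeping the list shape.
    fn map_values<U>(&self, f: impl Fn(&T) -> U) -> FixedSizeList<U> {
        FixedSizeList {
            values: self.values.iter().map(f).collect(),
            list_size: self.list_size,
        }
    }
}

/// Write side of the hack: Duration -> int64, applied to the child values.
fn to_storage(col: &FixedSizeList<DurationMs>) -> FixedSizeList<i64> {
    col.map_values(|d| d.0)
}

/// Read side of the hack: int64 -> Duration, applied to the child values.
fn from_storage(col: &FixedSizeList<i64>) -> FixedSizeList<DurationMs> {
    col.map_values(|&v| DurationMs(v))
}
```

The point of the model is that the list shape never needs to change; only the child values are converted, which is the piece the existing cast does not do for fixed-size lists.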

@sydduckworth sydduckworth added the bug Something isn't working label Jun 26, 2023