Auto Inferencing date(time) columns #697
Hi @aborruso, for the … we went for a "relaxed" default setting when inferring date/date-time columns when generating the JSON Schema - see this discussion for more context. Regardless, I expanded … As for …
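As an aside, the difference between "relaxed" and strict inference here is whether the generated schema tags a column with a format annotation. A hypothetical JSON Schema fragment for a strictly-typed date-time column (the property name `date1` is taken from the issue; the exact schema qsv emits may differ):

```json
{
  "properties": {
    "date1": {
      "type": "string",
      "format": "date-time"
    }
  }
}
```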
For the parquet conversion we are using the arrow csv library, and its date handling is not great: it only allows one date format per file and expects all dates in that file to be in that format. The …
Perhaps we can just cite the arrow csv date format limitation and suggest the user normalize the dates first with the … In addition, if you're planning to do this in a pipeline, create a JSON Schema with … That is, until the …
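In the meantime, a pipeline can normalize the dates itself before handing the file to the parquet conversion. A minimal stdlib sketch (the column name `date1` and the two accepted input formats are assumptions for illustration, not qsv's own logic):

```python
import csv
import io
from datetime import datetime

# Hypothetical input whose `date1` column mixes two datetime layouts.
src = io.StringIO("date1,val\n2021-01-01 10:00:00,1\n02/01/2021 11:30,2\n")
dst = io.StringIO()

# Assumed input formats; a real pipeline would list the ones it expects.
IN_FORMATS = ["%Y-%m-%d %H:%M:%S", "%d/%m/%Y %H:%M"]

def normalize(value: str) -> str:
    """Rewrite a datetime string into ISO 8601 form (with the T)."""
    for fmt in IN_FORMATS:
        try:
            return datetime.strptime(value, fmt).isoformat()
        except ValueError:
            continue
    return value  # leave unparseable values untouched

reader = csv.DictReader(src)
writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
writer.writeheader()
for row in reader:
    row["date1"] = normalize(row["date1"])
    writer.writerow(row)

print(dst.getvalue())
```

After this step every `date1` value is in a single format, which is exactly what the arrow csv reader requires.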
@aborruso @jqnatividad I finally worked out exactly which date formats the arrow library will accept by default, and for those cases the date type for parquet should work. They are …
This will be fixed when #737 is merged.
Thank you very much @kindly
Closing this now that #736 has been merged. I did some quick tests and am currently adding test cases.
@jqnatividad this test file may be useful for date formats: https://github.com/kindly/csvs_convert/blob/main/fixtures/parquet_date.csv
Hi,
if I import this input CSV file into duckdb, I get this schema by automatic inferencing:

And if I apply `datefmt` to this input CSV, I get a correct datetime field. But if I use `schema` or `to` (to create a parquet file) on the original file, the `date1` field is mapped as a `string` field.

It would be great to have auto inferencing of datetime fields, even when they are not written perfectly (in my input I do not have the `T`, I have a space).

Thank you
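For what it's worth, the space-separated form described here is still machine-parseable; Python's stdlib accepts it directly and can rewrite it into the `T` form (a small illustration of the normalization, not the tool's own inference):

```python
from datetime import datetime

raw = "2021-01-01 10:00:00"        # space instead of the T
dt = datetime.fromisoformat(raw)   # the space separator is accepted
print(dt.isoformat())              # -> 2021-01-01T10:00:00
```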