dramatic loss of timestamp accuracy! #11

randomgambit · 2020-03-04T19:00:24Z

Hi @hannesmuehleisen, I think I have found a fairly severe bug.
Consider this example in Python:

import pyarrow as pa
import pandas as pd
import numpy as np

mydf = pd.DataFrame({'mytime' : [pd.to_datetime('2020-01-01 10:10:10.123456'),
                                pd.to_datetime('2020-01-01 10:10:10.234567')],
                     'value' : [1,2]})

mydf.head()
Out[137]: 
                      mytime  value
0 2020-01-01 10:10:10.123456      1
1 2020-01-01 10:10:10.234567      2

#now writing to parquet file
mydf.to_parquet('testfile_spark.pq', engine = 'pyarrow', flavor = 'spark')

Reading the file back in Python works fine:

one = pd.read_parquet('testfile_spark.pq')

one.head()
Out[134]: 
                      mytime  value
0 2020-01-01 10:10:10.123456      1
1 2020-01-01 10:10:10.234567      2

Unfortunately, reading the same file in R with miniparquet truncates the timestamps to whole seconds, so the microseconds are lost:

> mymini <- read_parquet('testfile_spark.pq')
> mymini
               mytime value
1 2020-01-01 10:10:10     1
2 2020-01-01 10:10:10     2
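
For anyone trying to narrow this down, here is a minimal sketch of how sub-second precision can be lost when decoding. It assumes the column is stored as a 12-byte INT96 value (nanoseconds within the day followed by a Julian day number, both little-endian). As far as I know, that is what pyarrow's `flavor = 'spark'` produces, but I have not checked it for this file. The helper names below are hypothetical and are not miniparquet's actual code; the only point is that integer-dividing the nanoseconds down to seconds gives exactly the output shown above.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
JULIAN_DAY_OF_UNIX_EPOCH = 2440588  # Julian day number of 1970-01-01


def encode_int96(ts):
    """Pack a UTC datetime into the assumed INT96 layout (illustrative helper)."""
    delta = ts - EPOCH
    nanos_of_day = (delta.seconds * 1_000_000 + delta.microseconds) * 1_000
    julian_day = delta.days + JULIAN_DAY_OF_UNIX_EPOCH
    # 8 bytes of nanoseconds-of-day, then 4 bytes of Julian day, little-endian.
    return struct.pack("<qi", nanos_of_day, julian_day)


def decode_int96(raw):
    """Decode keeping microsecond precision (Python datetimes stop at microseconds)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return EPOCH + timedelta(days=days, microseconds=nanos_of_day // 1_000)


def decode_int96_whole_seconds(raw):
    """Decode while discarding everything below one second (the suspected behaviour)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    days = julian_day - JULIAN_DAY_OF_UNIX_EPOCH
    return EPOCH + timedelta(days=days, seconds=nanos_of_day // 1_000_000_000)


ts = datetime(2020, 1, 1, 10, 10, 10, 123456, tzinfo=timezone.utc)
raw = encode_int96(ts)
print(decode_int96(raw))                # fractional seconds preserved
print(decode_int96_whole_seconds(raw))  # fractional seconds dropped
```

If the real decoder does something like the second function, the fix would be to carry the sub-second remainder through to the R `POSIXct` value, which is a double and can hold fractional seconds.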

What do you think?

Thanks!

randomgambit · 2020-03-16T16:17:16Z

@hannesmuehleisen are you there? Please let me know if you are no longer interested in maintaining this great package! Thanks!

hannes · 2020-03-17T08:19:40Z

Yes, I'm here. We will circle back to miniparquet. I am currently working on support for nested tables in DuckDB, which will then also come to miniparquet. I'm also happy to review a PR.
