-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Datetime units #782
Datetime units #782
Conversation
@joshmoore @MSanKeys963 If you have a moment I wonder if you could comment on plans for datetime data types in Zarr v3. At SciPy 2023 I remarked to Josh that they appeared to be missing from the spec and I think he suggested that they might be reintroduced. It does seem to be common in xarray examples, particularly climate examples. The term |
0fb2c20
to
e68e0e5
Compare
@danielballan: thanks for the ping. Unfortunately, I don't have any updates on |
tiled/server/pydantic_array.py
Outdated
# e.g. `'<M8[ns]'` has `units = 'ns'`. | ||
units = "" | ||
if dtype.kind in ("m", "M"): | ||
if res := re.search(r"\[(.+)\]$", dtype.str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely there is a better way than running a regex on the string version of the dtype ?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried every attribute of dtype, but couldn't find it. I really don't like the regex solution, but this is the best I could come up with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can not dig this out, I think it would be better to go with units = {dtype_obj: 'string', ..}[dtype]
where we directly map the numpy dtype to the string we want for it. A bit of boilerplate for us, but it is clear what is going on and protects us from most of the things I can think of that numpy could change that would mess with us (and it does not user their str repr as a critcal API).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would mean we could also pick some units not to support https://numpy.org/doc/stable/reference/arrays.datetime.html#datetime-units
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Turns out, there is a helper utility function in numpy to extract units from datetime
dtypes, and the units can be even more complicated. Thank you, @danielballan for pointing this out! This is probably the cleanest solution, and I hope numpy wouldn't change this api. I don't think we need to keep the units (strings) and counts (int) separately, though -- a compound attribute is enough -- but keen to hear your thoughts on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is something I would expect them to keep stable.
0be5e43
to
a742bd9
Compare
This expands
BuiltinDtype
by adding explicit units for numpydatetime64
andtimedelta64
dtypes, e.g.datetime64[ns]
. This is required, for unambiguous representation of 8-byte datetime objects, specifically with zarr.The
BuiltinDtype
class now defines an additional attribute,units
, which is""
by default. The units are extracted from (and added to) the string representation ofnumpy.dtype
, e.g.<M8[ns]
.More information of supported units can be found here.
Checklist