UTC & leap seconds (again! again!!!) #400
-
I'm all for making the docs more precise and clear, which I think is what much of this is about. But is there any of this that might cause actual problems with using the new calendar? To a few points:
Sure -- that seems a bit more correct.
I don't think that's wise. As you point out above, UTC is not a calendar per se -- a calendar specifies days, and the time within that is kind of a separate thing. However, for CF, they are a single concept: a "calendar" is the full spec for how to interpret (and produce) datetime strings. So the only difference between, say, the standard calendar and utc is leap seconds -- there is just as much validity to specifying a timezone offset with the UTC calendar as with any other. In that case, the time is not UTC, but it is relative to UTC -- so that's fine.
exactly, so it's valid from that date, yes? what's the problem here?
I think that was pretty deliberate -- we try not to re-write specifications that others are responsible for. That being said, a bit more detail, and in particular a link to an external source with the full explanation would be great -- can you propose one?
I'm guessing that was out of ignorance ....
That sounds good to me -- but I'll let others that may know better weigh in. The goal here is that people can use a UTC timestamp, convert that to a CF-style "time-since" form (and vice versa), and clearly define what that means. If that's not possible pre-1972, then yes, we shouldn't allow it.
The issue is not that data producers don't know whether leap seconds are included in their timestamps -- the problem is that most of the software out there is not leap-second aware. So if a producer knows they have proper UTC timestamps, but their software (and that of their users) doesn't support leap seconds, then they need to use the standard calendar. Practicality beats purity, and all that.
agreed -- we should remove that.
because However, now that you mention it, I see in section 3.1.2: "A variable must not have a That needs an update.
-
Dear Patrick Thanks for your comments and the discussion.
Best wishes and happy New Year (whenever it exactly occurs for you!) Jonathan
-
Dear Jonathan, Happy New Year to you and all other collaborators on the Gregorian calendar! (For those of you on the Julian calendar, do not hesitate to fault me if I forget about your New Year in 9 days' time.) I remain rather uncomfortable with some of the arguments made in previous comments. As I understand these arguments, they are based more on ease-of-use and every-day practice rather than standards and principles. UTC is one thing and one thing only. Time zones are referenced to UTC so to call a calendar The first use-case that was made for a calendar using UTC would very unlikely be using the local time as reported by a local computer synchronising with an NTP server, and which might result in a datetime as indicated above. And that is probably the case with the large majority of data producers using the CF conventions. In climate and weather forecasting science, practitioners tend to know their clocks, and how to convert from their local time as reported by their instruments and processing environment into UTC. So, again, why the extra whistles and bells? It should also be considered that The CF calendar Best,
-
Dear Patrick Thanks for your further thoughts. You comment that the arguments "are based more on ease-of-use and every-day practice rather than standards and principles." I think that's partly true. Perhaps you've seen that the preceding discussions on the issue of leap seconds were extraordinarily long and difficult. It seemed to me that these arguments were often about what things ought to mean "in principle". Because of their different understanding of these principles, the participants in the debate could not agree, and perhaps sometimes could not understand one another properly. In this case, we could not achieve a consensus about all the principles involved. This is a very unusual outcome for CF discussions, but it's not a show-stopper, because CF is a convention i.e. an agreement, rule, practice or custom. Even though we couldn't agree on what things ought to mean, we did manage to agree some workable conventions. You're not alone in being uncomfortable about some of our choices. That's regrettable, but perhaps can't be helped. I think the important question to consider is whether there are any hazards with these conventions that are likely to cause data-producers to write incorrect metadata, or data-users to misinterpret the metadata. If there are, we should either clarify the standards document, or consider changing the conventions. This is consistent with one of the principles for design (Sect 1.2):
You wrote about some possible hazards before, and I have responded with possible remedies. Thanks for those points. Here, you're suggesting that time zones in
Timezones have always been allowed in CF datetimes. This convention was inherited from COARDS, which contains the example in Mountain Daylight Time which you refer to. As COARDS says, that example comes from the UDUNITS documentation. Another of the CF principles is
If you have a data-logger which records local time and is leap-second-aware, the datetimes would most conveniently be expressed in CF using the In the example you give, you might have a time coordinate of 64801 with Where would the confusion arise? Do you think the data-user might ignore the timezone in the Best wishes Jonathan
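A minimal sketch of how an offset in the reference datetime can be interpreted, assuming Python's standard `datetime` module (which is not leap-second-aware, so this corresponds to the behaviour of the standard calendar). The units string is the COARDS/UDUNITS example referred to above; pairing it with the coordinate value 64801 is an assumption made purely for illustration.

```python
from datetime import datetime, timedelta, timezone

# Reference datetime from the COARDS/UDUNITS example units string
# "seconds since 1992-10-8 15:15:42.5 -6:00" (offset of -6 h relative to UTC).
reference = datetime(1992, 10, 8, 15, 15, 42, 500000,
                     tzinfo=timezone(timedelta(hours=-6)))

# Hypothetical coordinate value, chosen only for illustration.
time_coordinate = 64801

# Decode: add the elapsed seconds to the reference instant.
decoded = reference + timedelta(seconds=time_coordinate)

# The same instant expressed with a zero offset.
decoded_utc = decoded.astimezone(timezone.utc)

print(decoded.isoformat())      # offset of the reference preserved
print(decoded_utc.isoformat())  # zero offset
```

Both printed strings label the same instant; the offset only changes how it is written down.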
-
I’m on vacation, so brief, but a few notes.
Here I have to disagree. The U stands for Universal, UTC time is the same
all over the world at a single instant, irrespective of where you are.
Bangkok, London and Denver all have the same UTC time at the same instant.
Of course, but the offset is with respect to UTC, and it is provided as
part of the string, so there is no confusion about what it means.
And this isn’t a CF-specific oddity; it’s part of ISO 8601.
As you say, CF is indeed focused on “ease-of-use and every-day practice
rather than standards and principles.”
“Practicality beats purity” as the Python folks say.
Indeed, that’s why UTC has not been supported in CF until now: very little
software (and none of the most commonly used tools) supports it properly.
Note: we should try to use “time zone offset”, or “offset”, rather than
“time zone”. Everyone in this discussion knows the difference, but when
written down, we should make sure to minimize confusion.
One of the practical challenges is that a LOT of people assume UTC means
“time at the prime meridian”, i.e. with no time zone offset applied. It
does mean that, but it is also more specific than that, and that
distinction is lost on many people.
That rate was adjusted three more times before 1972 …
The values may have been coincident on that day, but they immediately
started drifting apart.
Ah, thanks -- then I agree: 1972+ makes better sense.
What about civil, as a suggestion?
I think that would be even more confusing.
UTC +- offset is precise and clear and very commonly used.
…-CHB
-
I believe we may be converging on an outcome that is both practical and standards-based; maybe even both practical and pure?
Correct, but that is local time or the common understanding of Rather than tossing out UTC as a calendar option, I would propose the following:
Would something like that work? As a final note on standards versus practice: The CF conventions are far more influential than you make it sound. The decisions reached on issues such as calendars reverberate throughout the larger CF community and materialize in data sets that hang around for decades. The
-
Hi, On the 1958-1972 issue: I think we should keep the 1958 epoch. This change in approach before/after 1972 was indeed considered during the creation of the new text, and the wording we used was deliberately chosen to allow for any frequency of leap second application (e.g. daily), and an individual adjustment being any amount of seconds (e.g. negative amounts and non-integer amounts). I think that what we have now is entirely consistent with the messy situation that occurred in 1958-1972, so there is no need to change the minimum valid time. If you can deal with leap seconds applied after 1972, then dealing with the situation before 1972 (should have such data) is logically no different. (Whilst it currently seems unlikely, the leap second situation could get more messy again in the future, in which case the conventions will still hold, and we'd be glad we didn't arbitrarily remove part of the valid time period.) On the timezone/calendar-name issue: To me, the use of a timezone offset is merely an encoding choice in the Cheers,
-
This is more of a general comment than a response to any of the previous comments in this discussion. I think that we have to be careful about what words we use in relation to all this. For example what do we mean by the word "calendar" (as a general concept) and when do we mean These are maybe silly questions. But at the same time, CF is trying to make several different "clocks" and "calendars" come together into one concept; the CF I think that CF would be well served by general discussion about all this. For example, there seems to be a strong push towards having the "datetimestamp" labelling (strings such as "2025-01-07 12:00:00Z") as a vehicle for unifying points in time across different calendars. This can be a very useful and pragmatic solution that is legitimate and accepted in many situations. But in other situations such a solution cannot be justified, or may even be invalid, whereas other pragmatic solutions exist. Trying to fold any such pragmatic solution into the CF Conventions is doomed to cause trouble and limit its usefulness. (1) Little, C. (2024, September 17). OGC Temporal Domain Working Group Activities. 2024 CF Workshop, Swedish Meteorological and Hydrological Institute (SMHI), Norrköping, Sweden. https://doi.org/10.5281/zenodo.14194152.
-
Dear Lars You're right that we have to be careful about words. In the world as a whole, the words you mention may be used in various ways. If there is inconsistency, we can't help it. Within the CF convention, however, we can be careful to use our words consistently. I hope that within the convention document we always mean "calendar" and "datetime" in the senses with which they're defined in Sect 1.3. On briefly looking, I believe that "date" separately always refers to just the year-month-day part of the datetime. A "time" coordinate represents a datetime, of course, not just a time of day; there's a lot in Sect 4.4 about it! In the standards document, "stamp" occurs only once, in Appendix B; we could delete it. The word "clock" occurs only once, in Sect 4.4.1; we could replace it. The phrase "regular interval" occurs only once, in App J, where it refers to distance rather than time. Best wishes Jonathan
-
From my own perspective, I think that Chris Little's presentation⁽¹⁾ at
the 2024 CF Conventions Workshop was very useful, and in this context
particularly viewgraphs 17-19.
I think that CF would be well served by general discussion about all this.
Is there a freely available summary of the new ISO 34000?
Or do any of you have access to a copy?
It seems that could be helpful to reference.
…-CHB
-
If you can deal with leap seconds applied after 1972, then dealing with
the situation before 1972 (should have such data) is logically no
different.
IIUC (and I may not), it IS logically different -- that is, rather than
applying a “leap second” integer or not, at particular times, there was a
slight adjustment to the size of every second in a range (did someone call
them rubber seconds?). Which is a different approach, and would require
different code to handle.
@pvanlaake: could you please confirm?
My concern about the addition of a UTC Calendar is that it is an attractive
nuisance -- a lot of people *think* they are using UTC, when they are not
(quite) -- and the software they are using may not support it properly.
Allowing pre-1972 may make it even more likely for not-really-UTC data to
be presented as UTC.
-
“””
Non-leap-second-aware software would calculate these differences as 1 SI
second and 1 SI day respectively. Leap-second-aware software would give the
differences as 0.999999985 SI seconds
“””
Well, “rubber seconds” aren’t the same as leap seconds, though they serve a
similar purpose. With leap seconds, each adjustment is a whole number of seconds.
So we *could* say that “fully UTC-compliant software could manage that”, but it
can’t be handled with integer seconds.
In fact, it looks like you’d need finer than microseconds, so pretty
problematic.
(And single precision float wouldn’t work either).
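A rough check of the precision point, using only the Python standard library. It takes the interval of 0.999999985 SI seconds quoted above and stores it three ways; the helper name `to_float32` is made up for this sketch.

```python
import struct

# Interval quoted in the discussion, in SI seconds.
interval = 0.999999985

def to_float32(x: float) -> float:
    """Round-trip a value through IEEE 754 single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

as_float32 = to_float32(interval)                           # single precision
as_microseconds = round(interval * 1_000_000) / 1_000_000   # microsecond resolution
as_nanoseconds = round(interval * 1_000_000_000)            # integer nanoseconds

print(as_float32, as_microseconds, as_nanoseconds)
```

Under these assumptions both the single-precision value and the microsecond-resolution value collapse to exactly one second, while double precision or an integer nanosecond count keeps the difference.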
And in any case when mapping between calendars, you couldn’t do it at all
if the other calendar only handled integer seconds …
So yes, it’s still a well defined transform, but from a practical
perspective it’s a pretty different beast.
Again, I’m concerned about the attractive nuisance — is there ANY software
that can handle this correctly?
So wouldn’t it be better to disallow it?
Hmm — thinking now, it might actually work better to use the standard
calendar in that case. You could use unaware software to transform back and
forth, and it would be lossless, unlike leap seconds, where that almost
works, except when there is an actual leap second, as most (all) non-leap
second aware software doesn’t allow a 60th second.
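As an illustration of that last point, here is a small sketch with Python's standard `datetime` module, taken as one example of software that is not leap-second-aware. The reference datetime and the sample timestamp are arbitrary choices for this sketch, not values from the conventions.

```python
from datetime import datetime, timedelta

# Hypothetical reference datetime for a "seconds since ..." encoding.
reference = datetime(1958, 1, 1)

def encode(stamp: datetime) -> float:
    """Timestamp -> elapsed seconds, ignoring leap seconds."""
    return (stamp - reference).total_seconds()

def decode(seconds: float) -> datetime:
    """Elapsed seconds -> timestamp, ignoring leap seconds."""
    return reference + timedelta(seconds=seconds)

# A pre-1972 timestamp survives the round trip unchanged.
stamp = datetime(1965, 3, 1, 12, 0, 0)
round_trip_ok = decode(encode(stamp)) == stamp

# The leap second at the end of 2016 (23:59:60) cannot even be constructed.
try:
    datetime(2016, 12, 31, 23, 59, 60)
    leap_second_accepted = True
except ValueError:
    leap_second_accepted = False

print(round_trip_ok, leap_second_accepted)
```

The round trip is lossless because every label the software can produce is also one it can read back; the only label it cannot handle is the 60th second itself.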
-
CF is usually enhanced, expanded or changed based on concrete use cases. Over the years it has been argued, and accepted, that there are enough use cases to include the UTC and TAI However, concrete use cases are missing, even though there have been many references to the satellite folks needing it, and to high-resolution measurements that may need this kind of precision. When I asked a well-placed EUMETSAT person I got the response that they reset their clocks at midnight (we did not go into any further details, so this is a bit of hearsay). And the only recent concrete high-frequency data collection example was from David @davidhassell, where they used UTC but were not even aware of the concept of leap seconds (David please correct me if I have mis-interpreted something). Way back I was involved in field measurements using sonic anemometers where the raw 20 Hz output was stored and the system clock was in the measurement hardware. But the final data was then reduced to 5- or 10-min covariances (etc), meaning that the second precision was not needed anymore. Similar systems today I imagine would either use NTP or GPS as clock. All in all, without concrete use cases we are not on firm ground, and I would really like to have at least one concrete use case to help us here.
-
Dear all I think David is correct in asserting that the current text is not wrong, because in Sect 4.4.3 we say, for integer-second datetimes, "a time coordinate value expressed in seconds equals the number of valid (integer-second) datetimes after (not including) the reference datetime in the units up to (and including) the datetime that the time coordinate represents." That is, the time coordinate value is not the length of the time interval in SI seconds between the two instants. It is actually the count of the number of valid integer-second datetimes between them. These two quantities are equal except when a leap second intervenes in calendars where leap seconds are not valid datetimes, most importantly the In that sense, rubber seconds are less problematic than leap seconds because, as Chris remarked, the However, the current text isn't adequate in Section 4.4.2, because we don't mention that seconds aren't SI seconds in UTC before 1972, or about the 0.1 s and 0.2 s jumps. As a curiosity, how did they deal with these in datetimes? If 0.1 s was inserted at the end of a day, for instance, did the clock jump from 23:59:60.1 to 0:0:0? The simplest solution is to disallow the In other calendars, before 1972, when real-world datetimes are encoded as time-coordinates, the difference between two time-coordinates is in general not exactly equal to the length of the time-interval between them in SI seconds. Obviously it can't be, before the atomic clock was invented. The further back you go, the less accurate and universal the second was. For me, this underlines that the correct general interpretation of a real-world time-coordinate is as an encoded datetime. This encoding is additionally useful because, since 1972, and except for leap seconds, the difference between two time-coordinates equals the length of the intervening interval, but that's not the fundamental definition. Of course, in idealised model calendars, it's always exactly true. Best wishes Jonathan
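The counting definition quoted from Sect 4.4.3 can be sketched as follows. This is an illustration only, not text from the conventions: the function names are invented, the leap second table is a two-entry excerpt rather than a complete list, and the inputs are assumed to be integer-second datetimes that Python can represent (so neither input can itself be a 23:59:60 label).

```python
from datetime import datetime

# Excerpt only: the midnight (UTC) immediately following each of two real
# leap seconds, 2015-06-30 23:59:60 and 2016-12-31 23:59:60.
MIDNIGHT_AFTER_LEAP_SECOND = [datetime(2015, 7, 1), datetime(2017, 1, 1)]

def count_without_leap_seconds(reference: datetime, target: datetime) -> int:
    """Valid integer-second datetimes in (reference, target] when
    23:59:60 is not a valid datetime."""
    return int((target - reference).total_seconds())

def count_with_leap_seconds(reference: datetime, target: datetime) -> int:
    """Same count when each 23:59:60 leap second is a valid datetime."""
    extra = sum(1 for midnight in MIDNIGHT_AFTER_LEAP_SECOND
                if reference < midnight <= target)
    return count_without_leap_seconds(reference, target) + extra

reference = datetime(2016, 12, 31, 23, 59, 59)
target = datetime(2017, 1, 1, 0, 0, 0)

print(count_without_leap_seconds(reference, target))  # leap second not a valid label
print(count_with_leap_seconds(reference, target))     # 23:59:60 and 00:00:00 both counted
```

For this pair of datetimes the two counts differ by exactly one, which is the case described above where a leap second intervenes.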
-
Topic for discussion
A Wise Man once said “If you think you know all about Time, you haven’t thought long enough”. Hence I beg your forgiveness for bringing up the UTC / leap seconds issue after the extended discussion (#304) and the recent inclusion in CF 1.12 (not to mention my missing the PR).
First of all, great job on the revamped section 4.4 and happy to see some new “calendars” added.
That immediately brings me to the first issue: Neither UTC nor TAI are calendars, they are clocks. So rather than saying for `utc` “A Gregorian calendar with leap seconds as prescribed by UTC”, can this be changed to “Coordinated Universal Time (UTC) which includes leap seconds and where dates are represented using the Gregorian calendar”? And likewise for `tai`.
In the description of both calendars I would add that time zone offsets are not allowed and that a time zone indication, when given in a datetime, must be 0.
UTC
More concerning, from my perspective, is the definition of `utc`. As defined, it starts in 1958, but UTC did not exist until 1960 and was only officially adopted in 1963 (it was based on the same epoch as TAI, which is 1958-01-01 00:00:00 UT). The issue of leap seconds in UTC is not meaningfully addressed in the new text. Of course, there is section 4.4.3, but that really only deals with the consequences of having to consider leap seconds. I have seen the arguments being made in issue #542 but it feels unbalanced to have the detailed explanation of figure 4.1 but no basic information on leap seconds (such as where to find an authoritative list of the 27 leap seconds since 1972) or how it differs from the other calendars.
Further, prior to 1972 there were no leap seconds. The divergence of nearly 10 seconds between UTC and TAI that accumulated between 1958-01-01 and 1971-12-31 was applied through three mechanisms: (1) a shifting in the length of the UTC second, at various rates over the period 1960-1971; (2) frequent small increments of multiples of 50 ms; and finally (3) an anomalous jump of 0.107758 SI seconds to make the difference exactly 10 seconds at the onset of 1972-01-01. In other words, prior to 1972, the second used by UTC is not equal to the SI second. The introduction of the leap second was made exactly to align the UTC second with the SI second and to make time shifts to align the UTC clock with the celestial reality less burdensome (all of this before the advent of modern computer and communication infrastructure). You may find all the details here.
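To make the post-1972 bookkeeping concrete, here is a sketch that tabulates the 27 leap seconds and derives the whole-second offset TAI minus UTC from them. The dates were transcribed by hand for this illustration and should be checked against the authoritative IERS Bulletin C announcements before being relied upon; the function name is invented.

```python
from datetime import date

# UTC dates whose final minute contained a positive leap second (23:59:60).
# Hand-transcribed for illustration; verify against IERS Bulletin C.
LEAP_SECOND_DATES = [
    date(1972, 6, 30), date(1972, 12, 31), date(1973, 12, 31),
    date(1974, 12, 31), date(1975, 12, 31), date(1976, 12, 31),
    date(1977, 12, 31), date(1978, 12, 31), date(1979, 12, 31),
    date(1981, 6, 30), date(1982, 6, 30), date(1983, 6, 30),
    date(1985, 6, 30), date(1987, 12, 31), date(1989, 12, 31),
    date(1990, 12, 31), date(1992, 6, 30), date(1993, 6, 30),
    date(1994, 6, 30), date(1995, 12, 31), date(1997, 6, 30),
    date(1998, 12, 31), date(2005, 12, 31), date(2008, 12, 31),
    date(2012, 6, 30), date(2015, 6, 30), date(2016, 12, 31),
]

# Offset in seconds at 1972-01-01 00:00:00 UTC, as described above.
TAI_MINUS_UTC_AT_1972 = 10

def tai_minus_utc(day: date) -> int:
    """TAI - UTC in whole seconds at 00:00:00 UTC on `day`."""
    if day < date(1972, 1, 1):
        raise ValueError("integer leap seconds only apply from 1972-01-01")
    return TAI_MINUS_UTC_AT_1972 + sum(1 for d in LEAP_SECOND_DATES if d < day)

print(len(LEAP_SECOND_DATES), tai_minus_utc(date(2017, 1, 1)))
```

The `ValueError` branch reflects the point made above that the pre-1972 adjustments were not whole seconds, so a table of integer steps cannot describe them.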
It is then a bit of a mystery to me why CF would want to dive head-first into this murky 1960-1971 business that many Wise Men have tried to obliterate through the introduction of the leap second. Is there a known use case for data sets using UTC that span the early years of that clock?
I would propose to side-step all of this complication by changing the definition of `utc` to begin at 1972-01-01 00:00:00, being the instant when UTC started using the SI second and leap seconds were introduced, with no time coordinates prior to that instant allowed.
units_metadata
I do not understand the purpose of this argument. The `utc` calendar was introduced to enable data producers that need second accuracy to record their observations correctly. When leap seconds are of concern, then why would the data producer use the `standard` or `proleptic_gregorian` calendar? Having this need for accuracy but not knowing whether or not the data collection equipment properly records and processes leap seconds does not seem like a real-world combination to me.
Why is there reference to the `julian` calendar here? Leap seconds are defined in the context of UTC and that uses the Gregorian calendar exclusively.
Finally, why an attribute `units_metadata` with a compound structure? Wouldn’t a simple attribute `leap_seconds` suffice?