Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IH-577 Implement v7 UUID generation #5866

Merged
merged 1 commit into from
Oct 16, 2024

Conversation

rdn32
Copy link

@rdn32 rdn32 commented Jul 19, 2024

  • New function Uuidx.make_v7_uuid, with the idea being that ordering v7 UUIDs alphabetically will also order them by creation time
  • The values produced by Uuidx.make_uuid_urnd hadn't necessarily been valid UUIDs, since the variant and version fields were being filled in randomly - this is now fixed so that it returns v4 UUIDs.
  • There are a couple of functions for generating v4 and v7 from known inputs, for the purpose of unit testing. (The v4 function is mainly there so I could check the setting of variant and version fields by comparing the output with that which Python's UUID module produces.)

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch from 7b9e9f6 to c5085e6 Compare July 19, 2024 13:08
val make_v7_uuid_from_time_and_bytes : int * int64 -> int64 -> 'a t
(** For testing only: create a v7 UUID, as defined in RFC 9562 5.7 *)

val make_v7_uuid : unit -> 'a t
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be changed to accept a function for the timestamp? That way we could use a monotonic clock source. The signature could be (unit -> uint64) -> 'a t
I know it would not be standard-compliant, and we would need to be careful about not persisting the produced timestamps, but I believe losing monotonicity here is something very undesired.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've gone for a hybrid approach whereby I use ptime to get the system time initially, then use mtime to get monotonic adjustments to that - so that we keep monotonicity but still have timestamps that are at least approximately correct.

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch 2 times, most recently from f416ed4 to 037eef9 Compare August 9, 2024 12:31
@lindig
Copy link
Contributor

lindig commented Aug 12, 2024

Is there a chance to upstream this code? We are using uuidm and it would be nice to expose this to more scrutiny by upstreaming it.

@psafont
Copy link
Member

psafont commented Aug 22, 2024

Is there a chance to upstream this code? We are using uuidm and it would be nice to expose this to more scrutiny by upstreaming it.

Because it's a new standard, it has leeway on how it can be implemented, and it needs to be as an additional package, I think there's a low chance to get into upstream. Having the suggestion upstream would maybe make its author to implement it

@rdn32
Copy link
Author

rdn32 commented Aug 30, 2024

It looks like generating a timestamp can't reasonably be done by uuidm (without introducing a dependency) but constructing v7 UUID given a timestamp could be. I can put together a PR. and see how well it goes down.

@rdn32
Copy link
Author

rdn32 commented Sep 13, 2024

It looks like generating a timestamp can't reasonably be done by uuidm (without introducing a dependency) but constructing v7 UUID given a timestamp could be. I can put together a PR. and see how well it goes down.

The PR is dbuenzli/uuidm#14. Daniel Bünzli has added a '👍' reaction, which I take to be a good sign.

@psafont
Copy link
Member

psafont commented Sep 13, 2024

He usually commits his own version of the change while referencing the original author

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch from 037eef9 to 24a4f75 Compare September 13, 2024 14:39
@dbuenzli
Copy link

Please chime in on the PR to discuss this. I think it's a good idea to add basic support for v7 in uuidm.

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch from 24a4f75 to bbb2135 Compare September 27, 2024 17:25
@lindig
Copy link
Contributor

lindig commented Sep 30, 2024

I understand that v7 UUID will be implemented upstream as I suggested - which is great. Could you summarize what this PR achieves?

@rdn32
Copy link
Author

rdn32 commented Oct 4, 2024

I understand that v7 UUID will be implemented upstream as I suggested - which is great. Could you summarize what this PR achieves?

The main thing it does is to give uuidx a wrapper for uuidm's new v7 UUID generation - including producing monotonic-yet-broadly-accurate timestamps.

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch from bbb2135 to 2817fc2 Compare October 11, 2024 10:56
let since_t0 = Mtime_clock.count start |> Mtime.Span.to_uint64_ns in
Int64.add t0 since_t0

let make_v7_uuid () = make_v7_uuid_from_parts (now_ns ()) (rand64 ())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only 2048 possible valued for the rand64, how are we avoiding duplicates?

In general, I'd prefer to use a counter here, it would ensure that no duplicates are ever produced, and that sequentially-created uuids are properly ordered.

However, the rates needed for this to happen are rather high, and I doubt this affects how the toolstack will use the uuids, this can be changed at a later time if we found it's an impediment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only 2048 possible valued for the rand64, how are we avoiding duplicates?

I'd been assuming that Random.State.bits64 would give 2^64 different values. We're only using 62 of its bits, but that still gives us about 4 * 10^18 different values, which I would have thought we could rely on chance to avoid getting duplicate values within 245 nanoseconds of each other. Or is the Random module very much worse than I'd imagined?

@lindig
Copy link
Contributor

lindig commented Oct 15, 2024

Can we run a make format on this to satisfy the checks?

@@ -3,8 +3,13 @@
(public_name uuid)
(modules uuidx)
(libraries
unix (re_export uuidm)
mtime
mtime.clock.os
Copy link

@dbuenzli dbuenzli Oct 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that both {ptime,mtime}.clock.os are deprecated use {ptime,mtime}.clock if you are using the latest version (also clocks are made to re-export the base package so you can also specify only those if you wish so). See https://discuss.ocaml.org/t/ann-ptime-1-2-0-mtime-2-1-0-qrc-0-2-0/15268

@rdn32 rdn32 force-pushed the private/rnewton/IH-577-uuidv7 branch from 2817fc2 to a2d9fbe Compare October 15, 2024 15:45
* New function Uuidx.make_v7_uuid, with the idea being that ordering v7
  UUIDs alphabetically will also order them by creation time. This
  requires uuidm v0.9.9, as that contains the code for constructing a
  v7 UUID from a time and some random bytes.
* There is a function for generating v7 from known inputs, for the
  purpose of unit testing. Arguably this is pointless to have unit tests
  for third-party code, but the tests were written to test code that was
  submitted to uuidm only later, and I'm always loathe to delete tests.

Signed-off-by: Robin Newton <[email protected]>
@lindig lindig added this pull request to the merge queue Oct 16, 2024
Merged via the queue into xapi-project:master with commit e650ca0 Oct 16, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants