gh-89083: add support for UUID version 6 (RFC 9562) #120650

Open
wants to merge 63 commits into base: main
Changes from 8 commits
Commits (63)
818d417
add implementation
picnixz Jun 17, 2024
16565f2
add tests
picnixz Jun 17, 2024
e6c1d5f
add docs
picnixz Jun 17, 2024
cbaaff4
add WhatsNew
picnixz Jun 17, 2024
4ef04b9
blurb
picnixz Jun 17, 2024
943d13e
fix a mask
picnixz Jun 17, 2024
8344e64
fix random bytes generation
picnixz Jun 17, 2024
295d82d
fixup some comments
picnixz Jun 17, 2024
1aaa483
Update Lib/uuid.py
picnixz Jun 21, 2024
6847b77
Update Doc/whatsnew/3.14.rst
picnixz Jun 22, 2024
4d9862e
revert modifications on properties for now
picnixz Jun 22, 2024
08607f7
fixup
picnixz Jun 22, 2024
55edd0c
update variable names
picnixz Jun 22, 2024
5b15134
remove references to v7 and v8
picnixz Jun 28, 2024
e84cf17
Merge branch 'main' into uuid-rfc-9562
picnixz Jul 22, 2024
9bc8090
Merge branch 'main' into uuid-v6-89083
picnixz Aug 21, 2024
c3d4745
add UUIDv8 implementation
picnixz Aug 22, 2024
392d289
add tests
picnixz Aug 22, 2024
26889ea
blurb
picnixz Aug 22, 2024
44b66e6
add What's New entry
picnixz Aug 22, 2024
7be6dc4
add docs
picnixz Aug 22, 2024
a276857
add test vectors
picnixz Aug 22, 2024
8ba3d8b
Improve hexadecimal masks reading
picnixz Sep 25, 2024
a14ae9b
add uniqueness test
picnixz Sep 25, 2024
7a169c9
Update mentions to RFC 4122 to RFC 4122/9562 when possible.
picnixz Sep 25, 2024
b082c90
Update docs
picnixz Sep 25, 2024
94c70e9
Merge branch 'main' into uuid-v8-89083
picnixz Sep 25, 2024
4907650
Merge branch 'main' into uuid-rfc-9562
hugovk Nov 2, 2024
275deb7
Merge branch 'main' into uuid-v8-89083
hugovk Nov 2, 2024
5e97cc3
Apply suggestions from code review
picnixz Nov 11, 2024
051f34e
Update Lib/test/test_uuid.py
picnixz Nov 11, 2024
bdf9a77
Apply suggestions from code review
picnixz Nov 11, 2024
394a805
Merge branch 'main' into uuid-v6-89083
picnixz Nov 13, 2024
a40c19b
Merge remote-tracking branch 'origin/uuid-v8-89083' into uuid-v6-89083
picnixz Nov 13, 2024
00661fc
Merge remote-tracking branch 'origin/uuid-v8-89083'
picnixz Nov 13, 2024
c188ced
post-merge
picnixz Nov 13, 2024
7e5d364
update docs
picnixz Nov 13, 2024
b8ddc02
improve readability
picnixz Nov 13, 2024
384a02e
improve test readability
picnixz Nov 13, 2024
e4a7198
improve test coverage
picnixz Nov 13, 2024
aed5839
update docs
picnixz Nov 14, 2024
6daae22
Merge remote-tracking branch 'origin/uuid-rfc-9562' into uuid-v6-89083
picnixz Nov 14, 2024
bca3776
Merge remote-tracking branch 'upstream/main' into uuid-v6-89083
picnixz Nov 14, 2024
2df6f41
Merge remote-tracking branch 'upstream/main'
picnixz Nov 15, 2024
2d003fa
Merge branch 'main' into uuid-v6-89083
picnixz Nov 15, 2024
5ad6268
post-merge
picnixz Nov 15, 2024
d49855d
post-merge
picnixz Nov 15, 2024
a5682f8
fix comment
picnixz Nov 15, 2024
969f1c5
Merge branch 'main' into uuid-v6-89083
picnixz Nov 15, 2024
cc459dd
use versionchanged instead of versionadded
picnixz Nov 15, 2024
0d9f687
fix typo
picnixz Nov 15, 2024
d1a6cd8
Merge branch 'main' into uuid-rfc-9562
picnixz Nov 16, 2024
d70279f
Cosmetic change
picnixz Nov 16, 2024
b8a0e72
Update Doc/whatsnew/3.14.rst
picnixz Nov 16, 2024
6c6339b
fix lint
picnixz Nov 16, 2024
1e927b6
Update Lib/test/test_uuid.py
picnixz Nov 17, 2024
68394e6
Update Lib/test/test_uuid.py
picnixz Nov 17, 2024
c4a696d
Merge branch 'main' into uuid-rfc-9562
picnixz Dec 5, 2024
e47df67
update docs
picnixz Dec 19, 2024
2de0a05
improve UUIDv6 uniqueness tests
picnixz Dec 19, 2024
b55adc4
explain rationale of randomized clock sequence
picnixz Dec 19, 2024
6c938d7
Merge branch 'main' into uuid-rfc-9562
picnixz Dec 21, 2024
09ee619
Merge branch 'main' into uuid-rfc-9562
picnixz Jan 11, 2025
33 changes: 32 additions & 1 deletion Doc/library/uuid.rst
Expand Up @@ -149,9 +149,12 @@ which relays any information about the UUID's safety, using this enumeration:

.. attribute:: UUID.version

The UUID version number (1 through 5, meaningful only when the variant is
The UUID version number (1 through 8, meaningful only when the variant is
:const:`RFC_4122`).

.. versionchanged:: 3.14
Added UUID versions 6, 7, and 8.

.. attribute:: UUID.is_safe

An enumeration of :class:`SafeUUID` which indicates whether the platform
Expand Down Expand Up @@ -216,6 +219,34 @@ The :mod:`uuid` module defines the following functions:

.. index:: single: uuid5


.. function:: uuid6(node=None, clock_seq=None)

TODO

.. versionadded:: 3.14

.. index:: single: uuid6


.. function:: uuid7()

TODO

.. versionadded:: 3.14

.. index:: single: uuid7


.. function:: uuid8(a=None, b=None, c=None)

TODO

.. versionadded:: 3.14

.. index:: single: uuid8


The :mod:`uuid` module defines the following namespace identifiers for use with
:func:`uuid3` or :func:`uuid5`.

Expand Down
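Review note: the `UUID.version` wording above says the version is meaningful only when the variant is `RFC_4122`. A minimal sketch of that behaviour follows. It relies only on the existing `uuid.UUID` parser, so it does not need the new `uuid6()` function; the sample string is just an illustrative value whose version nibble is 6.

```python
import uuid

# Version nibble (13th hex digit) is 6 and the variant bits are 0b10,
# so the version attribute is meaningful here.
u6 = uuid.UUID("1ec9414c-232a-6b00-b3c8-9f6bdeced846")
print(u6.variant, u6.version)

# The nil UUID has the NCS variant, so no version is reported.
nil = uuid.UUID(int=0)
print(nil.variant, nil.version)
```

Parsing an existing UUID string never validates the version range; only the `version=` keyword of the constructor does.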
11 changes: 11 additions & 0 deletions Doc/whatsnew/3.14.rst
Expand Up @@ -118,6 +118,17 @@ symtable

(Contributed by Bénédikt Tran in :gh:`120029`.)

uuid
----

* Add support for UUID versions 6, 7, and 8 as specified by
:rfc:`9562` to the :mod:`uuid` module:

* :func:`~uuid.uuid6`
* :func:`~uuid.uuid7`
* :func:`~uuid.uuid8`

(Contributed by Bénédikt Tran in :gh:`89083`.)

Optimizations
=============
Expand Down
87 changes: 84 additions & 3 deletions Lib/test/test_uuid.py
@@ -1,3 +1,4 @@
import random
import unittest
from test import support
from test.support import import_helper
Expand All @@ -10,6 +11,7 @@
import pickle
import sys
import weakref
from itertools import product
from unittest import mock

py_uuid = import_helper.import_fresh_module('uuid', blocked=['_uuid'])
Expand Down Expand Up @@ -267,7 +269,7 @@ def test_exceptions(self):

# Version number out of range.
badvalue(lambda: self.uuid.UUID('00'*16, version=0))
badvalue(lambda: self.uuid.UUID('00'*16, version=6))
badvalue(lambda: self.uuid.UUID('00'*16, version=42))

# Integer value out of range.
badvalue(lambda: self.uuid.UUID(int=-1))
Expand Down Expand Up @@ -588,15 +590,15 @@ def test_uuid1_bogus_return_value(self):

def test_uuid1_time(self):
with mock.patch.object(self.uuid, '_generate_time_safe', None), \
mock.patch.object(self.uuid, '_last_timestamp', None), \
mock.patch.object(self.uuid, '_last_timestamp_v1', None), \
mock.patch.object(self.uuid, 'getnode', return_value=93328246233727), \
mock.patch('time.time_ns', return_value=1545052026752910643), \
mock.patch('random.getrandbits', return_value=5317): # guaranteed to be random
u = self.uuid.uuid1()
self.assertEqual(u, self.uuid.UUID('a7a55b92-01fc-11e9-94c5-54e1acf6da7f'))

with mock.patch.object(self.uuid, '_generate_time_safe', None), \
mock.patch.object(self.uuid, '_last_timestamp', None), \
mock.patch.object(self.uuid, '_last_timestamp_v1', None), \
mock.patch('time.time_ns', return_value=1545052026752910643):
u = self.uuid.uuid1(node=93328246233727, clock_seq=5317)
self.assertEqual(u, self.uuid.UUID('a7a55b92-01fc-11e9-94c5-54e1acf6da7f'))
Expand Down Expand Up @@ -681,6 +683,85 @@ def test_uuid5(self):
equal(u, self.uuid.UUID(v))
equal(str(u), v)

def test_uuid6(self):
equal = self.assertEqual
u = self.uuid.uuid6()
equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 6)

fake_nanoseconds = 1545052026752910643
fake_node_value = 93328246233727
fake_clock_seq = 5317
with mock.patch.object(self.uuid, '_generate_time_safe', None), \
mock.patch.object(self.uuid, '_last_timestamp_v6', None), \
mock.patch.object(self.uuid, 'getnode', return_value=fake_node_value), \
mock.patch('time.time_ns', return_value=fake_nanoseconds), \
mock.patch('random.getrandbits', return_value=fake_clock_seq):
u = self.uuid.uuid6()
equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 6)

# time_hi time_mid time_lo
# 00011110100100000001111111001010 0111101001010101 101110010010
timestamp = 137643448267529106
equal(u.time_hi, 0b00011110100100000001111111001010)
equal(u.time_mid, 0b0111101001010101)
equal(u.time_low, 0b101110010010)
equal(u.time, timestamp)
equal(u.fields[0], u.time_hi)
equal(u.fields[1], u.time_mid)
equal(u.fields[2], u.time_hi_version)

def test_uuid7(self):
equal = self.assertEqual
u = self.uuid.uuid7()
equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 7)

fake_nanoseconds = 1545052026752910643
# 74 = 12 + 62 fake random bits are packed into 10 bytes (80 bits):
# rand_b (62 bits) occupies the high part and rand_a (12 bits) the
# low part; the 6 most significant bits are zero and ignored by uuid7().
for _ in range(100):
rand_a = random.getrandbits(12)
rand_b = random.getrandbits(62)
fake_bytes = (rand_b << 12) | rand_a
fake_bytes = fake_bytes.to_bytes(10, byteorder='big')

with mock.patch.object(self.uuid, '_last_timestamp_v7', None), \
mock.patch('time.time_ns', return_value=fake_nanoseconds), \
mock.patch('os.urandom', return_value=fake_bytes):
u = self.uuid.uuid7()
equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 7)
fake_milliseconds = (fake_nanoseconds // 1_000_000) & 0xffffffffffff
equal((u.int >> 80) & 0xffffffffffff, fake_milliseconds)
equal((u.int >> 64) & 0x0fff, rand_a)
equal(u.int & 0x3fffffffffffffff, rand_b)

def test_uuid8(self):
equal = self.assertEqual
u = self.uuid.uuid8()

equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 8)

for (_, hi, mid, lo) in product(
range(10), # repeat 10 times
[None, 0, random.getrandbits(48)],
[None, 0, random.getrandbits(12)],
[None, 0, random.getrandbits(62)],
):
u = self.uuid.uuid8(hi, mid, lo)
equal(u.variant, self.uuid.RFC_4122)
equal(u.version, 8)
if hi is not None:
equal((u.int >> 80) & 0xffffffffffff, hi)
if mid is not None:
equal((u.int >> 64) & 0xfff, mid)
if lo is not None:
equal(u.int & 0x3fffffffffffffff, lo)

@support.requires_fork()
def testIssue8621(self):
# On at least some versions of OSX self.uuid.uuid4 generates
Expand Down
121 changes: 109 additions & 12 deletions Lib/uuid.py
@@ -1,8 +1,9 @@
r"""UUID objects (universally unique identifiers) according to RFC 4122.

This module provides immutable UUID objects (class UUID) and the functions
uuid1(), uuid3(), uuid4(), uuid5() for generating version 1, 3, 4, and 5
UUIDs as specified in RFC 4122.
uuid1(), uuid3(), uuid4(), uuid5(), uuid6(), uuid7(), and uuid8() for
generating version 1 to 8 UUIDs as specified in RFC 4122 (superseded
by RFC 9562 but still referred to as RFC 4122 for compatibility purposes).

If all you want is a unique ID, you should probably call uuid1() or uuid4().
Note that uuid1() may compromise privacy since it creates a UUID containing
Expand Down Expand Up @@ -129,7 +130,7 @@ class UUID:
variant the UUID variant (one of the constants RESERVED_NCS,
RFC_4122, RESERVED_MICROSOFT, or RESERVED_FUTURE)

version the UUID version number (1 through 5, meaningful only
version the UUID version number (1 through 8, meaningful only
when the variant is RFC_4122)

is_safe An enum indicating whether the UUID has been generated in
Expand Down Expand Up @@ -214,7 +215,7 @@ def __init__(self, hex=None, bytes=None, bytes_le=None, fields=None,
if not 0 <= int < 1<<128:
raise ValueError('int is out of range (need a 128-bit value)')
if version is not None:
if not 1 <= version <= 5:
if not 1 <= version <= 8:
raise ValueError('illegal version number')
# Set the variant to RFC 4122.
int &= ~(0xc000 << 48)
Expand Down Expand Up @@ -297,17 +298,29 @@ def bytes_le(self):

@property
def fields(self):
if self.version == 6:
# the first field should be a 32-bit integer
return (self.time_hi, self.time_mid, self.time_hi_version,
self.clock_seq_hi_variant, self.clock_seq_low, self.node)
return (self.time_low, self.time_mid, self.time_hi_version,
self.clock_seq_hi_variant, self.clock_seq_low, self.node)

@property
def time_low(self):
if self.version == 6:
return (self.int >> 64) & 0x0fff
return self.int >> 96

@property
def time_mid(self):
return (self.int >> 80) & 0xffff

@property
def time_hi(self):
if self.version == 6:
return self.int >> 96
return (self.int >> 64) & 0x0fff

@property
def time_hi_version(self):
return (self.int >> 64) & 0xffff
Expand All @@ -322,8 +335,9 @@ def clock_seq_low(self):

@property
def time(self):
return (((self.time_hi_version & 0x0fff) << 48) |
(self.time_mid << 32) | self.time_low)
if self.version == 6:
return (self.time_hi << 28) | (self.time_mid << 12) | self.time_low
return (self.time_hi << 48) | (self.time_mid << 32) | self.time_low
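Review note: the two branches of `time` reassemble the same 60-bit timestamp from differently ordered fields. The sketch below redoes that arithmetic with plain shifts on `UUID.int`, so it does not depend on the new `time_hi` property. The helper names are hypothetical, and the v6 string is an assumption: it was derived by hand by repacking the v1 test value from `test_uuid1_time` into the v6 layout.

```python
import uuid

# Number of 100-ns intervals between the UUID epoch (1582-10-15)
# and the Unix epoch (1970-01-01), as used in Lib/uuid.py.
UUID_EPOCH_OFFSET = 0x01b21dd213814000

def timestamp_v1(u: uuid.UUID) -> int:
    """Rebuild the 60-bit timestamp from the v1 layout (low, mid, high)."""
    time_low = u.int >> 96
    time_mid = (u.int >> 80) & 0xffff
    time_hi = (u.int >> 64) & 0x0fff
    return (time_hi << 48) | (time_mid << 32) | time_low

def timestamp_v6(u: uuid.UUID) -> int:
    """Rebuild the 60-bit timestamp from the v6 layout (high, mid, low)."""
    time_hi = u.int >> 96
    time_mid = (u.int >> 80) & 0xffff
    time_low = (u.int >> 64) & 0x0fff
    return (time_hi << 28) | (time_mid << 12) | time_low

def to_unix_ns(timestamp: int) -> int:
    """Convert a UUID timestamp (100-ns units) to Unix nanoseconds."""
    return (timestamp - UUID_EPOCH_OFFSET) * 100

v1 = uuid.UUID("a7a55b92-01fc-11e9-94c5-54e1acf6da7f")
v6 = uuid.UUID("1e901fca-7a55-6b92-94c5-54e1acf6da7f")  # hand-derived, see note above
print(timestamp_v1(v1), timestamp_v6(v6))
```

Both helpers should yield the `timestamp` value asserted in `test_uuid6`; the conversion back to nanoseconds loses the last two decimal digits because the UUID clock ticks in 100-ns units.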

@property
def clock_seq(self):
Expand Down Expand Up @@ -656,7 +670,7 @@ def getnode():
assert False, '_random_getnode() returned invalid value: {}'.format(_node)


_last_timestamp = None
_last_timestamp_v1 = None

def uuid1(node=None, clock_seq=None):
"""Generate a UUID from a host ID, sequence number, and the current time.
Expand All @@ -674,15 +688,15 @@ def uuid1(node=None, clock_seq=None):
is_safe = SafeUUID.unknown
return UUID(bytes=uuid_time, is_safe=is_safe)

global _last_timestamp
global _last_timestamp_v1
import time
nanoseconds = time.time_ns()
# 0x01b21dd213814000 is the number of 100-ns intervals between the
# UUID epoch 1582-10-15 00:00:00 and the Unix epoch 1970-01-01 00:00:00.
timestamp = nanoseconds // 100 + 0x01b21dd213814000
if _last_timestamp is not None and timestamp <= _last_timestamp:
timestamp = _last_timestamp + 1
_last_timestamp = timestamp
if _last_timestamp_v1 is not None and timestamp <= _last_timestamp_v1:
timestamp = _last_timestamp_v1 + 1
_last_timestamp_v1 = timestamp
if clock_seq is None:
import random
clock_seq = random.getrandbits(14) # instead of stable storage
Expand Down Expand Up @@ -719,14 +733,97 @@ def uuid5(namespace, name):
hash = sha1(namespace.bytes + name).digest()
return UUID(bytes=hash[:16], version=5)

_last_timestamp_v6 = None

def uuid6(node=None, clock_seq=None):
"""Similar to :func:`uuid1` but where fields are ordered differently
for improved DB locality.

More precisely, given a 60-bit timestamp value as specified for UUIDv1,
for UUIDv6 the first 48 most significant bits are stored first, followed
by the 4-bit version (same position), followed by the remaining 12 bits
of the original 60-bit timestamp.
"""
global _last_timestamp_v6
import time
nanoseconds = time.time_ns()
# 0x01b21dd213814000 is the number of 100-ns intervals between the
# UUID epoch 1582-10-15 00:00:00 and the Unix epoch 1970-01-01 00:00:00.
timestamp = nanoseconds // 100 + 0x01b21dd213814000
if _last_timestamp_v6 is not None and timestamp <= _last_timestamp_v6:
timestamp = _last_timestamp_v6 + 1
_last_timestamp_v6 = timestamp
if clock_seq is None:
import random
clock_seq = random.getrandbits(14) # instead of stable storage
time_hi_and_mid = (timestamp >> 12) & 0xffffffffffff
time_ver_and_lo = timestamp & 0x0fff
var_and_clock_s = clock_seq & 0x3fff
if node is None:
node = getnode()
int_uuid_6 = time_hi_and_mid << 80
int_uuid_6 |= time_ver_and_lo << 64
int_uuid_6 |= var_and_clock_s << 48
int_uuid_6 |= node & 0xffffffffffff
return UUID(int=int_uuid_6, version=6)
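Review note: a standalone sketch of the packing performed by `uuid6()`, useful for checking the masks by hand. `pack_uuid6` is a hypothetical helper, not part of the patch; it sets the version and variant bits itself instead of passing `version=6`, so it also runs on interpreters whose `UUID` constructor still rejects versions above 5. The inputs are the fake values from the tests.

```python
import uuid

def pack_uuid6(timestamp: int, clock_seq: int, node: int) -> uuid.UUID:
    """Pack a 60-bit timestamp, 14-bit clock sequence and 48-bit node (v6 layout)."""
    value = ((timestamp >> 12) & 0xffffffffffff) << 80  # 48 most significant timestamp bits
    value |= 0x6 << 76                                  # version nibble
    value |= (timestamp & 0x0fff) << 64                 # 12 least significant timestamp bits
    value |= 0b10 << 62                                 # RFC 4122/9562 variant
    value |= (clock_seq & 0x3fff) << 48
    value |= node & 0xffffffffffff
    return uuid.UUID(int=value)

u = pack_uuid6(137643448267529106, clock_seq=5317, node=93328246233727)
later = pack_uuid6(137643448267529107, clock_seq=5317, node=93328246233727)
print(u, u < later)
```

Because the most significant timestamp bits come first, a later timestamp compares greater as an integer, which is the locality property the docstring refers to.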

_last_timestamp_v7 = None

def uuid7():
"""Generate a UUID from a Unix timestamp in milliseconds and random bits."""
global _last_timestamp_v7
import time
nanoseconds = time.time_ns()
timestamp_ms = nanoseconds // 10 ** 6 # may be improved
if _last_timestamp_v7 is not None and timestamp_ms <= _last_timestamp_v7:
timestamp_ms = _last_timestamp_v7 + 1
_last_timestamp_v7 = timestamp_ms
int_uuid_7 = (timestamp_ms & 0xffffffffffff) << 80
# Ideally, we would have 'rand_a' = first 12 bits of 'rand'
# and 'rand_b' = lowest 62 bits, but it is easier to test
# when we pick 'rand_a' from the lowest bits of 'rand' and
# 'rand_b' from the next 62 bits, ignoring the 6 first bits
# of 'rand'.
rand = int.from_bytes(os.urandom(10)) # 80 random bits (ignore 6 first)
int_uuid_7 |= (rand & 0x0fff) << 64 # rand_a
int_uuid_7 |= (rand >> 12) & 0x3fffffffffffffff # rand_b
return UUID(int=int_uuid_7, version=7)
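Review note: the `rand_a` / `rand_b` split described in the comment can be checked in isolation. The sketch below mirrors the masks used in `uuid7()` but takes the timestamp and the random integer as arguments, so it is deterministic. `pack_uuid7` and the sample values are hypothetical; the byte order is passed explicitly so the usage lines also work on Python versions where `int.from_bytes` has no default.

```python
import os
import uuid

def pack_uuid7(timestamp_ms: int, rand: int) -> uuid.UUID:
    """Pack a 48-bit millisecond timestamp and an 80-bit random integer (v7 layout)."""
    value = (timestamp_ms & 0xffffffffffff) << 80
    value |= 0x7 << 76                            # version nibble
    value |= (rand & 0x0fff) << 64                # rand_a: lowest 12 bits of rand
    value |= 0b10 << 62                           # RFC 4122/9562 variant
    value |= (rand >> 12) & 0x3fffffffffffffff    # rand_b: next 62 bits of rand
    return uuid.UUID(int=value)

# Deterministic check with hand-picked fields.
rand_a, rand_b = 0xabc, 0x123456789abcdef
ms = 1545052026752910643 // 10**6
u = pack_uuid7(ms, (rand_b << 12) | rand_a)

# Typical use with real randomness (10 bytes = 80 bits, top 6 bits ignored).
fresh = pack_uuid7(ms, int.from_bytes(os.urandom(10), "big"))
print(u, fresh)
```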

def uuid8(a=None, b=None, c=None):
"""Generate a UUID from three custom blocks.

'a' is the first 48-bit chunk of the UUID (octets 0-5);
'b' is the mid 12-bit chunk (octets 6-7);
'c' is the last 62-bit chunk (octets 8-15).

When a value is not specified, a random value is generated.
"""
if a is None:
import random
a = random.getrandbits(48)
if b is None:
import random
b = random.getrandbits(12)
if c is None:
import random
c = random.getrandbits(62)

int_uuid_8 = (a & 0xffffffffffff) << 80
int_uuid_8 |= (b & 0xfff) << 64
int_uuid_8 |= c & 0x3fffffffffffffff
return UUID(int=int_uuid_8, version=8)
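Review note: since `uuid8()` masks each block, values wider than their field are truncated silently rather than rejected. A minimal sketch of that consequence, using a hypothetical `pack_uuid8` helper that applies the same masks and sets the version and variant bits by hand:

```python
import uuid

def pack_uuid8(a: int, b: int, c: int) -> uuid.UUID:
    """Pack 48-bit, 12-bit and 62-bit custom blocks (v8 layout)."""
    value = (a & 0xffffffffffff) << 80
    value |= 0x8 << 76                    # version nibble
    value |= (b & 0xfff) << 64
    value |= 0b10 << 62                   # RFC 4122/9562 variant
    value |= c & 0x3fffffffffffffff
    return uuid.UUID(int=value)

u = pack_uuid8(0x123456789abc, 0xdef, 0x0123456789abcdef)

# Each argument below is exactly one bit too wide, so every block masks to zero.
overflow = pack_uuid8(1 << 48, 1 << 12, 1 << 62)
zero = pack_uuid8(0, 0, 0)
print(u, overflow == zero)
```

Whether out-of-range arguments should instead raise `ValueError` is a design question for the PR; the sketch only shows what the current masks do.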


def main():
"""Run the uuid command line interface."""
uuid_funcs = {
"uuid1": uuid1,
"uuid3": uuid3,
"uuid4": uuid4,
"uuid5": uuid5
"uuid5": uuid5,
"uuid6": uuid6,
"uuid7": uuid7,
"uuid8": uuid8,
}
uuid_namespace_funcs = ("uuid3", "uuid5")
namespaces = {
Expand Down
@@ -0,0 +1,2 @@
Add :func:`~uuid.uuid6`, :func:`~uuid.uuid7` and :func:`~uuid.uuid8` to the
:mod:`uuid` module as specified by :rfc:`9562`. Patch by Bénédikt Tran.