⚡️ Speed up function `all_identical` by 7% in `pydantic/_internal/_utils.py` #33

codeflash-ai · 2024-11-22T00:17:29Z

📄 `all_identical()` in `pydantic/_internal/_utils.py`

📈 Performance improved by 7% (0.07x faster)

⏱️ Runtime went down from 55.0 milliseconds to 51.5 milliseconds (best of 68 runs)
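
A "best of N runs" timing like the one above can be approximated locally with `timeit`. This is a minimal sketch, not the harness Codeflash used. The helper `best_of`, the two loop functions, and the input size are assumptions made for illustration, and absolute numbers will vary by machine and Python version.

```python
import timeit
from itertools import zip_longest


def best_of(func, repeats=5, number=20):
    """Return the best (minimum) per-call time in seconds over `repeats` runs."""
    timings = timeit.repeat(func, repeat=repeats, number=number)
    return min(timings) / number


# Hypothetical input: two distinct lists holding the very same objects.
items = [object() for _ in range(10_000)]
left, right = list(items), list(items)


def loop_zip_longest():
    # Identity comparison driven by zip_longest (same-length inputs here).
    return all(l is r for l, r in zip_longest(left, right))


def loop_zip():
    # Identity comparison driven by zip, guarded by an explicit length check.
    return len(left) == len(right) and all(l is r for l, r in zip(left, right))


if __name__ == "__main__":
    print(f"zip_longest: {best_of(loop_zip_longest) * 1e3:.3f} ms per call")
    print(f"zip:         {best_of(loop_zip) * 1e3:.3f} ms per call")
```

Taking the minimum over several repeats filters out most noise from other processes, which is why it is the usual figure to report for micro-benchmarks.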

Explanation and details

This change replaces `itertools.zip_longest` with the built-in `zip`. `zip` stops as soon as the shorter iterable is exhausted and needs no fill value, so the loop is simpler and slightly faster when both inputs have the same length. Because `zip` silently ignores trailing items, a separate length check is needed so that inputs of different lengths are not reported as identical.

Here is the optimized version.

Key changes:

1. Replaced `zip_longest` with `zip` to stop the iteration when one of the iterables is exhausted.
2. Added a check that both iterables have the same length before they can be considered identical.

With this change, the function runs faster when the inputs already have the same length and still returns `False` when their lengths differ.
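
The two key changes can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: the names `all_identical_longest` and `all_identical_zip` are hypothetical, and the `zip` variant assumes both arguments support `len()`, which arbitrary iterables such as generators do not.

```python
from itertools import zip_longest
from typing import Any, Iterable, Sequence

_SENTINEL = object()


def all_identical_longest(left: Iterable[Any], right: Iterable[Any]) -> bool:
    """Baseline sketch: pad the shorter input with a sentinel, compare by identity."""
    for left_item, right_item in zip_longest(left, right, fillvalue=_SENTINEL):
        if left_item is not right_item:
            return False
    return True


def all_identical_zip(left: Sequence[Any], right: Sequence[Any]) -> bool:
    """Sketch of the zip-based variant (assumes sized inputs)."""
    # zip would silently drop trailing items, so reject unequal lengths first.
    if len(left) != len(right):
        return False
    for left_item, right_item in zip(left, right):
        if left_item is not right_item:
            return False
    return True


if __name__ == "__main__":
    a, b = object(), object()
    print(all_identical_zip([a, b, a], [a, b, a]))      # same objects
    print(all_identical_zip([a, b], [a, b, a]))         # different lengths
    print(all_identical_zip([a, b, [a]], [a, b, [a]]))  # equal, not identical
```

The length check is what preserves the original semantics; without it, `zip` alone would treat `[a]` and `[a, b]` as identical.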

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

🔘 (none found) − ⚙️ Existing Unit Tests

✅ 23 Passed − 🌀 Generated Regression Tests

# imports
import pytest  # used for our unit tests
from pydantic._internal._utils import all_identical

# unit tests

def test_basic_identical():
    # Identical iterables with same objects
    codeflash_output = all_identical([1, 2, 3], [1, 2, 3])
    codeflash_output = all_identical(['a', 'b', 'c'], ['a', 'b', 'c'])
    # Outputs were verified to be equal to the original implementation

def test_basic_non_identical():
    # Non-identical iterables with different objects
    codeflash_output = all_identical([1, 2, 3], [1, 2, 4])
    codeflash_output = all_identical(['a', 'b', 'c'], ['a', 'b', 'd'])
    # Outputs were verified to be equal to the original implementation

def test_edge_empty_iterables():
    # Empty iterables
    codeflash_output = all_identical([], [])
    codeflash_output = all_identical([], [1])
    codeflash_output = all_identical([1], [])
    # Outputs were verified to be equal to the original implementation

def test_edge_different_lengths():
    # Different lengths
    codeflash_output = all_identical([1, 2, 3], [1, 2])
    codeflash_output = all_identical([1], [1, 2, 3])
    # Outputs were verified to be equal to the original implementation

def test_object_identity():
    # Identical objects (same instance)
    a = object()
    codeflash_output = all_identical([a, a], [a, a])
    # Outputs were verified to be equal to the original implementation

def test_equal_but_not_identical_objects():
    # Equal but not identical objects (different instances)
    codeflash_output = all_identical([[], []], [[], []])
    # Outputs were verified to be equal to the original implementation

def test_mixed_types():
    # Different data types in iterables
    codeflash_output = all_identical([1, 'a', 3.0], [1, 'a', 3.0])
    codeflash_output = all_identical([1, 'a', 3.0], [1, 'a', 3])
    # Outputs were verified to be equal to the original implementation

def test_nested_structures_identical():
    # Nested lists with identical objects
    a = object()
    codeflash_output = all_identical([a, [a]], [a, [a]])
    # Outputs were verified to be equal to the original implementation

def test_nested_structures_non_identical():
    # Nested lists with equal but not identical objects
    codeflash_output = all_identical([1, [2]], [1, [2]])
    # Outputs were verified to be equal to the original implementation

def test_large_scale_identical():
    # Large identical lists
    codeflash_output = all_identical(list(range(1000)), list(range(1000)))
    # Outputs were verified to be equal to the original implementation

def test_large_scale_non_identical():
    # Large non-identical lists
    codeflash_output = all_identical(list(range(1000)), list(range(1000)) + [10001])
    # Outputs were verified to be equal to the original implementation

def test_performance_large_data():
    # Performance with large data
    codeflash_output = all_identical([object()] * 1000000, [object()] * 1000000)
    a = object()
    codeflash_output = all_identical([a] * 1000000, [a] * 1000000)
    # Outputs were verified to be equal to the original implementation

def test_special_cases_none_values():
    # Iterables with `None` values
    codeflash_output = all_identical([None, None], [None, None])
    codeflash_output = all_identical([None, 1], [None, 2])
    # Outputs were verified to be equal to the original implementation

def test_special_cases_sentinel_like_objects():
    # Iterables with `_SENTINEL`-like objects
    sentinel = object()
    codeflash_output = all_identical([sentinel], [sentinel])
    sentinel1, sentinel2 = object(), object()
    codeflash_output = all_identical([sentinel1], [sentinel2])
    # Outputs were verified to be equal to the original implementation

if __name__ == "__main__":
    pytest.main()

🔘 (none found) − ⏪ Replay Tests

codeflash-ai bot added the `⚡️ codeflash` label (Optimization PR opened by Codeflash AI) on Nov 22, 2024

codeflash-ai bot requested a review from alvin-r November 22, 2024 00:17
