⚡️ Speed up cell_len() by 44% in rich/cells.py #20

Open · wants to merge 1 commit into base: master


@codeflash-ai codeflash-ai bot commented Jul 3, 2024

📄 cell_len() in rich/cells.py

📈 Performance improved by 44% (1.44x speedup)

⏱️ Runtime went down from 378 milliseconds to 263 milliseconds
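For readers checking the headline figure, the arithmetic can be sketched as follows. It uses only the two runtimes reported above; the variable names are illustrative and not part of the PR.

```python
# Runtimes reported in the PR description, in milliseconds.
before_ms = 378
after_ms = 263

# Speedup factor: how many times faster the new code runs.
speedup = before_ms / after_ms

# "44% faster" is measured relative to the new runtime...
percent_faster = (before_ms - after_ms) / after_ms * 100

# ...while the reduction in wall-clock time is measured against the old runtime.
percent_less_time = (before_ms - after_ms) / before_ms * 100

print(
    f"{speedup:.2f}x speedup, {percent_faster:.0f}% faster, "
    f"{percent_less_time:.0f}% less runtime"
)
```

So the 44% figure corresponds to a speedup of roughly 1.44x, or about 30% less runtime.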

Explanation and details

Here is the rewritten program optimized for runtime and memory requirements.

Explanation of Optimization.

  1. Reduced Redundant Variable Assignment:

    • Removed the local alias _get_size = get_character_cell_size inside cell_len, so get_character_cell_size is now passed directly to the call inside sum.

  2. Use of map Function:

    • Replaced the generator expression inside sum with map. This is often faster because map drives the per-character calls from its C-level iterator instead of resuming a Python generator frame for every item. Both forms are lazy, so peak memory use should be about the same.

With map, get_character_cell_size is applied to each character in text without the per-item overhead of a generator expression. The change is small, and the gain is most noticeable for long inputs at or beyond the 512-character threshold.
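The two forms described above can be sketched side by side. This is a minimal illustration reconstructed from the description, not the actual diff: get_character_cell_size here is a hypothetical stand-in with a rough width rule, and the function names cell_len_generator and cell_len_map are made up for the comparison.

```python
from timeit import timeit


def get_character_cell_size(character: str) -> int:
    """Hypothetical stand-in for rich's per-character width lookup."""
    codepoint = ord(character)
    # Rough approximation only: treat some CJK and emoji ranges as two cells.
    if 0x2E80 <= codepoint <= 0xA4CF or 0x1F300 <= codepoint <= 0x1FAFF:
        return 2
    return 1


def cell_len_generator(text: str) -> int:
    """Form described as 'before': local alias plus a generator expression."""
    _get_size = get_character_cell_size
    return sum(_get_size(character) for character in text)


def cell_len_map(text: str) -> int:
    """Form described as 'after': the function is handed straight to map()."""
    return sum(map(get_character_cell_size, text))


if __name__ == "__main__":
    sample = "hello世界" * 200  # 1400 characters, above the 512 threshold
    # Both forms must agree before timing them is meaningful.
    assert cell_len_generator(sample) == cell_len_map(sample)
    for fn in (cell_len_generator, cell_len_map):
        print(fn.__name__, timeit(lambda: fn(sample), number=200))
```

Timings from a sketch like this depend on the interpreter version and the cost of the width lookup, so they should be read as indicative rather than as a reproduction of the reported 44%.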

Correctness verification

The new optimized code was tested for correctness. The results are listed below.

✅ 7 Passed − ⚙️ Existing Unit Tests

(click to show existing tests)
- test_cells.py

✅ 28 Passed − 🌀 Generated Regression Tests

(click to show generated tests)
# imports
from __future__ import annotations

import pytest  # used for our unit tests
from rich.cells import cell_len  # function to test


# Reference mocks of the dependencies (not called by the tests below)
def _get_codepoint_cell_size(codepoint: int) -> int:
    """Mock function to simulate cell size based on codepoint."""
    if codepoint < 128:
        return 1  # ASCII characters
    # Inclusive codepoint ranges treated as wide (two cells)
    wide_ranges = (
        (0x1100, 0x115F), (0x2329, 0x232A), (0x2E80, 0xA4CF),
        (0xAC00, 0xD7A3), (0xF900, 0xFAFF), (0xFE10, 0xFE19),
        (0xFE30, 0xFE6F), (0xFF01, 0xFF60), (0xFFE0, 0xFFE6),
    )
    if any(low <= codepoint <= high for low, high in wide_ranges):
        return 2  # Wide characters
    return 1  # Default to 1 for other characters

def cached_cell_len(text: str) -> int:
    """Mock function to simulate cached cell length calculation."""
    # Uses the mock above; the original referenced an undefined name here.
    return sum(_get_codepoint_cell_size(ord(ch)) for ch in text)


# unit tests
def test_basic_functionality():
    # Empty string
    assert cell_len("") == 0

    # Single ASCII characters
    assert cell_len("a") == 1
    assert cell_len("Z") == 1

    # Multiple ASCII characters
    assert cell_len("hello") == 5
    assert cell_len("world") == 5

def test_unicode_characters():
    # Single Unicode characters
    assert cell_len("é") == 1  # Assuming é is treated as a single cell character
    assert cell_len("你") == 2

    # Multiple Unicode characters
    assert cell_len("你好") == 4
    assert cell_len("こんにちは") == 10

def test_mixed_ascii_unicode():
    # Mixed string
    assert cell_len("hello世界") == 9
    assert cell_len("abc你def") == 8

def test_edge_cases():
    # String with special characters
    assert cell_len("\n") == 1
    assert cell_len("\t") == 1

    # String with emojis
    assert cell_len("🙂") == 2
    assert cell_len("👍👍") == 4

    # String with combining characters
    assert cell_len("e\u0301") == 1  # Assuming e + combining acute accent is treated as single cell

def test_performance_and_scalability():
    # Short string (less than 512 characters)
    assert cell_len("a" * 100) == 100
    assert cell_len("😀" * 100) == 200

    # Long string (512 characters or more)
    assert cell_len("a" * 512) == 512
    assert cell_len("😀" * 512) == 1024
    assert cell_len("a" * 1000) == 1000
    assert cell_len("😀" * 1000) == 2000

def test_error_handling():
    # Invalid input types
    with pytest.raises(TypeError):
        cell_len(None)
    with pytest.raises(TypeError):
        cell_len(123)

    # Invalid characters
    with pytest.raises(ValueError):
        cell_len("a\udc00")

def test_large_scale():
    # Very large string
    assert cell_len("a" * 1000000) == 1000000
    assert cell_len("😀" * 1000000) == 2000000

def test_boundary_conditions():
    # Boundary around 512 characters
    assert cell_len("a" * 511) == 511
    assert cell_len("a" * 512) == 512
    assert cell_len("a" * 513) == 513

🔘 (none found) − ⏪ Replay Tests

@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Jul 3, 2024
@codeflash-ai codeflash-ai bot requested a review from iusedmyimagination July 3, 2024 02:09