Automatic Retries for Version Mismatch Errors #420

Open
subhashb opened this issue May 16, 2024 · 0 comments
Labels
architecture Needs Architecture Discussion enhancement New feature or request proposal

Abstract

In Protean, entities and aggregates are versioned to ensure consistency and prevent conflicting writes. Each operation expects the entity to be at a specific version, and an ExpectedVersionError is raised on a mismatch. This proposal outlines an automatic retry mechanism in the framework for handling ExpectedVersionError. The mechanism will be customizable via configuration and applied through a retry decorator on Application Service, Command Handler, or Event Handler methods.

Background

In Protean, each entity and aggregate is assigned a version number (_version attribute). This version number increments with every change to the entity, so that conflicting operations can be detected. When an operation is performed, the system expects the entity to be at a specific version. If another operation has modified the entity since it was last read, the version numbers will not match, leading to an ExpectedVersionError.

Such errors are common in systems using optimistic concurrency control and can usually be resolved by reloading the entity and retrying the operation. To address this, Protean should support automatic retries, so that operations are retried without manual intervention, improving system resilience and user experience.
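To make the failure mode concrete, here is a minimal sketch of a version check on write. It is not Protean code: `Account`, `InMemoryStore`, and the local `ExpectedVersionError` class are hypothetical stand-ins so the example runs on its own.

```python
from dataclasses import dataclass


class ExpectedVersionError(Exception):
    """Stand-in for Protean's error; defined here so the sketch runs alone."""


@dataclass
class Account:
    balance: int
    _version: int = 0


class InMemoryStore:
    """Hypothetical store that checks the version on every write."""

    def __init__(self, account: Account):
        self._account = account

    def load(self) -> Account:
        # Return a copy so each caller works on its own snapshot
        return Account(self._account.balance, self._account._version)

    def save(self, changed: Account) -> None:
        if changed._version != self._account._version:
            raise ExpectedVersionError(
                f"expected version {changed._version}, "
                f"found {self._account._version}"
            )
        # Accept the write and bump the stored version
        self._account = Account(changed.balance, changed._version + 1)


store = InMemoryStore(Account(balance=100))
first, second = store.load(), store.load()  # both snapshots read version 0

first.balance -= 30
store.save(first)  # succeeds; stored version becomes 1

second.balance -= 50
try:
    store.save(second)  # stale snapshot, still at version 0
except ExpectedVersionError as exc:
    conflict = str(exc)

# A retry reloads the current state, reapplies the change, and saves again
fresh = store.load()
fresh.balance -= 50
store.save(fresh)
```

The second write fails only because its snapshot is stale. Repeating the whole load, change, save cycle succeeds, which is the step the proposal aims to automate.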

Proposal

We propose implementing an automatic retry mechanism for handling ExpectedVersionError. The key features of this implementation are:

  • Automatic Retry Mechanism: The framework will provide out-of-the-box support for automatic retries on ExpectedVersionError.
  • Configurable Parameters: The retry mechanism will be customizable through configuration settings. Parameters such as the number of retries, retry interval, and retry strategies (e.g., fixed interval, exponential backoff, jitter) will be configurable.
  • Retry Decorator: Application Service, Command Handler, or Event Handler methods must be decorated with a retry decorator to enable automatic retries. The decorator will accept parameters such as the retry strategy, maximum number of retries, and other relevant settings.
  • Default Configuration: The system will include a default configuration for retries, which will be used when no explicit parameters are provided in the decorator. This ensures a sensible default behavior while allowing for customization.
  • Customization and Flexibility: The choice of retry count and strategy should account for factors such as system load, business requirements, resource constraints, and failure rates. Fixed interval, exponential backoff, and jitter will be available as strategies.
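The interplay between default configuration and decorator parameters could work roughly as follows. This is a sketch under assumptions: `RetrySettings` and `resolve_settings` are hypothetical names, and the default values are taken from the sample config in this proposal.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container for domain-level retry defaults."""
    strategy: str = "exponential_backoff"
    max_retries: int = 5
    initial_delay_ms: int = 100
    jitter_ms: int = 50


def resolve_settings(defaults: RetrySettings, **overrides) -> RetrySettings:
    """Overlay explicitly passed decorator arguments on the defaults.

    Arguments left as None fall back to the default value. An unknown
    argument name raises TypeError from dataclasses.replace.
    """
    explicit = {key: value for key, value in overrides.items() if value is not None}
    return replace(defaults, **explicit)


domain_defaults = RetrySettings()

# Only max_retries is overridden; everything else comes from the defaults
settings = resolve_settings(domain_defaults, max_retries=3, strategy=None)
```

Treating `None` as "not provided" keeps the decorator signature simple, at the cost of not being able to pass `None` as a real value.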

Justification

Implementing an automatic retry mechanism for ExpectedVersionError in Protean will allow us to:

  • Improve Consistency and Reliability: By automatically retrying operations that encounter version mismatches, most race-condition-related issues can be resolved without manual intervention.
  • Enhance User Experience: Automatic retries reduce the likelihood of errors propagating to end users.
  • Offer Customizability and Flexibility: Configurable retry parameters and strategies allow each piece of business functionality to be tuned to its specific requirements and operational context.
  • Use Resources Efficiently: Retry strategies such as exponential backoff with jitter spread retries out over time, which helps manage system load and avoids overwhelming resources.

Implementation Considerations

  • Retry Decorator: Implement a decorator that can be applied to methods in Application Services, Command Handlers, or Event Handlers. The decorator should accept parameters such as retry strategy, maximum retries, and other relevant settings.
  • Configuration Management: Introduce domain-level configuration settings for retries, with sensible defaults.
  • Retry Strategies: Implement the following retry strategies:
    • Fixed Interval: Retry after a fixed delay.
    • Exponential Backoff: Increase the delay exponentially between retries.
    • Jitter: Add randomness to the delay to prevent synchronized retries from multiple clients.
  • Integration and Testing: Simulate race conditions to validate the functionality and performance under various scenarios and loads.
  • Documentation and Examples: Provide comprehensive documentation and examples to guide users on how to configure and use the retry mechanism effectively. Include best practices and recommendations based on different use cases and system conditions.
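The three strategies listed above differ only in how the delay before each retry is computed. A minimal sketch, assuming a hypothetical helper `compute_delay_ms` and the base values from the sample config:

```python
import random


def compute_delay_ms(strategy, attempt, base_ms=100, jitter_ms=50, rng=random):
    """Delay in milliseconds before retry number `attempt` (1-based)."""
    if strategy == "fixed":
        # Same delay before every retry
        return float(base_ms)
    if strategy == "exponential_backoff":
        # Delay doubles with each retry: base, 2*base, 4*base, ...
        return float(base_ms * 2 ** (attempt - 1))
    if strategy == "jitter":
        # Base delay plus a random offset, so clients do not retry in lockstep
        return base_ms + rng.uniform(0, jitter_ms)
    raise ValueError(f"unknown retry strategy: {strategy!r}")


fixed = [compute_delay_ms("fixed", n) for n in range(1, 5)]
# fixed == [100.0, 100.0, 100.0, 100.0]

backoff = [compute_delay_ms("exponential_backoff", n) for n in range(1, 5)]
# backoff == [100.0, 200.0, 400.0, 800.0]

seeded = random.Random(7)  # seeded generator keeps the sketch reproducible
jittered = [compute_delay_ms("jitter", n, rng=seeded) for n in range(1, 5)]
# every value in jittered lies between 100 and 150
```

Jitter is shown here as a standalone strategy to match the list. In practice it is often combined with exponential backoff, which is what the sample decorator in this proposal does.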

Sample Code

Retry Decorator Implementation

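import random
from functools import wraps
from time import sleep

from protean.exceptions import ExpectedVersionError  # import path assumed


def retry(retry_strategy='exponential_backoff', max_retries=5, initial_delay=100, jitter=50):
    """Retry the wrapped callable on ExpectedVersionError. Delays are in milliseconds."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            delay = initial_delay
            # One initial attempt plus up to max_retries retries
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except ExpectedVersionError:
                    if attempt == max_retries:
                        raise  # Retries exhausted: surface the original error
                    if retry_strategy == 'fixed':
                        sleep(delay / 1000)
                    elif retry_strategy == 'exponential_backoff':
                        sleep((delay + random.uniform(0, jitter)) / 1000)
                        delay *= 2
                    else:
                        # Any other value: fixed delay plus random jitter
                        sleep((delay + random.uniform(0, jitter)) / 1000)
        return wrapper
    return decorator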
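One way to exercise a decorator like the one above without real concurrency is a handler that fails a fixed number of times before succeeding. The sketch below is self-contained, so it uses a compact fixed-interval stand-in for the decorator and a local `ExpectedVersionError`. `make_flaky_handler` is a hypothetical test helper.

```python
from functools import wraps
from time import sleep


class ExpectedVersionError(Exception):
    """Stand-in for Protean's error so the sketch runs on its own."""


def retry(max_retries=5, delay_ms=0):
    """Compact fixed-interval stand-in for the proposed decorator."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except ExpectedVersionError:
                    if attempt == max_retries:
                        raise  # budget exhausted
                    sleep(delay_ms / 1000)
        return wrapper
    return decorator


def make_flaky_handler(conflicts):
    """Build a handler that hits `conflicts` version mismatches, then succeeds."""
    calls = {"count": 0}

    def handler():
        calls["count"] += 1
        if calls["count"] <= conflicts:
            raise ExpectedVersionError("simulated concurrent write")
        return "ok"

    return handler, calls


# Two conflicts fit inside a budget of three retries
handler, calls = make_flaky_handler(conflicts=2)
result = retry(max_retries=3)(handler)()

# Five conflicts exhaust a budget of three retries
stubborn, stubborn_calls = make_flaky_handler(conflicts=5)
try:
    retry(max_retries=3)(stubborn)()
    exhausted = False
except ExpectedVersionError:
    exhausted = True
```

Counting calls makes the retry budget explicit: a budget of three retries allows four attempts in total.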

Sample Config

[retry]
default_strategy = "exponential_backoff"
max_retries = 5
initial_delay_ms = 100
jitter_ms = 50

Usage Example

from protean import retry

@retry(retry_strategy='exponential_backoff', max_retries=5, initial_delay=100, jitter=50)
def handle_command(command):
    ...

Notes

  • The calling thread is blocked while the decorator sleeps between retries. Should we instead use async mechanisms and place the request back on the event loop (with retry metadata)?
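As one possible direction for the question above, a coroutine-based variant can wait with `asyncio.sleep`, which yields to the event loop instead of blocking the thread. This is a sketch only: `async_retry` is a hypothetical name, `ExpectedVersionError` is a local stand-in, and re-queuing the request with metadata is not covered.

```python
import asyncio
import random
from functools import wraps


class ExpectedVersionError(Exception):
    """Stand-in for Protean's error so the sketch runs on its own."""


def async_retry(max_retries=5, initial_delay=100, jitter=50):
    """Exponential backoff with jitter for coroutines. Delays are in milliseconds."""
    def decorator(func):
        @wraps(func)
        async def wrapper(*args, **kwargs):
            delay = initial_delay
            for attempt in range(max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except ExpectedVersionError:
                    if attempt == max_retries:
                        raise  # retries exhausted
                    # Other tasks on the loop keep running during this wait
                    await asyncio.sleep((delay + random.uniform(0, jitter)) / 1000)
                    delay *= 2
        return wrapper
    return decorator


attempts = []


@async_retry(max_retries=3, initial_delay=1, jitter=1)
async def handle_command(command):
    attempts.append(command)
    if len(attempts) < 3:
        raise ExpectedVersionError("simulated conflict")
    return f"handled {command}"


result = asyncio.run(handle_command("rename"))
```

This only helps handlers that are themselves coroutines. Synchronous handlers would still need the blocking decorator or a separate re-queue mechanism.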
@subhashb subhashb added enhancement New feature or request architecture Needs Architecture Discussion proposal labels May 16, 2024