Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Async Support #49

Merged
merged 48 commits into from
Oct 9, 2024
Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
a3c36ed
add test for api - test not passing
biniona Sep 28, 2024
2e52bc9
add more tests to pass
biniona Sep 28, 2024
97eab3c
wip - broken
biniona Sep 29, 2024
772dc1f
passing all tests. Very messy but technical works as of this commit.
biniona Sep 29, 2024
08b3348
more tests
biniona Sep 29, 2024
50ac88c
pre-commit formatting
biniona Sep 29, 2024
16336bc
add failing test - too many parent instantiations
biniona Sep 30, 2024
6108222
only instantiate class once
biniona Sep 30, 2024
2cf7794
proofread
biniona Sep 30, 2024
4ce0c74
typing errors
biniona Sep 30, 2024
41c9a4e
fix python3.7 error
biniona Sep 30, 2024
838d369
add hack type checking
biniona Sep 30, 2024
173156c
fix type error
biniona Sep 30, 2024
a47911c
simplify aget
biniona Oct 1, 2024
08400a9
docs
biniona Oct 1, 2024
6f8bd63
tests + docs + refactor
biniona Oct 1, 2024
2e24f28
add test + async plugin
biniona Oct 1, 2024
0d2245c
docs
biniona Oct 1, 2024
82a1332
refactor interface check
biniona Oct 1, 2024
d2d655a
typing error
biniona Oct 1, 2024
b53d559
refactor
biniona Oct 1, 2024
22c73fb
proofread
biniona Oct 1, 2024
f00b433
changelog + version + refactor
biniona Oct 1, 2024
5f57733
add test + fix bug
biniona Oct 1, 2024
9ab7706
bcm - feedback
biniona Oct 2, 2024
77af2de
bcm - feedback
biniona Oct 2, 2024
234b9c9
fix type error
biniona Oct 2, 2024
da92ce9
xs - feedback
biniona Oct 2, 2024
2736a50
fix bug in define + test
biniona Oct 3, 2024
db4b93d
bcm - reduce indirection
biniona Oct 6, 2024
991f9f5
bcm + xs - asyncio.to_thread at async -> sync handoff points
biniona Oct 6, 2024
16c1db4
bcm - async_can_proceed -> async_entered
biniona Oct 6, 2024
48056b8
bcm - is_async_context to attribute + init arg
biniona Oct 7, 2024
acba95d
bcm - move is key async check to get
biniona Oct 7, 2024
5e2d462
bcm - RegistryAPIError -> AssertionError
biniona Oct 7, 2024
5f655c5
bcm - str check suggestion
biniona Oct 7, 2024
1720159
Merge branch 'master' into async-support
biniona Oct 7, 2024
cfe4dac
provide fallback implementation of to_thread
biniona Oct 7, 2024
dd19c1f
remove '/' from asyncio extension (not supported in python 3.7)
biniona Oct 7, 2024
cc4c5e9
ignore python 3.7 error + docs
biniona Oct 7, 2024
ee30e3c
ignore mypy errors - redefining signature
biniona Oct 7, 2024
8e7d796
proofread comment
biniona Oct 7, 2024
f18dd14
add sync -> async failure tests
biniona Oct 7, 2024
cd13e39
bcm - call close on error
biniona Oct 9, 2024
9e64c57
bcm - __contains__ test
biniona Oct 9, 2024
dabb7da
bcm - typo + var name
biniona Oct 9, 2024
ae6c0d3
clarify comment
biniona Oct 9, 2024
537b1a2
bcm - fix comment
biniona Oct 9, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
70 changes: 67 additions & 3 deletions minject/inject.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,22 +2,36 @@

import itertools
import os
from typing import Any, Callable, Dict, Optional, Sequence, Type, TypeVar, Union, cast, overload
from typing import (
Any,
Callable,
Dict,
Optional,
Sequence,
Type,
TypeVar,
Union,
cast,
overload,
)

from typing_extensions import TypeGuard

from .metadata import RegistryMetadata, _gen_meta, _get_meta
from minject.types import _AsyncContext

from .metadata import _INJECT_METADATA_ATTR, RegistryMetadata, _gen_meta, _get_meta
from .model import (
Deferred,
DeferredAny,
RegistryKey, # pylint: disable=unused-import
Resolver,
resolve_value,
)
from .types import _MinimalMappingProtocol
from .types import _AsyncContext, _MinimalMappingProtocol

T = TypeVar("T")
T_co = TypeVar("T_co", covariant=True)
T_async_context = TypeVar("T_async_context", bound=_AsyncContext)
R = TypeVar("R")


Expand Down Expand Up @@ -70,6 +84,19 @@ def wrap(cls: Type[T]) -> Type[T]:
return wrap


def async_context(cls: Type[T_async_context]) -> Type[T_async_context]:
"""
Decorator to declare that a class is as an async context manager
Copy link
Contributor

@xiyan128 xiyan128 Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "declare that a class is an async context manager"

that can be initialized by the registry through aget(). This
is to distinguish the class from an async context manager that
should not be initialized by the registry (an example of
this being asyncio.Lock()).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Drop the parens after asyncio.Lock. (asyncio.Lock is the type; asyncio.Lock() is its constructor function.)

"""
meta = _gen_meta(cls)
meta.update_async_context(True)
return cls


def define(
base_class: Type[T],
_close: Optional[Callable[[T], None]] = None,
Expand All @@ -91,6 +118,24 @@ def _is_type(key: "RegistryKey[T]") -> TypeGuard[Type[T]]:
return isinstance(key, type)


def _is_key_async(key: RegistryKey) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not sure) It looks like RegistryKey is a parameterized type — does it need a type parameter here, or do Python type-checkers implicitly assume unconstrained parameters?

Copy link
Contributor Author

@biniona biniona Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eye! I am not sure either. I ran a little test and the following code snippet passes mypy <filename> but fails mypy --disallow-any-generics <filename> with a error: Missing type parameters for generic type "GenericType" error:

from typing import Generic, TypeVar

T = TypeVar('T')

class GenericType(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value

    def get_value(self) -> T:
        return self.value

def test_typing(g : GenericType) -> bool:
    return bool(g.get_value())

Adding a param to GenericType in test_typing passes both type checks.

My guess is that by default Generic types default to specifying an Any parameter based on the naming of the disallow_any_generics option.

I'll update the code to explicitly specify T (I'm not sure it's much more useful than Any but it does communicate a relationship between the RegistryMetadata type and the RegistryKey type).

Update: Python 3.7 does not support subscripted generics in isinstance checks. Leaving TypeVar in however as I don't think it hurts anything.

"""
Check whether a registry key is an async context manager
marked for initialization within the registry (@async_context).
"""
if isinstance(key, str):
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment for the str case here would be helpful. For comparison, _get_meta_from_key explicitly disallows string keys, and the call site in aget subsequently disallows string keys as well.


Actually, does this need to be a separate function at all? In get we're calling either _unwrap or _get_meta_from_key.

  • In the _unwrap case, we can check the _meta field of the returned RegistryWrapper.
  • In the _get_meta_from_key case, we can check meta.is_async_context.

So it seems like we could inline this into aresolve for now, and potentially remove it entirely if we allow registry_impl._aresolve to also resolve non-async targets.

Copy link
Contributor Author

@biniona biniona Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed the extra str check in aget, and added documentation on the str check in _is_key_async.

The larger edit to remove _is_key_async seems reasonable, however, I'd like to defer making this edit due to time time constraints surrounding the project.

I've made an issue documenting this suggestion, I think it could make sense to group this with other Refactor suggestions and go through them together when we have some free time (most likely next grease-week/hackathon).

elif isinstance(key, RegistryMetadata):
return key.is_async_context()
elif isinstance(key, type):
try: # type: ignore
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments explaining the reasons for the type: ignore annotations would be helpful.

(Could we avoid the need for the tryexcept block by using _get_meta() or _gen_meta() here?)

inject_metadata = object.__getattribute__(key, _INJECT_METADATA_ATTR)
return inject_metadata.is_async_context() # type: ignore
except AttributeError:
return False
assert False, f"Unexpected key type: {key}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) Maybe raise a TypeError instead of AssertionError, since the former is more specific?

Or, better still, change the last elif to an else and use typing.assert_type(key, type)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much cleaner!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using assert for control flow in the library can lead to issues since assertions can be disabled with the -O flag, which may result in this check being overlooked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #49 (comment)



class _RegistryReference(Deferred[T_co]):
"""Reference to an object in the registry to be loaded later.
(you should not instantiate this class directly, instead use the
Expand All @@ -103,6 +148,13 @@ def __init__(self, key: "RegistryKey[T_co]") -> None:
def resolve(self, registry_impl: Resolver) -> T_co:
return registry_impl.resolve(self._key)

async def aresolve(self, registry_impl: Resolver) -> T_co:
if _is_key_async(self._key):
result = await registry_impl.aresolve(self._key)
await registry_impl.push_async_context(result)
return result
return registry_impl.resolve(self._key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assumes that the all Resolver.resolve implementations are strictly non-blocking, but there is currently no such constraint documented on that method.

Should we call this using asyncio.to_thread instead, or perhaps require the _aresolve method implementation itself to support async keys?

Copy link
Contributor Author

@biniona biniona Oct 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting problem, I'm leaning toward "don't run in asyncio.to_thread" but I think I could be swayed in either direction.

Here is my thinking:

I think minject strives to be as "unmagical" as possible. I think asyncio.to_thread could fix an issue the user added to their code "run blocking code in event loop". I think this would potentially make the user unaware of the issue at initialization (minject would fix/hide the problem), and allow the user to have incorrect async/sync interactions at runtime.

Do people think this makes sense, or does this seem a bit off?

I'm not sure what I think anymore. Gonna sit on this for a bit 🤔.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xiyan128 also commented on this here: #49 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think calling resolve can block in general. RegistryFunction.resolve() resolves its args, and then invokes its deferred constructor with the argument values. Since the key is not async, we must assume that that constructor could block.

The alternative, for now, would be to have aresolve throw if self._key is not async, but I don't know whether or how that interacts with deferred str keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved all synchronous resolve + close calls into asyncio.to_thread when called from a* functions/methods.

I'm slightly nervous that this could encourage people to run blocking code within their async classes (minject initialization will suggest that a async -> sync call causes no issues as minject is handling the asyncio.to_thread call on the users behalf), but overall I think this is correct as there as I'm not sure you'd be able to properly do async -> sync initialization without this feature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: when the key is not async, we still resolve it by calling the blocking function. Not sure if we discussed it somewhere else, but should we

  1. raise an error because the function is used in an unexpected way or
  2. do something like await asyncio.to_thread(registry_impl.resolve, self._key) to make this really async?

*realized we do throw RegistryAPIError for aget, and so this probably for implementation purpose.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this discussion with #49 (comment) . I think these are largely getting at the same thing. Let me know if you disagree though!


@property
def type_of_object_referenced_in_key(self) -> "Type[T_co]":
if type(self.key) == RegistryMetadata:
Expand Down Expand Up @@ -188,6 +240,9 @@ def resolve(self, registry_impl: Resolver) -> T_co:
kwargs[key] = resolve_value(registry_impl, arg)
return self.func()(*args, **kwargs)

async def aresolve(self, registry_impl: Resolver) -> T_co:
raise NotImplementedError("Have not implemented async registry function")

def func(self) -> Callable[..., T_co]:
return self._func

Expand Down Expand Up @@ -273,6 +328,9 @@ def resolve(self, registry_impl: Resolver) -> T_co:
else:
return cast(T_co, self._default)

async def aresolve(self, registry_impl: Resolver) -> T_co:
return self.resolve(registry_impl)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not sure) Can resolve block due to deferred initialization in evaluation of registry_impl.config[self._key]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging this discussion with similar discussion: #49 (comment)


@property
def key(self) -> Optional[str]:
return self._key
Expand Down Expand Up @@ -316,6 +374,9 @@ def resolve(self, registry_impl: Resolver) -> T_co:
return self._default
return cast(T_co, sub)

async def aresolve(self, registry_impl: Resolver) -> T_co:
return self.resolve(registry_impl)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not sure) Can the evaluation of sub[key] in resolve block due to executing a deferred initializer?

Copy link
Contributor Author

@biniona biniona Oct 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure either. I don't think I have the bandwidth/necessary system call knowledge to investigate this in detail, so I have wrapped all async -> sync handoff points in asyncio.to_thread calls to try and remove the chance of this causing issues.



def nested_config(
keys: Union[Sequence[str], str], default: Union[T, _RaiseKeyError] = RAISE_KEY_ERROR
Expand Down Expand Up @@ -375,5 +436,8 @@ class _RegistrySelf(Deferred[Resolver]):
def resolve(self, registry_impl: Resolver) -> Resolver:
return registry_impl

async def aresolve(self, registry_impl: Resolver) -> Resolver:
return self.resolve(registry_impl)


self_tag = _RegistrySelf()
24 changes: 24 additions & 0 deletions minject/metadata.py
Original file line number Diff line number Diff line change
Expand Up @@ -112,6 +112,7 @@ def __init__(

self._close = close
self._interfaces = [cls for cls in inspect.getmro(cls) if cls is not object]
self._is_async_context = False

@property
def interfaces(self) -> Sequence[Type]:
Expand Down Expand Up @@ -142,6 +143,18 @@ def update_bindings(self, **bindings: DeferredAny) -> None:
# TODO: 'lock' the bindings once added to the registry to make above note unnecessary
self._bindings.update(bindings)

def update_async_context(self, is_async_context: bool) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this and the next function just update is_async_context. Should we make it a property or a getter/setter?

"""
Set the async context flag for this metadata.
"""
self._is_async_context = is_async_context

def is_async_context(self) -> bool:
"""
Returns the value of the async context flag for this metadata.
"""
return self._is_async_context

def _new_object(self) -> T_co:
return self._cls.__new__(self._cls)

Expand All @@ -159,6 +172,17 @@ def _init_object(self, obj: T_co, registry_impl: "Registry") -> None: # type: i

self._cls.__init__(obj, **init_kwargs)

async def _ainit_object(self, obj: T_co, registry_impl: "Registry") -> None: # type: ignore[misc]
"""
asynchronous version of _init_object. Calls _aresolve instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_init_object itself doesn't have a doc comment, so more detail in this doc comment would be helpful. In particular: if the object is an asynchronous context manager, will it be entered? (Looks like that's left up to the caller.)

Copy link
Contributor Author

@biniona biniona Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made an issue to improve documentation for the _init_object method. Would prefer to defer as many changes as possible from this PR in order to try and meet some tight deadlines dependent on this change.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that in general we can defer fixes for future maintainability, but In this case I'm requesting the doc comment in order to understand the code as a reviewer right now.

I'm having a hard time following the invariants around when the async objects are in what state, and having a more complete picture of what it means for an object to be “initialized” would help me better understand the rest of the code without having to spend as much time reverse-engineering it.

After spending some time digging into it, it appears that _ainit_object doesn't enter the context, as _push_async_context is called explicitly afterward.

of _resolve.
"""
init_kwargs = {}
for name_, value in self._bindings.items():
# the specific deferred value checks if they are async or
init_kwargs[name_] = await registry_impl._aresolve(value)
self._cls.__init__(obj, **init_kwargs)

def _close_object(self, obj: T_co) -> None: # type: ignore[misc]
if self._close:
self._close(obj)
Expand Down
36 changes: 36 additions & 0 deletions minject/model.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,23 @@ class Resolver(abc.ABC):
def resolve(self, key: "RegistryKey[T]") -> T:
...

@abc.abstractmethod
async def aresolve(self, key: "RegistryKey[T]") -> T:
"""
Resolve a key into an instance of that key. The key must be marked
with the @async_context decorator.
"""
...

@abc.abstractmethod
async def push_async_context(self, key: Any) -> Any:
"""
Push an async context onto the context stack maintained by the Resolver.
This is necessary to enter/close the context of an object
marked with @async_context.
"""
...

@property
@abc.abstractmethod
def config(self) -> RegistryConfigWrapper:
Expand All @@ -52,6 +69,15 @@ class Deferred(abc.ABC, Generic[T_co]):
def resolve(self, registry_impl: Resolver) -> T_co:
...

@abc.abstractmethod
async def aresolve(self, registry_impl: Resolver) -> T_co:
"""
Resolve a deferred object into an instance of the object. The object,
may or may not be asynchronous. If the object is asynchronous (marked with @async_context),
resolve the object asynchronously. Otherwise, resolve synchronously.
"""
...


Resolvable = Union[Deferred[T_co], T_co]
# Union of Deferred and Any is just Any, but want to call out that a Deffered is quite common
Expand All @@ -69,3 +95,13 @@ def resolve_value(registry_impl: Resolver, value: Resolvable[T]) -> T:
return value.resolve(registry_impl)
else:
return value


async def aresolve_value(registry_impl: Resolver, value: Resolvable[T]) -> T:
"""
Async version of resolve_value, which calls aresolve on Deferred instances.
"""
if isinstance(value, Deferred):
return await value.aresolve(registry_impl)
else:
return value
Loading