Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Async Support #49

Merged
merged 48 commits into from
Oct 9, 2024
Merged

Add Async Support #49

merged 48 commits into from
Oct 9, 2024

Conversation

biniona
Copy link
Contributor

@biniona biniona commented Sep 28, 2024

Fixes #9.

Adds Async Support to Minject with the following API:

@async_context
@inject.bind(text = "hi")
class MyClass:
    def __init__(self, text : str):
        self.text = text
        self.in_context = False

    async def __aenter__(self : Self) -> Self:
        self.in_context = True
        return self 
    
    async def __aexit__(self : Self, exc_type : Type[BaseException], exc_value : BaseException, traceback : TracebackType) -> None:
        self.in_context = False

async def main():
    my_class : MyClass
    async with Registry() as r:
        my_class = await r.aget(MyClass)
        assert my_class.text == "hi"
        assert my_class.in_context == True
    
    assert my_class.in_context == False

This PR only adds implementation + tests. Once we agree on implementation I can add docs.

Verification Steps

  • Doc Strings
  • Created Issue to Document feature

@biniona
Copy link
Contributor Author

biniona commented Sep 29, 2024

Tests passing as of here, but there are still bugs. There is some weirdness in navigating how top-level instantiated classes don't ever call the aresolve method.

An issue that I think is still in the code is that the following code would enter the context of MyClass twice.

registry.aget(MyClass)
registry.aget(MyClass)

@biniona
Copy link
Contributor Author

biniona commented Sep 30, 2024

Starting with a really rough implementation. It's pretty much working but has a lot of problems.

I think a lot of the issues are being caused by the distinction between a binding/reference and a top level class. It's creating a lot of duplication of code.

@biniona
Copy link
Contributor Author

biniona commented Sep 30, 2024

Passing checks as of 173156c. Still a lot of clean up though.

I think to do this correctly would require a fair amount of refactoring, and may involve wrapping the initial object to retrieve from the registry in an inject.reference(MyClass) call.

@biniona
Copy link
Contributor Author

biniona commented Oct 1, 2024

All checks passing here ^

@@ -180,6 +219,32 @@ def _register_by_metadata(self, meta: RegistryMetadata[T]) -> RegistryWrapper[T]

return wrapper

async def _aregister_by_metadata(self, meta: RegistryMetadata[T]) -> RegistryWrapper[T]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be a way to merge _aregister_by_metadata and _register_by_metadata using a context manager/generator, but this seemed like overkill to me.

Let me know if you disagree though.

by_meta = await self._aget_by_metadata(meta)
return _unwrap(by_meta)

def _get_class_if_registered(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NOTE: THIS METHOD IS THE ONE I AM MOST NERVOUS ABOUT IN THE PR

I added _get_class_if_registered to try and remove duplicated code in get(), _aget(), and aget() around checking if a class was already created in the registry.

This is the trickiest change to an existing implementation in this PR, and should probably be given significant attention around correctness in a review.


def test_get_sync_class_async_dependency_throws(registry: Registry) -> None:
with pytest.raises(AssertionError):
_ = registry.get(MySyncClassWithAsyncDependency, AUTO_OR_NONE)
Copy link
Contributor Author

@biniona biniona Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API for registry.get is a bit odd, you need to specify AUTO_OR_NONE to get the Registry to actually to create your class instance (see here). I think it makes sense to remove get from the public API for registry, and only push people towards using Registry.__getitem__.

Any objection to this @mmchenry-duolingo?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, get is probably unnecessary

@biniona biniona marked this pull request as ready for review October 7, 2024 19:19
Copy link
Contributor

@xiyan128 xiyan128 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! It looks great to me as long as we have follow-up PRs for documentation and @bcmills & @mmchenry-duolingo approve.

# we must ignore it here.
from asyncio import to_thread # type: ignore[attr-defined]
# This is copy pasted from here: https://github.com/python/cpython/blob/03775472cc69e150ced22dc30334a7a202fc0380/Lib/asyncio/threads.py#L1-L25
except ImportError:
Copy link
Contributor

@xiyan128 xiyan128 Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional): not sure if it is a common practice to use try-except for feature detection. Shall we do something like below?

if hasattr(asyncio, 'to_thread'):
    from asyncio import to_thread

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this will solve the attr-defined type error?

Copy link
Contributor Author

@biniona biniona Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool suggestion! I gave it a shot but it looks like it still raises the issue. I've seen this try/except before in base library code but I'm honestly not sure how widely used it is.

I'm alright keeping this as is for now, and if we find a better was to do it we can circle back to it.

Copy link
Contributor

@mmchenry-duolingo mmchenry-duolingo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a reasonable beta approach to async to me, it fits pretty well with the existing patterns.

@@ -159,6 +174,16 @@ def _init_object(self, obj: T_co, registry_impl: "Registry") -> None: # type: i

self._cls.__init__(obj, **init_kwargs)

async def _ainit_object(self, obj: T_co, registry_impl: "Registry") -> None: # type: ignore[misc]
"""
asynchronous version of _init_object. Calls _aresolve instead
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that in general we can defer fixes for future maintainability, but In this case I'm requesting the doc comment in order to understand the code as a reviewer right now.

I'm having a hard time following the invariants around when the async objects are in what state, and having a more complete picture of what it means for an object to be “initialized” would help me better understand the rest of the code without having to spend as much time reverse-engineering it.

After spending some time digging into it, it appears that _ainit_object doesn't enter the context, as _push_async_context is called explicitly afterward.

raise AssertionError("cannot use aget outside of async context")

meta = _get_meta_from_key(key)
maybe_class = self._get_if_already_in_registry(key, meta)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) The name maybe_class suggests a class object, but from the return type of aget I expect this to be an instance of the class. I had to spend a while reading through the code to figure out which was the case — a clearer name would be helpful here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch.

Changed maybe_class to maybe_initialized_obj. Changed by_meta below to initialized_obj to highlight that this method is trying to give you usable objects from keys.

@@ -177,6 +205,38 @@ def _register_by_metadata(self, meta: RegistryMetadata[T]) -> RegistryWrapper[T]

return wrapper

async def _aregister_by_metadata(self, meta: RegistryMetadata[T]) -> RegistryWrapper[T]:
"""
async version of _register_by_metadata. Calls _aiint_object instead of _init_object
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(typo) s/_aiint/_ainit/

@@ -86,9 +102,35 @@ def config(self) -> RegistryConfigWrapper:
def resolve(self, key: "RegistryKey[T]") -> T:
return self[key]

async def _aresolve(self, key: "RegistryKey[T]") -> T:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to not clean this up as part of this PR however in order to keep the PR focused and to increase the probability of hitting deadlines dependent on this PR.

In this case the reason I am asking for a doc comment is so that I can understand the code well enough to review it: I can complete the review more quickly if I don't have to spend as much time figuring out what everything does.

Here's what I think I've got so far:

  • _aresolve calls aget
  • aget calls _aget_by_metadata
  • _aget_by_metadata calls _aregister_by_metadata
  • _aregister_by_metadata calls _ainit_object
  • _ainit_object calls aresolve_value on the object's dependencies
  • aresolve_value calls aresolve
  • aresolve calls _aresolve (recursively).

@@ -198,6 +258,17 @@ def _get_by_metadata(
else:
return None

async def _aget_by_metadata(self, meta: RegistryMetadata[T]) -> Optional[RegistryWrapper[T]]:
"""
async version of _get_by_metadata. The default argument has been removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little unclear on the relationship between _aget_by_metadata and _get_by_metadata. I see three calls to the latter (two in get and one in contains), but only one to the former (in aget).

I guess that's related in some way to the lack of a default argument?

I'm having a hard time seeing how __contains__ works for async objects — is _get_by_metadata also intended to be used for pre-registered async objects?

Copy link
Contributor Author

@biniona biniona Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely wasn't thinking about __contains__. Great catch.

I believe __contains__ works as expected (added test to confirm this as I wasn't sure), but I agree this is hard to reason about.

I'll make a refactor issue to document this confusion. I think some of the trickiness here stems from the fact that _get_by_metadata is used for checking existence and creating things that don't yet exist. I think a possible refactor to split this up might be worthwhile. This would also make it clearer which parts of the code actually need to have sync/async alternates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue is here: #56

if not isinstance(key, type):
return None

# If the class has no metadata, but a parent has been registered in registry,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding what's happening here. It seems like we're explicitly not checking the parents (include_bases=False)?

But then we are checking _by_iface, which will give us some type that implements (i.e. is a “child” of) T. So I think the code is correct but the comment is backwards..?

Copy link
Contributor Author

@biniona biniona Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that _by_interfaces is populated by meta.interfaces which is itself calculated as [cls for cls in inspect.getmro(cls) if cls is not object]. Based on this, I believe _by_interfaces contains the interfaces key implements (I.E. similar to parent classes), rather than objects that implement/extend key.

So I think "check parents" is close but not totally right. I've updated the comment to be more explicit about "an object exists in the registry that implements an interface the class implements...".

Agree this is all very tricky. Trying to think of a refactor idea to spin off this conversation. Unsure of one but if I can think of one I'll add an issue.

Copy link
Contributor Author

@biniona biniona Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chatted about this with @bcmills, and my understanding of the _by_iface look-up is most likely flawed. I was assuming that the _by_iface lookup looks for objects THAT implement the interfaces key implements, however, we should be looking for concrete subtypes of key. This makes more sense, as a key subtype could be swapped for the key type.

I'll update the comment to fix this after I've reread the relevant code (that I previously misread).

)
self._async_entered = False
# close all objects in the registry
await self._async_context_stack.aclose()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if any of the __aexit__ methods on the stack ends up raising an exception? (Does the call to self.close need to go in a finally: block?)

Copy link
Contributor Author

@biniona biniona Oct 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, agree this should be in a finally block. I'll add a test for this now to ensure that close methods are called if error is thrown during closing async context.

@biniona biniona assigned bcmills and unassigned bcmills Oct 9, 2024
@biniona biniona merged commit a5b216a into master Oct 9, 2024
4 checks passed
@biniona biniona mentioned this pull request Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Add async API for minject
5 participants