gh-115999: Specialize loading attributes from modules in free-threaded builds #127711
Refleaks build failures for the default build are preexisting. The free-threaded failure looks like it's caused by this PR.
…y_decrefs_once

Refactor the test so that the specialized and unspecialized implementation of loading the argument to sys.getrefcount use the same refcounting approach.

Previously, the argument would be evaluated by loading an attribute from a module. This specializes to LOAD_ATTR_MODULE. In free-threaded builds the unspecialized form of LOAD_ATTR always creates a new reference for its result, while the specialized form does not create a reference if the result uses deferred refcounting. This causes a difference in the result returned from sys.getrefcount, depending on whether or not the bytecode has been specialized (e.g. on runs > 1 in refleak tests). The refactored version uses LOAD_GLOBAL, whose specialized and unspecialized forms both do not create references when the result uses deferred refcounting.

Also refactor the test to handle the difference in the result returned from sys.getrefcount in default builds (includes the temporary reference on the operand stack) and free-threaded builds (no temporary reference is created for deferred values).
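To make the distinction in the commit message above concrete, here is a minimal sketch that uses the standard dis module to show which base opcode each style of expression compiles to. The function names and the SOME_GLOBAL variable are hypothetical and are not taken from the test in this PR; the sketch only inspects unspecialized bytecode and does not measure refcounts.

```python
import dis
import math

# Hypothetical module-level object, used only for illustration.
SOME_GLOBAL = object()


def load_via_module_attr():
    # Attribute access on a module: the attribute is fetched with LOAD_ATTR,
    # which is the instruction that can specialize to LOAD_ATTR_MODULE.
    return math.pi


def load_via_global():
    # A bare global name: fetched with LOAD_GLOBAL, with no LOAD_ATTR involved.
    return SOME_GLOBAL


def opnames(func):
    """Return the unspecialized opcode names for func, in order."""
    return [ins.opname for ins in dis.get_instructions(func)]


attr_ops = opnames(load_via_module_attr)
global_ops = opnames(load_via_global)

if __name__ == "__main__":
    print("module attribute:", attr_ops)
    print("global name:     ", global_ops)
```

Only the second form avoids LOAD_ATTR entirely, which is why evaluating the argument through a global sidesteps the specialized/unspecialized difference described above.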
!buildbot nogil refleak
🤖 New build scheduled with the buildbot fleet by @mpage for commit 550f955 🤖 The command will test the builders whose names match following regular expression: The builders matched are:
LGTM
@@ -665,19 +667,46 @@ def test_c_subclass_of_heap_ctype_with_del_modifying_dunder_class_only_decrefs_o
del subclass_instance

# Test that setting __class__ modified the reference counts of the types
#
A bit of an aside, but I'm skeptical of the value of this test considering the increasing complexity. If we want to make sure we don't decref too many times, we should test that we don't crash. Depending on the exact refcounts feels too likely to break with implementation changes.
Agreed. Checking that we don't crash sounds like a better approach to me. Filed #127881 to track that.
!buildbot AMD64 Android
Buildbot failures look unrelated to this PR:
…hreaded builds (python#127711)

We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

- _CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.
- _LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.
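The guard-then-load split described in the commit message above can be modeled in a few lines of Python. This is a toy sketch, not the CPython implementation: ToyKeys, ToyModule, Deopt, and both function names are hypothetical stand-ins, the operand stack is a plain list, and an entry value of None is assumed to mean an empty slot.

```python
from dataclasses import dataclass


class Deopt(Exception):
    """Toy stand-in for falling back to the unspecialized LOAD_ATTR."""


@dataclass
class ToyKeys:
    version: int
    entries: list  # list of (name, value) pairs; None value = empty slot


@dataclass
class ToyModule:
    keys: ToyKeys  # may be swapped for a new ToyKeys by another thread


def check_attr_module_push_keys(stack, cached_version):
    """Guard: verify the keys version, then push the keys object it checked."""
    module = stack[-1]
    keys = module.keys  # read the keys reference exactly once
    if keys.version != cached_version:
        raise Deopt("keys version changed")
    stack.append(keys)


def load_attr_module_from_keys(stack, cached_index):
    """Load from the keys object the guard pushed; no version recheck."""
    keys = stack.pop()  # drop the dead keys input before any deopt
    _name, value = keys.entries[cached_index]
    if value is None:
        raise Deopt("slot is empty")  # stack is back to [module]
    stack.pop()  # consume the module (the owner)
    stack.append(value)


def demo():
    module = ToyModule(ToyKeys(version=1, entries=[("pi", 3.14)]))
    stack = [module]
    check_attr_module_push_keys(stack, cached_version=1)
    # Simulate another thread replacing the module's keys between the steps.
    module.keys = ToyKeys(version=2, entries=[("e", 2.71)])
    load_attr_module_from_keys(stack, cached_index=0)
    return stack
```

In the model, the load step reads from the exact keys object the guard validated, so a swap of the module's keys between the two steps cannot make the cached index refer to the wrong table. Popping the keys before raising Deopt mirrors the stated purpose of removing the keys object from the stack before deopting.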
This is the first in a series of PRs to specialize the LOAD_ATTR family of opcodes in free-threaded builds. PRs specializing LOAD_ATTR for instance and class receivers will follow.

We use the same approach that was used for specialization of LOAD_GLOBAL in free-threaded builds:

- _CHECK_ATTR_MODULE is renamed to _CHECK_ATTR_MODULE_PUSH_KEYS; it pushes the keys object for the following _LOAD_ATTR_MODULE_FROM_KEYS (nee _LOAD_ATTR_MODULE). This arrangement avoids having to recheck the keys version.
- _LOAD_ATTR_MODULE is renamed to _LOAD_ATTR_MODULE_FROM_KEYS; it loads the value from the keys object pushed by the preceding _CHECK_ATTR_MODULE_PUSH_KEYS at the cached index.

A few other changes were necessary to support this arrangement:

- Added POP_DEAD_INPUTS() to the cases generator. It removes dead inputs from the stack and updates the stack pointer. This is used in _LOAD_ATTR_MODULE_FROM_KEYS to remove the keys object from the stack before deopting.
- _LOAD_ATTR_MODULE. This is used by the tier2 optimizer when we successfully erase a _CHECK_ATTR_MODULE_PUSH_KEYS but cannot convert the conjugate _LOAD_ATTR_MODULE_FROM_KEYS into a LOAD_CONST.
- specialize and unspecialize helpers. This isn't strictly necessary, but it makes the implementation of gradually adding specialization support cleaner.
- Updated test_c_subclass_of_heap_ctype_with_del_modifying_dunder_class_only_decrefs_once so that it works with deferred refcounting in the free-threaded build. Also refactored the test so that the evaluation of the argument expression to sys.getrefcount always uses deferred refcounting regardless of whether or not it's specialized.

Thread Safety

Specialization

A critical section on the module's dictionary is held during specialization. This prevents concurrent mutation of the dictionary and its keys version. _PyDict_GetKeysVersionForCurrentState is used to assign the keys version, ensuring that the keys object is freed using QSBR.

UOPS

The md_dict field, but not the dictionary itself, is immutable.

Single-threaded Performance
Scaling Benchmarks
This improves the scaling of calls to module functions (2.1x slower -> 13.9x faster), while other benchmarks appear unchanged.
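For readers who want a feel for what "scaling of calls to module functions" measures, the following is a minimal sketch of such a microbenchmark. It is not the benchmark used to produce the numbers above: the function names, the choice of math.floor, and the iteration counts are all assumptions made for illustration, and the timings it prints depend on the interpreter build and machine.

```python
import math
import threading
import time


def call_module_function(iterations):
    """Repeatedly call a function looked up as an attribute of a module."""
    total = 0
    for i in range(iterations):
        # math.floor is loaded from the math module on every iteration.
        total += math.floor(i + 0.5)
    return total


def time_threads(num_threads, iterations):
    """Run the same workload in num_threads threads; return elapsed seconds."""
    threads = [
        threading.Thread(target=call_module_function, args=(iterations,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    iterations = 200_000  # arbitrary size, chosen only to keep the run short
    single = time_threads(1, iterations)
    multi = time_threads(4, iterations)
    # With perfect scaling, four threads doing 4x the total work would take
    # about as long as one thread; a ratio well above 1 suggests contention.
    print(f"1 thread:  {single:.4f}s")
    print(f"4 threads: {multi:.4f}s")
    print(f"ratio:     {multi / single:.2f}")
```

Comparing the ratio on a free-threaded build before and after a change like this one is one rough way to observe whether module attribute loads have become a point of contention between threads.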
--disable-gil builds #115999