Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-126868: Add freelist for compact int objects #126865

Merged
merged 25 commits into from
Dec 13, 2024
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
0713034
Add freelist of compact int objects
eendebakpt Nov 15, 2024
be58ade
fix build
eendebakpt Nov 15, 2024
3f50b54
cleanup freelist at exit
eendebakpt Nov 15, 2024
fa97302
fix memory leak; align with float implementation
eendebakpt Nov 16, 2024
d72486f
remove stale comment
eendebakpt Nov 16, 2024
e07c218
remove unused function
eendebakpt Nov 16, 2024
328e0c1
avoid some problematic decrefs
eendebakpt Nov 16, 2024
e1dc2b3
jit build
eendebakpt Nov 16, 2024
9df776b
📜🤖 Added by blurb_it.
blurb-it[bot] Nov 16, 2024
6b73046
Merge branch 'main' into int_freelist
eendebakpt Nov 16, 2024
db8247e
Apply review suggestions
Fidget-Spinner Nov 20, 2024
d1e4aa2
Fixup
Fidget-Spinner Nov 20, 2024
644e85a
review comments part 1
eendebakpt Nov 21, 2024
9f86b6e
review comments part 2
eendebakpt Nov 21, 2024
88274d6
Merge branch 'main' into int_freelist
eendebakpt Nov 21, 2024
1e548bd
review suggestion
eendebakpt Nov 23, 2024
13bc3e9
Merge branch 'main' into int_freelist
eendebakpt Dec 2, 2024
60220c0
add small int check to _PyLong_ExactDealloc
eendebakpt Dec 2, 2024
593d621
add COMPARE_OP_INT
eendebakpt Dec 2, 2024
efde111
review comment
eendebakpt Dec 7, 2024
928e912
Merge branch 'main' into int_freelist
eendebakpt Dec 7, 2024
437c24c
regenerate
eendebakpt Dec 7, 2024
b034948
Update Misc/NEWS.d/next/Core_and_Builtins/2024-11-16-22-37-46.gh-issu…
eendebakpt Dec 7, 2024
19f64f6
Merge branch 'main' into int_freelist
eendebakpt Dec 12, 2024
14681c1
Merge branch 'main' into int_freelist
eendebakpt Dec 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions Include/internal/pycore_freelist_state.h
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ extern "C" {
# define Py_dicts_MAXFREELIST 80
# define Py_dictkeys_MAXFREELIST 80
# define Py_floats_MAXFREELIST 100
# define Py_ints_MAXFREELIST 100
# define Py_slices_MAXFREELIST 1
# define Py_contexts_MAXFREELIST 255
# define Py_async_gens_MAXFREELIST 80
Expand All @@ -35,6 +36,7 @@ struct _Py_freelist {

struct _Py_freelists {
struct _Py_freelist floats;
struct _Py_freelist ints;
struct _Py_freelist tuples[PyTuple_MAXSAVESIZE];
struct _Py_freelist lists;
struct _Py_freelist dicts;
Expand Down
2 changes: 2 additions & 0 deletions Include/internal/pycore_long.h
Original file line number Diff line number Diff line change
Expand Up @@ -55,6 +55,8 @@ extern void _PyLong_FiniTypes(PyInterpreterState *interp);

/* other API */

PyAPI_FUNC(void) _PyLong_ExactDealloc(PyObject *self);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need it to be exported like that or could you live with a simple extern? If not, can you explain which file needs this export?

Copy link
Member

@Fidget-Spinner Fidget-Spinner Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed by the JIT on Windows, as it's used in bytecodes.c. This is a pretty common pattern in the internal C API unfortunately

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For semantic purposes, would it be better to have a macro named differently? (it would be a simple alias but it could help semantics and reviewers)


#define _PyLong_SMALL_INTS _Py_SINGLETON(small_ints)

// _PyLong_GetZero() and _PyLong_GetOne() must always be available
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
Increase performance of int by adding a freelist for compact ints.
eendebakpt marked this conversation as resolved.
Show resolved Hide resolved
47 changes: 37 additions & 10 deletions Objects/longobject.c
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@
#include "pycore_bitutils.h" // _Py_popcount32()
#include "pycore_initconfig.h" // _PyStatus_OK()
#include "pycore_call.h" // _PyObject_MakeTpCall
#include "pycore_freelist.h" // _Py_FREELIST_FREE(), _Py_FREELIST_POP()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment alignment.

#include "pycore_long.h" // _Py_SmallInts
#include "pycore_object.h" // _PyObject_Init()
#include "pycore_runtime.h" // _PY_NSMALLPOSINTS
Expand Down Expand Up @@ -42,7 +43,7 @@ static inline void
_Py_DECREF_INT(PyLongObject *op)
{
assert(PyLong_CheckExact(op));
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor)PyObject_Free);
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor) _PyLong_ExactDealloc);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor) _PyLong_ExactDealloc);
_Py_DECREF_SPECIALIZED((PyObject *)op, (destructor)_PyLong_ExactDealloc);

I'm also not sure whether you need the cast. If you need the cast, I'd suggest changing the signature of the drstructor itself.

}

static inline int
Expand Down Expand Up @@ -220,11 +221,16 @@ _PyLong_FromMedium(sdigit x)
{
assert(!IS_SMALL_INT(x));
assert(is_medium_int(x));
/* We could use a freelist here */
PyLongObject *v = PyObject_Malloc(sizeof(PyLongObject));

// The small int cache is incompatible with _Py_NewReference which is called
// by _Py_FREELIST_POP.
PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP_MEM(ints);
Copy link
Member

@markshannon markshannon Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you go back to _Py_FREELIST_POP and avoid reinitializing the object on line 237. We don't need to re-assign the ob_type field in objects taken from the freelist. Something like:

    PyLongObject *v = (PyLongObject *)_Py_FREELIST_POP(ints);
    if (v == NULL) {
        v = PyObject_Malloc(sizeof(PyLongObject));
        if (v == NULL) {
            PyErr_NoMemory();
            return NULL;
        }
        _PyObject_Init((PyObject*)v, &PyLong_Type);
    }
    digit abs_x = x < 0 ? -x : x;
    _PyLong_SetSignAndDigitCount(v, x<0?-1:1, 1);
    v->long_value.ob_digit[0] = abs_x;
    return (PyObject*)v;

if (v == NULL) {
PyErr_NoMemory();
return NULL;
v = PyObject_Malloc(sizeof(PyLongObject));
if (v == NULL) {
PyErr_NoMemory();
return NULL;
}
}
digit abs_x = x < 0 ? -x : x;
_PyLong_SetSignAndDigitCount(v, x<0?-1:1, 1);
Expand Down Expand Up @@ -3611,24 +3617,45 @@ long_richcompare(PyObject *self, PyObject *other, int op)
Py_RETURN_RICHCOMPARE(result, 0, op);
}

void
_PyLong_ExactDealloc(PyObject *self)
{
if (_PyLong_IsCompact((PyLongObject *)self)) {
_Py_FREELIST_FREE(ints, self, PyObject_Free);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that we didnt check for if it's a smallint cached value here. It should be rejected there

Copy link
Member

@Fidget-Spinner Fidget-Spinner Nov 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind. It's not needed because (theoretically) the refcount of a smallint never hits 0 due to it being immortal, so this code will never be called on it. Can you add an assert to be safe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also means my comment about the small ints was wrong. Could you remove that please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment has been adapted. If the refcount can never become zero, then the code in long_dealloc checking for small ints can be removed (perhaps also with an assert). Since this is something unrelated to the freelists I will make a separate PR for that and rebase this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fidget-Spinner I think your first remark was right after all and we do need to check for smallints (at least in the 32-bit stable ABI build). I will add the check once there is consensus on #127120 (which might very well be the status quo)

return;
}
PyObject_Free(self);
}

static void
long_dealloc(PyObject *self)
{
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*/
PyLongObject *pylong = (PyLongObject*)self;
Fidget-Spinner marked this conversation as resolved.
Show resolved Hide resolved

if (pylong && _PyLong_IsCompact(pylong)) {
stwodigits ival = medium_value(pylong);
if (IS_SMALL_INT(ival)) {
PyLongObject *small_pylong = (PyLongObject *)get_small_int((sdigit)ival);
if (pylong == small_pylong) {
/* This should never get called, but we also don't want to SEGV if
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*/
* we accidentally decref small Ints out of existence. Instead,
* since small Ints are immortal, re-set the reference count.
*/

// can we remove the next two lines? the immortal objects now have a fixed refcount
// in particular in the free-threading build this seeems safe
_Py_SetImmortal(self);
return;
}
}
}

if (PyLong_CheckExact(self)) {
if (_PyLong_IsCompact((PyLongObject *)self)) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fidget-Spinner The _PyLong_IsCompact check has already been done in this method, can we move this into the if (pylong && _PyLong_IsCompact(pylong)) part?

Also, can we remove the pylong && part? I think pylong can never be NULL. (PyLong_CheckExact assumes the pointer is not NULL I think)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok that sounds good

_Py_FREELIST_FREE(ints, self, PyObject_Free);
return;
}
}

Py_TYPE(self)->tp_free(self);
}

Expand Down Expand Up @@ -6615,7 +6642,7 @@ PyTypeObject PyLong_Type = {
0, /* tp_init */
0, /* tp_alloc */
long_new, /* tp_new */
PyObject_Free, /* tp_free */
(freefunc)PyObject_Free, /* tp_free */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we align the .tp_free comment? (maybe tabs and spaces are mixed, hence that's why I see them unaligned on monile). Also, is the cast necessary to avoid UBSan failures? If not, you can just use .tp_free = ... to emphasize the semantic

.tp_vectorcall = long_vectorcall,
.tp_version_tag = _Py_TYPE_VERSION_INT,
};
Expand Down
1 change: 1 addition & 0 deletions Objects/object.c
Original file line number Diff line number Diff line change
Expand Up @@ -906,6 +906,7 @@ _PyObject_ClearFreeLists(struct _Py_freelists *freelists, int is_finalization)
clear_freelist(&freelists->object_stack_chunks, 1, PyMem_RawFree);
}
clear_freelist(&freelists->unicode_writers, is_finalization, PyMem_Free);
clear_freelist(&freelists->ints, is_finalization, free_object);
}

/*
Expand Down
21 changes: 11 additions & 10 deletions Python/bytecodes.c
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,7 @@
#include "pycore_pyerrors.h" // _PyErr_GetRaisedException()
#include "pycore_pystate.h" // _PyInterpreterState_GET()
#include "pycore_range.h" // _PyRangeIterObject
#include "pycore_long.h" // void _PyLong_ExactDealloc(PyLongObject *op);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we align the // (on mobile it seems 1 char off)?

#include "pycore_setobject.h" // _PySet_NextEntry()
#include "pycore_sliceobject.h" // _PyBuildSlice_ConsumeRefs
#include "pycore_tuple.h" // _PyTuple_ITEMS()
Expand Down Expand Up @@ -514,8 +515,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Multiply((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -527,8 +528,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Add((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -540,8 +541,8 @@ dummy_func(

STAT_INC(BINARY_OP, hit);
PyObject *res_o = _PyLong_Subtract((PyLongObject *)left_o, (PyLongObject *)right_o);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(right, (destructor)_PyLong_ExactDealloc);
PyStackRef_CLOSE_SPECIALIZED(left, (destructor)_PyLong_ExactDealloc);
INPUTS_DEAD();
ERROR_IF(res_o == NULL, error);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand Down Expand Up @@ -797,7 +798,7 @@ dummy_func(
PyObject *res_o = PyList_GET_ITEM(list, index);
assert(res_o != NULL);
Py_INCREF(res_o);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(list_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -817,7 +818,7 @@ dummy_func(
DEOPT_IF(Py_ARRAY_LENGTH(_Py_SINGLETON(strings).ascii) <= c);
STAT_INC(BINARY_SUBSCR, hit);
PyObject *res_o = (PyObject*)&_Py_SINGLETON(strings).ascii[c];
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(str_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand All @@ -838,7 +839,7 @@ dummy_func(
PyObject *res_o = PyTuple_GET_ITEM(tuple, index);
assert(res_o != NULL);
Py_INCREF(res_o);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(tuple_st);
res = PyStackRef_FromPyObjectSteal(res_o);
Expand Down Expand Up @@ -950,7 +951,7 @@ dummy_func(
PyList_SET_ITEM(list, index, PyStackRef_AsPyObjectSteal(value));
assert(old_value != NULL);
Py_DECREF(old_value);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)PyObject_Free);
PyStackRef_CLOSE_SPECIALIZED(sub_st, (destructor)_PyLong_ExactDealloc);
DEAD(sub_st);
PyStackRef_CLOSE(list_st);
}
Expand Down
20 changes: 10 additions & 10 deletions Python/executor_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

20 changes: 10 additions & 10 deletions Python/generated_cases.c.h

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

Loading