Issues with the Stable ABI #4

Open
gvanrossum opened this issue May 9, 2023 · 39 comments

Comments

@gvanrossum

gvanrossum commented May 9, 2023

During PyCon, @hauntsaninja mentioned some surprising data: out of 400,000 PyPI projects, only about 300 are using the Stable ABI, and only two of those have a name anyone recognizes. (Shantanu could you link to those results and verify the numbers?) [UPDATE: There are better numbers further down in the thread.]

So I'd like to add the existence of the Stable ABI to the list of problems. It makes evolving certain APIs hard -- in particular, the ob_refcnt field is exposed through macros in the Stable ABI which means that for immortal objects in 3.12, we had to go through contortions to keep supporting wheels built with the 3.11 or older versions of Py_INCREF and friends.

Maybe the Stable ABI could at least be revised to not contain any macros that access object fields directly (replacing them with equivalent functions)?

@hauntsaninja

Here's where I got that number: https://blog.trailofbits.com/2022/11/15/python-wheels-abi-abi3audit/

To get a list of auditable packages to feed into abi3audit, I used PyPI’s public BigQuery dataset to generate a list of every abi3-wheel-containing package downloaded from PyPI in the last 21 days:
[...]
From that query, I got 357 packages, which I’ve uploaded as a GitHub Gist.

Not sure how much the "downloaded in the last 21 days" affects things. From a single pass skim over the gist just now I recognised five packages.

Probably could twist abi3audit into telling you specifically how many packages would break from removing some macros.

@hauntsaninja

cc @woodruffw

@woodruffw

Thanks for the ping! Yeah, I got those numbers from the BigQuery dataset for the last 21 days (which was a purely arbitrary limitation that should be lifted for a full analysis).

Because of how weak the controls around the abi3 tag on wheels and extensions are, it's possible that there are other Stable ABI-built wheels on PyPI that are just tagged incorrectly (i.e. they use the Stable ABI but their wheel filename doesn't indicate so). These arguably don't matter since pip won't treat them as Stable ABI compatible without the abi3 filename tag, but that's another limitation of the query that's worth considering.
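
To make the filename-tag point concrete: an installer reads compatibility from the tags in the wheel filename, not from the binary's contents. A minimal sketch, assuming the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`); the helper names and the example filenames are made up for illustration.

```python
def wheel_abi_tags(filename: str) -> set[str]:
    """Return the ABI tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    # name-version[-build]-python-abi-platform -> 5 or 6 dash-separated fields
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    # The ABI field is second from the end; compressed tag sets use "."
    return set(parts[-2].split("."))


def is_tagged_abi3(filename: str) -> bool:
    """True only if the filename itself claims Stable ABI compatibility."""
    return "abi3" in wheel_abi_tags(filename)


print(is_tagged_abi3("example_pkg-1.0-cp38-abi3-manylinux_2_28_x86_64.whl"))  # True
print(is_tagged_abi3("example_pkg-1.0-cp311-cp311-win_amd64.whl"))            # False
print(is_tagged_abi3("example_pkg-1.0-py3-none-any.whl"))                     # False
```

A wheel built against the Stable ABI but named like the second example would be reported as `False` here, which is the mis-tagging limitation described above.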

Probably could twist abi3audit into telling you specifically how many packages would break from removing some macros.

Yeah, this should be doable -- you can use the big JSON blob in that blog post to see which symbols are actually in use, and I'm open to further improvements to abi3audit itself around that 🙂

@gvanrossum
Author

Yeah, this should be doable -- you can use the big JSON blob in that blog post to see which symbols are actually in use, and I'm open to further improvements to abi3audit itself around that 🙂

How would that be able to tell you that a given binary is or isn't using Py_DECREF?

@mattip

mattip commented May 9, 2023

One of the big projects using the stable ABI is cryptography; here is their PyPI download page. I think they would not be happy if they had to make wheels for all the versions of CPython they support on all the platforms they support. This is the flip side of the faster (yearly) release cycle. If I recall correctly, the stable ABI was one of the carrots offered to package maintainers worried about continually chasing new CPython versions.

Maybe the Stable ABI could at least be revised to not contain any macros that access object fields directly (replacing them with equivalent functions)?

It makes sense, I don't think the intent of the stable ABI was to freeze the ABI forever. Could there be an abi31 tag that would mark the next version of the stable C-API?

@woodruffw

How would that be able to tell you that a given binary is or isn't using Py_DECREF?

It only works for symbols and not macros, unfortunately, so abi3audit can't reveal which extensions/wheels happen to use that macro (unless in doing so they leak an underlying symbol, as Py_DECREF does for _Py_DECREF). So in that particular case you can infer that Py_DECREF was used, but there's no guarantee in the general case.

It makes sense, I don't think the intent of the stable ABI was to freeze the ABI forever. Could there be an abi31 tag that would mark the next version of the stable C-API?

I'm biased as a package maintainer, but I'm personally a fan of something like this: the problem with the Stable ABI isn't the idea itself, but the restrictions that it's accumulated over the years.

A similar idea was floated on the forums a few months ago: https://discuss.python.org/t/lets-get-rid-of-the-stable-abi-but-keep-the-limited-api/18458

@encukou
Contributor

encukou commented May 9, 2023

Stable ABI considered harmful
only two of those have a name anyone recognizes

Could we stick to facts, please?
As someone working on the stable ABI, I find the tone of this issue needlessly negative.

So I'd like to add the existence of the Stable ABI to the list of problems. It makes evolving certain APIs hard -- in particular, the ob_refcnt field is exposed through macros in the Stable ABI which means that for immortal objects in 3.12, we had to go through contortions to keep supporting wheels built with the 3.11 or older versions of Py_INCREF and friends.

AFAIK, ob_refcnt, ob_type and ob_size are the very last ones remaining. The rest of the structs are “blueprints” for initialization (not used at runtime), and the special case of Py_buffer.

This is the biggest issue in the stable ABI (which is 14 years old now). Your post suggests there are other big issues; are there? (I know about many issues, of course, but nothing this big...)

Maybe the Stable ABI could at least be revised to not contain any macros that access object fields directly (replacing them with equivalent functions)?

Replacing them with functions would mean an abi4 -- essentially, telling everyone to recompile. It'd be a bunch of work, in CPython and packaging tools, but doable.

@gvanrossum
Author

In this repo we're trying to just do an inventory of problems, without immediately jumping to solutions (unless the problem is small and the solution is similarly contained).

With apologies for the provocative title, it does seem that the remaining struct fields exposed by the Stable ABI, even if they are just ob_refcnt, ob_type and ob_size, are a big problem that will require elbow grease, resolve, and probably a PEP for a solution.

Looking at the Discourse thread "Let's get rid of the stable ABI, but keep the limited API", I actually see a spectrum of opinions on how to address the problem (and some confusion between Stable ABI and Limited API), but no additional problems, so let's just say that struct fields (including the all-important ob_refcnt) are the main problem that is exacerbated by the existence of the Stable ABI.

Another problem is that the Stable ABI doesn't provide the full functionality needed by some projects, e.g. PyObjC. (This may explain why relatively few projects use it.)

That thread also confirms the rough count of abi3-using projects (around 320).

Anyway, I suggest that we stick to this for the inventory for now, unless additional problems related to the Stable ABI are uncovered. (@encukou Feel free to add the other problems you know of, maybe we'll eventually discern a pattern leading to an innovative solution.)

@encukou
Contributor

encukou commented May 10, 2023

Most other problems of the stable ABI (and limited API) are shared with the C API in general.

Let me give some thoughts, hoping that they're useful even if they are a bit off-topic.

The C API is vast, and my approach to improving it was to find a manageable subset, and then improve its quality and make it complete enough for general usage. The Limited API (defined before my time, in PEP 384) was a reasonable starting point, and the stability guarantees make a good “carrot” for people to try it out (and find its shortcomings).

That thread also confirms the rough count of abi3-using projects (around 320).

Yup. It's not big. But it's now possible to use the limited API in real-world projects, although core dev involvement (cryptography) or lots of hacks (PySide) help. I haven't done too much marketing (e.g. adapting the tutorial), as I still see some incompleteness to fix. It's easier to iron out issues before the whole world uses it :)

As for “only two of those have a name anyone recognizes” -- that might not change much: for a popular leaf project, releasing yearly in the 2-month RC window is not that big of a problem. I expect Stable ABI to be much more useful for the long tail of smaller extensions.


Anyway, for the inventory, I see several issues identified here. I'm not sure how you want to organize them:

  • Exposed struct fields (ob_refcnt, ob_type, ob_size) force either contortions on us or recompilations on users
  • Existence of the Stable ABI (?)
  • Incompleteness of the Stable ABI
  • Low adoption of the Stable ABI

@alex

alex commented May 10, 2023

👋, one of the cryptography maintainers here. Happy to answer any questions about our setup, or what's important to us.

@gvanrossum
Author

👋, one of the cryptography maintainers here. Happy to answer any questions about our setup, or what's important to us.

If we’re ever going to propose to deprecate the stable ABI I promise there will be a debate where you will be heard. But that’s not on the table now. We’re just trying to inventory problems with the C API here. So if anything about the C API bugs you, please create an issue for it! (Come to think of it, specific praise for the C API might also be useful.)

@gvanrossum gvanrossum reopened this May 10, 2023
@gvanrossum
Author

(Closed by mistake, clumsy fingers. :-)

@wjakob

wjakob commented May 10, 2023

I think that there would be far more stable ABI packages if it were straightforward to compile C++ projects into stable ABI wheels. But it isn't. For one thing, the limited API has so far been too limited for it to be usable by various general C++ binding tools. That is changing now with 3.12, which has a number of relevant API calls exposed in the limited API.

Drawing a conclusion from the PyPI data point may be missing part of the big picture.

@encukou
Contributor

encukou commented May 10, 2023

specific praise for the C API might also be useful.

Where? As issues?

Here are some quotes from my language binding survey from a while ago:

David Hewitt (PyO3) said:

Overall just wanted to say that the C-API is amazingly powerful and keeps getting better and better, so thank you!

Wenzel Jakob (pybind11/nanobind) said:

I generally find the CPython API to be very well-designed. I am convinced that this has contributed to the success of Python and its ability to accumulate a large ecosystem of powerful extension libraries.
So my request would be: please don't break it ;-).

Karl Nelson (JPype) said:

So thus far I have only been discussing "negatives" while there are a few things that I see that are very positive in the C Python API that I would like to see more of.
First up, I made the effort to convert all my C implementations to heap types using specifications, rather than the older fixed types where some static memory is being used. I really liked the move from static structures to heap types, and it went fairly smoothly except for a few corner cases where static structures allowed things that were prohibited by heap types. Those limitations required me to create the heap type and then alter the slot afterward. That said, I found the heap type interface to be much better and safer now that the memory is controlled by Python, which allows easier changes behind the scenes. I would like this to expand throughout the API such that, rather than having things that may be a pointer or a direct reference to a static structure, everything should be pointer types only, so that they are consistent and it is irrelevant whether they are heap or statically typed.

@gvanrossum
Author

Those are great! Did Karl's issues get addressed? Are the pain points from the Google Doc you linked to entered as issues here yet? (How old is that doc, and how widely did you distribute it?)

@wjakob

wjakob commented May 10, 2023

Also, one point about the motivating number:

400,000 PyPI projects, only about 300 are using the Stable ABI

For this to be a meaningful ratio, the denominator should be the number of PyPI projects with binary extensions and not the total number of packages (of which most are pure Python code).

@gvanrossum
Author

For this to be a meaningful ratio, the denominator should be the number of PyPI projects with binary extensions and not the total number of packages (of which most are pure Python code).

Yup, which is why I asked @hauntsaninja to verify my numbers. I'm sure there's someone here who can do some more queries and report back. (Maybe @woodruffw ?)

@encukou
Contributor

encukou commented May 10, 2023

Some of Karl's issues are hopefully addressed in 3.12, others are reported here, a few I left with “PRs welcome” (which is rather cruel of me, as Karl can't sign Python's CLA).
Most pain points from the Google doc are now here, yes. The doc is from last year; I shared it with language binding maintainers I could easily get a hold of. (Could have shared it even more widely, but the pain points were getting repetitive and what I would work on for 3.12 was already clear.) My notes about sharing it are at encukou/abi3#25

@woodruffw

woodruffw commented May 10, 2023

It's unfortunately difficult to get accurate numbers here, in part for the reasons identified in the post: wheels can be mis-tagged on PyPI and might just coincidentally be working for the overwhelming majority of users.

That being said, I can try and get the following numbers/ratios:

  1. The total number of wheel distributions on PyPI;
  2. The total number of binary wheel distributions on PyPI (at least, ones tagged as such);
  3. The total number of abi3 binary wheel distributions on PyPI (again, tagged as such);
  4. Download/popularity ratios for the above (since a tiny number of abi3 wheels, like those for cryptography, are probably responsible for a large proportion of overall binary wheel downloads)

If that sounds okay/topical, I can try and get those numbers tonight.

@gvanrossum
Author

If that sounds okay/topical, I can try and get those numbers tonight.

That sounds great -- let's hear it for more data!

@woodruffw

Okay, here are some queries and their numbers, as of today:

Total number of wheels:

SELECT COUNT(*) AS num_wheels
FROM `bigquery-public-data.pypi.distribution_metadata`
WHERE packagetype = "bdist_wheel"

Produces: 4739240

Total number of binary wheel distributions (calculated by searching for non-binary wheels and subtracting from the total):

SELECT COUNT(*) AS num_pure_wheels
FROM `bigquery-public-data.pypi.distribution_metadata`
WHERE packagetype = "bdist_wheel"
AND filename LIKE "%none-any%"

Produces: 3285242, meaning that there are 1453998 wheel distributions on PyPI that may contain binaries (i.e., are not explicitly tagged otherwise).

Total number of abi3-tagged wheel distributions:

SELECT COUNT(*) AS num_abi3_wheels
FROM `bigquery-public-data.pypi.distribution_metadata`
WHERE packagetype = "bdist_wheel"
AND filename LIKE "%-abi3-%"

Produces: 24672

Total list of packages that contain abi3-tagged wheels:

SELECT DISTINCT name
FROM `bigquery-public-data.pypi.distribution_metadata`
WHERE packagetype = "bdist_wheel"
AND filename LIKE "%-abi3-%"

Produces: https://drive.google.com/file/d/1D_zFlsxVJmmqXJ4UMZrCOm0jzdC0NNSa/view?usp=sharing

(That last link should be a CSV dump of packages that contain abi3 wheels. From a quick glance, there are 553 total packages that contain at least one abi3 wheel.)
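
For reference, the arithmetic behind these counts can be written out. A minimal sketch using only the numbers reported above; the variable names are made up, and the counts are wheel files rather than projects, so the resulting share is not comparable to the 553-package figure.

```python
total_wheels = 4_739_240  # rows with packagetype = "bdist_wheel"
pure_wheels = 3_285_242   # filename LIKE "%none-any%"
abi3_wheels = 24_672      # filename LIKE "%-abi3-%"

# Wheels not explicitly tagged as pure Python, i.e. possibly binary.
maybe_binary = total_wheels - pure_wheels

# Share of those possibly-binary wheel files that carry the abi3 tag.
abi3_share = abi3_wheels / maybe_binary

print(f"possibly binary wheels: {maybe_binary}")  # 1453998
print(f"abi3 share of those: {abi3_share:.2%}")   # 1.70%
```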

I can do some more concrete popularity statistics later (since the numbers above only indicate raw package counts, not how popular each individual package is) 🙂

@encukou encukou changed the title Stable ABI considered harmful Issues with the Stable ABI May 17, 2023
@encukou
Contributor

encukou commented May 17, 2023

I've renamed the issue. As you said, it's provocative, and I don't think stirring up emotions is good for the discussion. Let's keep accusations of causing harm out of this forum. I'd be happy to talk privately, if you're interested (you might be, given your other Summit talk), or if you want to keep the title.


Anyway:
The limited API is incomplete. It's not the default. It's not what the tutorial shows. It takes dedication and some hacks to use. In 3.12 it almost got to the point where I'm comfortable suggesting it as a default -- but people who can limit themselves to 3.12+ don't yet get much advantage out of it.
Despite that, some people use it. Why?

My interpretation is that a stable ABI is a very useful feature. It's just our current implementation -- and marketing of it -- that is lacking.
Ask the people who use it despite the issues. They like it. There's not a lot of them, sure. But there's enough for me to keep up with the reports :)

Also: Don't only look at PyPI. Python is bigger than that.

@vstinner
Contributor

AFAIK, ob_refcnt, ob_type and ob_size are the very last ones remaining

In Python 3.11, I prepared the C API to convert Py_REFCNT(), Py_TYPE() and Py_SIZE() to opaque function calls, rather than macros or static inline functions that access structure members directly.

https://peps.python.org/pep-0670/ and https://peps.python.org/pep-0674/ were designed for that.

One API issue was that these 3 macros were used to modify an object reference count, type or size, and so it was not possible to convert them to opaque function calls.

Converting them to opaque function calls can have a performance cost, which deserves to be measured. Also, the benefit of such a change was unclear to most people, so I didn't do it. Maybe the immortal objects and nogil projects make the issues more obvious?

Note: this issue's scope is too broad; I suggest opening more specific issues.

@iritkatriel iritkatriel added the v label Jul 20, 2023
@malemburg

More up-to-date numbers based on https://py-code.org/ and a ClickHouse version of the associated database:

The current count of projects shipping abi3 wheels appears to be 648.
That’s 0.138% of all PyPI projects (466520 at the moment) or
around 5.6% of all projects which publish binary wheel files on PyPI (11528 at the moment).

@malemburg

Petr asked to leave a comment here, based on the discussion on Discourse:

His question was

If you know of any more issues where maintaining the stable ABI is more painful than API compatibility (PEP 387) or frozen per-version ABI, I’d love to hear them; ideally please comment in issue #4.

My answer

The main difference is that you cannot change the APIs in the limited API at all, without breaking the ABI. Without this limitation, the APIs could be changed subject to the normal deprecation procedures and participate in the evolution of the APIs.

And because the stable ABI doesn’t even specify an expected lifetime in years or number of releases, it means that no changes are possible until we move to 4.x.

Lifting the requirement to be stable across all 3.x versions would help with this problem, of course, but then I don’t think we’re that far off from the regular deprecation process, which also supports compatibility for at least 3 releases.

I'll see where the discussion goes on Discourse and then update this post accordingly

@vstinner
Contributor

vstinner commented Sep 4, 2023

The main difference is that you cannot change the APIs in the limited API at all, without breaking the ABI.

Which kind of change are you thinking about? I looked at Limited C API since Python 3.2 and I found these changes, mostly API removals:
https://discuss.python.org/t/use-the-limited-c-api-for-some-of-our-stdlib-c-extensions/32465/4

When a limited C API function is removed, the function remains available at the ABI level: libpython still provides the symbol.

Recently, I proposed deprecating passing NULL as the value in PyObject_SetAttr(), since it currently removes the attribute, and this behavior can hide a bug when the caller tried to create an object but the creation failed. The question was how to deal with this issue in the stable ABI. See issue: python/cpython#106572

I proposed to keep the same API but, depending on the selected Py_LIMITED_API, choose between the old behavior (accept a NULL value) and the new behavior (a NULL value raises an exception) by calling a different function at the ABI level. But other participants were not really convinced that it's an important issue to solve, and so I gave up on my attempt to address this issue.
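
The idea of mapping one API name to different ABI symbols can be modelled outside C. A minimal sketch in Python, assuming `Py_LIMITED_API` is given in the usual `PY_VERSION_HEX` layout (for example `0x030D0000` for 3.13); the cutoff version and the `_PyObject_SetAttr_NullChecked` symbol name are hypothetical, invented only for illustration, and are not part of CPython.

```python
def limited_api_hex(major: int, minor: int) -> int:
    """Encode a version the way Py_LIMITED_API is usually written, e.g. 0x030A0000."""
    return (major << 24) | (minor << 16)


# Hypothetical cutoff: first limited API version that opts in to the new behavior.
NEW_BEHAVIOR_SINCE = limited_api_hex(3, 13)


def setattr_symbol(py_limited_api: int) -> str:
    """Pick the ABI-level symbol that the API name PyObject_SetAttr would resolve to."""
    if py_limited_api >= NEW_BEHAVIOR_SINCE:
        # Hypothetical new entry point that rejects a NULL value.
        return "_PyObject_SetAttr_NullChecked"
    # Existing entry point: old binaries keep the old behavior.
    return "PyObject_SetAttr"


print(hex(limited_api_hex(3, 10)))             # 0x30a0000
print(setattr_symbol(limited_api_hex(3, 2)))   # PyObject_SetAttr
print(setattr_symbol(limited_api_hex(3, 14)))  # _PyObject_SetAttr_NullChecked
```

In a real header this selection would happen in the preprocessor; the point of the sketch is only that already-built binaries keep resolving the old symbol while newly built ones get the new one.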

The API can evolve without losing support for existing stable ABI binaries, there are technical solutions for that.

@vstinner
Contributor

vstinner commented Sep 4, 2023

And because the stable ABI doesn’t even specify an expected lifetime in years or number of releases, it means that no changes are possible until we move to 4.x.

It would be nice to consider designing an abi4 to clean up the dust: remove functions which have already been removed at the API level. The abi3 has 61 symbols which are "ABI only": they have been removed from the limited C API.

Well, there are also private symbols which are used by limited C API macros/functions which still exist. For example, the ABI-only variable _Py_NoneStruct is used to implement Py_None at the API level: #define Py_None (&_Py_NoneStruct).

These removals are related to different Python changes. Examples:

  • Only use Py_ssize_t in the PyArg_ParseTuple() API: PY_SSIZE_T_CLEAN macro. _PyArg_ParseTuple_SizeT() was removed, since PyArg_ParseTuple() now uses Py_ssize_t.
  • Removal of the legacy API to configure the Python initialization: see issue "[C API] PEP 741: Add PyInitConfig C API to customize the Python initialization" (python/cpython#107954); the limited C API has no replacement yet :-(
  • Removal of the legacy Python 3.2 API for Unicode strings: PEP 623 and PEP 624 (https://peps.python.org/pep-0624/)
  • APIs using FILE* were removed, like PyMarshal_ReadObjectFromString(): PEP 384 excluded FILE* from the limited C API.
  • PyEval_AcquireLock() and PyEval_ReleaseLock() functions were broken and have been replaced with existing PyEval_SaveThread() and PyEval_RestoreThread() functions: removed in Python 3.13.

I see that _Py_RefTotal is an ABI-only symbol added to support the Python debug build (Py_DEBUG), but I'm not sure that it was needed to expose it. Before 3.9, Python debug build didn't support the limited C API. In Python 3.10, Py_INCREF() and Py_DECREF() got support for the limited API but are implemented as opaque function calls in this case. Maybe the problem is about supporting the limited C API version 3.9 and older on a debug build. I forgot the complicated details.

ABI-only symbols (functions and variables) of the Python 3.13 stable ABI:

  • PyCFunction_Call()
  • PyEval_AcquireLock()
  • PyEval_CallFunction()
  • PyEval_CallMethod()
  • PyEval_CallObjectWithKeywords()
  • PyEval_InitThreads()
  • PyEval_ReleaseLock()
  • PyEval_ThreadsInitialized()
  • PyMarshal_ReadObjectFromString()
  • PyMarshal_WriteObjectToString()
  • PyObject_AsCharBuffer()
  • PyObject_AsReadBuffer()
  • PyObject_AsWriteBuffer()
  • PyObject_CheckReadBuffer()
  • PySys_AddWarnOption()
  • PySys_AddWarnOptionUnicode()
  • PySys_AddXOption()
  • PySys_HasWarnOptions()
  • PySys_SetArgv()
  • PySys_SetArgvEx()
  • PySys_SetPath()
  • PyThreadState_DeleteCurrent()
  • PyUnicode_GetSize()
  • PyUnicode_InternImmortal()
  • Py_GetArgcArgv()
  • Py_SetPath()
  • Py_SetProgramName()
  • Py_SetPythonHome()
  • _PyArg_ParseTupleAndKeywords_SizeT()
  • _PyArg_ParseTuple_SizeT()
  • _PyArg_Parse_SizeT()
  • _PyArg_VaParseTupleAndKeywords_SizeT()
  • _PyArg_VaParse_SizeT()
  • _PyErr_BadInternalCall()
  • _PyObject_CallFunction_SizeT()
  • _PyObject_CallMethod_SizeT()
  • _PyObject_GC_New()
  • _PyObject_GC_NewVar()
  • _PyObject_GC_Resize()
  • _PyObject_New()
  • _PyObject_NewVar()
  • _PyState_AddModule()
  • _PyThreadState_Init()
  • _PyThreadState_Prealloc()
  • _PyWeakref_CallableProxyType
  • _PyWeakref_ProxyType
  • _PyWeakref_RefType
  • _Py_BuildValue_SizeT()
  • _Py_CheckRecursiveCall()
  • _Py_Dealloc()
  • _Py_DecRef()
  • _Py_EllipsisObject
  • _Py_FalseStruct
  • _Py_IncRef()
  • _Py_NegativeRefcount()
  • _Py_NoneStruct
  • _Py_NotImplementedStruct
  • _Py_RefTotal
  • _Py_SwappedOp
  • _Py_TrueStruct
  • _Py_VaBuildValue_SizeT()

@wjakob

wjakob commented Sep 4, 2023

@vstinner Instead of killing _PyObject_New (called by PyObject_New), may I propose that PyObject_New is made part of the stable ABI, perhaps as an inline function? The alternative (PyType_GenericAlloc) is comparatively complex and seems mainly useful for fancy types that are registered with the garbage collector. It also zero-initializes the object memory, which is wasteful when the object is big and the caller will directly initialize it as the next step.

@gvanrossum
Author

What does it mean for an inline function to be in the Stable ABI?

@wjakob

wjakob commented Sep 4, 2023

Edit: I think I meant to say "Limited API". Which I interpret as: this will compile to something that is supported by the stable ABI. (As is happening right now, by calling _PyObject_New, which is part of the stable ABI.)

@vstinner
Contributor

vstinner commented Sep 4, 2023

Instead of killing _PyObject_New (called by PyObject_New)

What do you mean by "killing" it? _PyObject_New() is part of the stable ABI. There is no plan to remove it. It's just that it's declared as an "ABI-only" symbol, since it's not exposed at the API level. In the API, you use the #define PyObject_New(type, typeobj) ((type *)_PyObject_New(typeobj)) macro.

The PyObject_New() macro is part of the limited C API. It is declared in Include/objimpl.h, not in Include/cpython/objimpl.h which is excluded by the limited C API.


What does it mean for an inline function to be in the Stable ABI?

Static inline functions are part of the limited C API, but not part of the stable ABI: the code is copied into the built binary, it's not shipped by libpython. It's similar to macros.

Simplified implementation of Py_INCREF() of the limited C API (version 3.10 or newer):

static inline Py_ALWAYS_INLINE void Py_INCREF(PyObject *op)
{
    _Py_IncRef(op);
}

The compiler copies the _Py_IncRef(op); code into your C extension, as the preprocessor would do with a macro, and then it's the _Py_IncRef() function which is called in practice in your binary. When Python loads your C extension, the dynamic loader of your shared library (.so on Linux) checks if the current Python process has the _Py_IncRef symbol. If it does, you're good, you can call it :-)
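
The loader's symbol check described above can be approximated from Python with `ctypes`. A minimal sketch, assuming a CPython interpreter whose C API symbols are visible through `ctypes.pythonapi`; the helper `has_symbol` is made up for illustration, and whether `_Py_IncRef` is present depends on the Python version.

```python
import ctypes


def has_symbol(name: str) -> bool:
    """Report whether the running Python exposes a C symbol with this name."""
    try:
        # Attribute access on a ctypes library object performs the symbol
        # lookup and raises AttributeError when the symbol is missing.
        getattr(ctypes.pythonapi, name)
    except AttributeError:
        return False
    return True


for name in ("Py_IncRef", "_Py_IncRef"):
    print(name, has_symbol(name))
```

An extension that references a symbol for which this lookup fails would not load on that interpreter, which is the failure mode the Stable ABI is meant to rule out.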

In the past, C extensions were linked to libpython. That's no longer needed. The C extension just looks for symbols in the process which loads it. It just works :-) It solves issues depending on if Python is built with or without --enable-shared (if the python3 program itself is linked to libpython, or "contains" libpython: single binary without libpython).

The previous limited C API implementation of Py_INCREF() didn't call _Py_IncRef(), but was fully inlined: so a C extension would modify PyObject.ob_refcnt directly, and Py_DECREF() would call _Py_Dealloc(). Please don't look into the exact implementation of Py_INCREF() / Py_DECREF() in Python 3.13, it became very complicated ;-) It's good that it's now implemented as an opaque function in the limited C API version 3.12 and newer!

When Py_INCREF() and Py_DECREF() were fully inlined (not implemented as opaque function calls), it was a headache to support the debug build (Py_DEBUG) and the special --with-trace-refs build (Py_TRACE_REFS). I just made the former ABI compatible in Python 3.13.

@encukou
Contributor

encukou commented Sep 5, 2023

The PyObject_New() macro is part of the limited C API.

No, it is not. The limited API is defined by an explicit list, not just by what happens to be visible to the compiler. This is a very important feature: we avoid #34 for the limited subset of the API.

The limited API is defined in Misc/stable_abi.toml along with the stable ABI, and shows up in the list (and as notes) in the docs.


Static inline functions are part of the limited C API, but not part of the stable ABI: the code is copied into the built binary, it's not shipped by libpython. It's similar to macros.

Please avoid macros/inline functions in the limited API (except as optimizations). Such functions can only be used by C/C++, not even by ctypes. (FWIW, we even use ctypes to test if the symbols are available!)
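
As an illustration of why exported functions matter for non-C callers: a real function can be called through `ctypes`, whereas a macro or static inline function leaves nothing in the library to look up. A minimal sketch, assuming CPython; `Py_GetVersion()` is used here only because it is a simple exported function that takes no arguments.

```python
import ctypes
import sys

# Py_GetVersion() is a regular exported function, so ctypes can resolve it.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []
get_version.restype = ctypes.c_char_p  # returns a C string

version = get_version().decode()
print(version)

# The first word of the C-level version string matches sys.version.
print(version.split()[0] == sys.version.split()[0])
```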

@wjakob

wjakob commented Sep 5, 2023

@vstinner -- I seem to have misunderstood your message a few posts up. Weren't you proposing an abi4 that removes ABI-only entry points with no corresponding API? And then you specifically listed many symbols including _PyObject_New. My message is simply a request to not remove accessibility of PyObject_New via _PyObject_New from stable ABI modules. If that wasn't your plan, then please do ignore my message.

@vstinner
Contributor

vstinner commented Sep 5, 2023

I don't need abi4. I'm not aware of anyone actively pushing for this. PyObject_New() is not going away.

@vstinner
Contributor

vstinner commented Sep 5, 2023

No, it is not. The limited API is defined by an explicit list, not just by what happens to be visible to the compiler. This is a very important feature: we avoid #34 for the limited subset of the API.

The limited API is defined in Misc/stable_abi.toml along with the stable ABI, and shows up in the list (and as notes) in the docs.

That's very confusing for me. For me, the limited C API is what is accessible by #include <Python.h> when Py_LIMITED_API macro is defined.

If PyObject_New() is not part of the limited C API, it should be removed when the macro is defined, no? What is the point of having _PyObject_New() in the stable ABI if PyObject_New() is not part of the limited C API?

@wjakob

wjakob commented Sep 5, 2023

I don't need abi4. I'm not aware of anyone actively pushing for this. PyObject_New() is not going away.

Phew. Thank you for clarifying!

If PyObject_New() is not part of the limited C API, it should be removed when the macro is defined, no? What is the point of having _PyObject_New() in the stable ABI if PyObject_New() is not part of the limited C API?

Wait, what? That is exactly the thing I asked not to do. ;)

@vstinner
Contributor

vstinner commented Sep 5, 2023

Please avoid macros/inline functions in the limited API (except as optimizations). Such functions can only be used by C/C++, not even by ctypes. (FWIW, we even use ctypes to test if the symbols are available!)

Well, Python has a long history, and C was the main target of the C API. Right, we can slowly convert macros and static inline functions to regular functions, but it's slow, incremental work, since it can affect performance and may introduce surprising issues (like how macros can be badly abused in C; see PEP 674).

I got bitten by converting PyType_HasFeature() macro to an opaque function which made Python slower on macOS because Python wasn't built with LTO there (on Linux, there was no impact).

I wrote some articles on the topic:

Well, and https://peps.python.org/pep-0670/ obviously.

So far, what worked the best is to have an opaque function call in the limited C API, and override the name with a macro or static inline function in the non-limited C API.

@encukou
Contributor

encukou commented Sep 7, 2023

That's very confusing for me. For me, the limited C API is what is accessible by #include <Python.h> when Py_LIMITED_API macro is defined.

That would be good to have.
But it can't always be that way -- for example, _Py_IncRef is not part of the limited API. So, the toml file is the source of truth.

If PyObject_New() is not part of the limited C API, it should be removed when the macro is defined, no? What is the point of having _PyObject_New() in the stable ABI if PyObject_New() is not part of the limited C API?

That, or it should be added to the limited API :)

@encukou
Contributor

encukou commented Jan 15, 2025

Exposed struct fields (ob_refcnt, ob_type, ob_size) force either contortions on us or recompilations on users

Discussion for solving this: https://discuss.python.org/t/making-pyobject-opaque-in-the-limited-api/77206


10 participants