Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent platform support for taking the loaded libpython path into account in getpath #127970

Closed
FFY00 opened this issue Dec 15, 2024 · 4 comments
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement

Comments

@FFY00
Copy link
Member

FFY00 commented Dec 15, 2024

Bug report

Bug description:

The current getpath.py code tries determining base_prefix/base_exec_prefix by searching the location of the libpython library loaded in the current process, falling back to the location of the Python interpreter executable.

cpython/Modules/getpath.py

Lines 559 to 594 in 7900a85

# First try to detect prefix by looking alongside our runtime library, if known
if library and not prefix:
library_dir = dirname(library)
if ZIP_LANDMARK:
if os_name == 'nt':
# QUIRK: Windows does not search up for ZIP file
if isfile(joinpath(library_dir, ZIP_LANDMARK)):
prefix = library_dir
else:
prefix = search_up(library_dir, ZIP_LANDMARK)
if STDLIB_SUBDIR and STDLIB_LANDMARKS and not prefix:
if any(isfile(joinpath(library_dir, f)) for f in STDLIB_LANDMARKS):
prefix = library_dir
if not stdlib_dir_was_set_in_config:
stdlib_dir = joinpath(prefix, STDLIB_SUBDIR)
# Detect prefix by looking for zip file
if ZIP_LANDMARK and executable_dir and not prefix:
if os_name == 'nt':
# QUIRK: Windows does not search up for ZIP file
if isfile(joinpath(executable_dir, ZIP_LANDMARK)):
prefix = executable_dir
else:
prefix = search_up(executable_dir, ZIP_LANDMARK)
if prefix and not stdlib_dir_was_set_in_config:
stdlib_dir = joinpath(prefix, STDLIB_SUBDIR)
if not isdir(stdlib_dir):
stdlib_dir = None
# Detect prefix by searching from our executable location for the stdlib_dir
if STDLIB_SUBDIR and STDLIB_LANDMARKS and executable_dir and not prefix:
prefix = search_up(executable_dir, *STDLIB_LANDMARKS)
if prefix and not stdlib_dir:
stdlib_dir = joinpath(prefix, STDLIB_SUBDIR)

Looking at the location of the libpython library in use first makes sense, as that is more reliable than looking at interpreter location — it works when embedding, where there isn't any interpreter executable, it works when the executable is not on base_prefix, etc. However, this is only currently supported on Windows and macOS framework builds.

cpython/Modules/getpath.c

Lines 802 to 837 in 7b8bd3b

/* Add the runtime library's path to the dict */
static int
library_to_dict(PyObject *dict, const char *key)
{
#ifdef MS_WINDOWS
#ifdef Py_ENABLE_SHARED
extern HMODULE PyWin_DLLhModule;
if (PyWin_DLLhModule) {
return winmodule_to_dict(dict, key, PyWin_DLLhModule);
}
#endif
#elif defined(WITH_NEXT_FRAMEWORK)
static char modPath[MAXPATHLEN + 1];
static int modPathInitialized = -1;
if (modPathInitialized < 0) {
modPathInitialized = 0;
/* On Mac OS X we have a special case if we're running from a framework.
This is because the python home should be set relative to the library,
which is in the framework, not relative to the executable, which may
be outside of the framework. Except when we're in the build
directory... */
Dl_info pythonInfo;
if (dladdr(&Py_Initialize, &pythonInfo)) {
if (pythonInfo.dli_fname) {
strncpy(modPath, pythonInfo.dli_fname, MAXPATHLEN);
modPathInitialized = 1;
}
}
}
if (modPathInitialized > 0) {
return decode_to_dict(dict, key, modPath);
}
#endif
return PyDict_SetItemString(dict, key, Py_None) == 0;
}

The spotty platform support stroke me as odd, especially on macOS, as I see no apparent reason for only supporting framework builds, so I looked traced back the origin of this code.

The macOS logic goes back to Python 2.0, having been introduced in 54ecc3d. At this time, we were determining base_prefix/base_exec_prefix based on the Python interpreter location, which was problematic on OS X Frameworks, as the Python interpreter is provided via a launcher. The comment traces back to 55070f5 and is unrelated to the change made by that commit, it just highlights the special case for macOS framework builds.

In GH-29041, which introduced getpath.py, rewriting the old path initialization C code in Python, the logic changed to purposefully search the libpython location before the executable location, also adding Windows support. I imagine the existing macOS code was kept as-is as a mistake, leaving it behind the WITH_NEXT_FRAMEWORK flag, maybe under the assumption it was needed for some reason.

Considering the clear intent in the code, I am treating this a bug.

cc @zooba

CPython versions tested on:

CPython main branch

Operating systems tested on:

No response

Linked PRs

@FFY00 FFY00 added type-bug An unexpected behavior, bug, or error interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Dec 15, 2024
@picnixz picnixz added the extension-modules C modules in the Modules dir label Dec 15, 2024
FFY00 added a commit to FFY00/cpython that referenced this issue Dec 15, 2024
FFY00 added a commit to FFY00/cpython that referenced this issue Dec 15, 2024
FFY00 added a commit to FFY00/cpython that referenced this issue Dec 15, 2024
@zooba
Copy link
Member

zooba commented Dec 16, 2024

Everything in the initial version of getpath.py was intended to be a direct combination of getpath.c and getpathp.c at that time. I was trying to change absolutely nothing about the semantics on any platform at all, so that we could preserve all existing behaviour and then see changes over time in the Python version of the code as we bring them into sensible alignment.

Chances are, whatever changes you make here should be treated as new features (i.e. not backported). Unless someone is reporting an issue that started with the original changeover to getpath.py, any changes we make will break/upset people more than they help.

Other than that, go for it. Just remember that 100% of the quirks in this file are relied upon by someone, and any changes will break them. Make sure they're clearly documented (including whatsnew, porting guide, etc.)

@FFY00
Copy link
Member Author

FFY00 commented Dec 17, 2024

Everything in the initial version of getpath.py was intended to be a direct combination of getpath.c and getpathp.c at that time. I was trying to change absolutely nothing about the semantics on any platform at all, so that we could preserve all existing behaviour and then see changes over time in the Python version of the code as we bring them into sensible alignment.

Thanks for the clarification! I assumed the semantic change was deliberate, as this comment from the new code explicitly calls for it:

cpython/Modules/getpath.py

Lines 574 to 588 in b9a492b

# First try to detect prefix by looking alongside our runtime library, if known
if library and not prefix:
library_dir = dirname(library)
if ZIP_LANDMARK:
if os_name == 'nt':
# QUIRK: Windows does not search up for ZIP file
if isfile(joinpath(library_dir, ZIP_LANDMARK)):
prefix = library_dir
else:
prefix = search_up(library_dir, ZIP_LANDMARK)
if STDLIB_SUBDIR and STDLIB_LANDMARKS and not prefix:
if any(isfile(joinpath(library_dir, f)) for f in STDLIB_LANDMARKS):
prefix = library_dir
if not stdlib_dir_was_set_in_config:
stdlib_dir = joinpath(prefix, STDLIB_SUBDIR)

And the PR also added Windows support for this:

cpython/Modules/getpath.c

Lines 806 to 812 in b9a492b

#ifdef MS_WINDOWS
#ifdef Py_ENABLE_SHARED
extern HMODULE PyWin_DLLhModule;
if (PyWin_DLLhModule) {
return winmodule_to_dict(dict, key, PyWin_DLLhModule);
}
#endif

Though, given the complexity of the issue, I imagine this might have been something you meant to do as a follow-up, but ended up slipping into that PR.

Chances are, whatever changes you make here should be treated as new features (i.e. not backported). Unless someone is reporting an issue that started with the original changeover to getpath.py, any changes we make will break/upset people more than they help.

Considering this, yeah, we should probably treat this as a new feature.

Other than that, go for it. Just remember that 100% of the quirks in this file are relied upon by someone, and any changes will break them. Make sure they're clearly documented (including whatsnew, porting guide, etc.)

Right. In this particular scenario, I think there'll be a fairly low negative impact on users, the opposite, this will remove the necessity for workarounds such as the one discussed in GH-66409. Users relying on the current code behaving differently than in the proposed change are relying on the stdlib from a different Python installation being used, which is already fragile and unstable on its own.
But yes, I agree that this should be clearly documented 👍

@FFY00 FFY00 added type-feature A feature request or enhancement and removed type-bug An unexpected behavior, bug, or error labels Dec 17, 2024
@zooba
Copy link
Member

zooba commented Dec 17, 2024

It came from getpathp.c originally, which was significantly different from the old getpath.c. Windows has always1 allowed resolving from the DLL, because that's how you make embedding work reliably.

There's very little value in reading the diff from the old getpath.c to the current getpath.c - it's essentially a complete rewrite, but Git won't treat it like that. To compare with the old semantics, find the commit before getpath.py was added and copy out getpath.c and getpathp.c for reference.

Footnotes

  1. Well, considerably before getpath.py was written.

@FFY00
Copy link
Member Author

FFY00 commented Jan 8, 2025

Implemented in GH-127972.

@FFY00 FFY00 closed this as completed Jan 8, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
extension-modules C modules in the Modules dir interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants