
Migration of some metadata into PEP 621-compliant pyproject.toml #409

Open · wants to merge 2 commits into master
Conversation

@KOLANICH (Contributor) commented Nov 3, 2022

No description provided.
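For context, a minimal sketch of the kind of PEP 621 `[project]` table such a migration typically introduces; every value below is an illustrative placeholder, not the actual contents of this PR's diff:

```toml
# Illustrative placeholders only; not taken from this PR.
[build-system]
requires = ["setuptools>=61"]  # PEP 621 metadata support landed in setuptools 61
build-backend = "setuptools.build_meta"

[project]
name = "PGPy"
version = "0.0.0"                # placeholder
description = "Pretty Good Privacy for Python"
readme = "README.rst"
requires-python = ">=3.6"        # placeholder
dependencies = [
    "cryptography>=3.3.2",       # placeholder pin
]
```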

@eli-schwartz (Contributor)

Is there any compelling advantage to switching from cfg to toml?

Particularly, the toml support is relatively new, so that would make it more complex to use in Linux distros that don't yet have that version of setuptools packaged. It seems like quite a bit of trouble for no real gain IMO.

@KOLANICH (Contributor, Author) commented Nov 4, 2022

so that would make it more complex to use in Linux distros that don't yet have that version of setuptools packaged.

It's their problem. If they package a new version of PGPy, they can probably package a new version of setuptools. I hope they have the packaging automated, so packaging a new version of setuptools should be as easy as triggering a build pipeline, provided that pipeline is properly implemented and relies on setuptools, pip and PEP 517 frontends to do most of the work of building a wheel and unpacking it into distro-specific locations, with the native package builders just consuming the results of pip's work. If they have to write unique recipes for each package, doing manually what setuptools and pip do automatically, then they should fix their Python package building tools first.

It seems like quite a bit of trouble for no real gain IMO.

You are right. Updating mature libraries to "new shiny shit" is always a bit of trouble for no real gain. So feel free to postpone merging this PR until you perceive that the troubles associated with it have decreased enough.

@eli-schwartz (Contributor) commented Nov 4, 2022

It's their problem. If they package a new version of PGPy, they can probably package a new version of setuptools

TBH I was thinking more of the users of those distros, than the developers of those distros.

If they have to write unique recipes for each package, doing manually what setuptools and pip do automatically, then they should fix their Python package building tools first.

Although this also seems clueless. You do realize that "manually" running python setup.py bdist_wheel && python -m installer dist/*.whl is not any more "manual" than python -m build -nx && python -m installer dist/*.whl and that both are manual and fiddly compared to python setup.py install?

python -m build is even the official PyPA recommendation, and it will use whichever version of setuptools is previously installed on the system. Build isolation is a (disputed) feature, not a requirement. It doesn't play well with several use cases, including hermetic builds.

...

The thing is, that PyPA have made several good arguments in favor of PEP 517/518 (and several bad ones!) but none of this really has to do with "unique recipes" and the avoidance thereof. So I'm not sure where you got this from.

@KOLANICH (Contributor, Author) commented Nov 4, 2022

TBH I was thinking more of the users of those distros, than the developers of those distros.

For the users it should not be a problem at all if the Python version shipped in their distro is supported. They just have to use pip. It's the package builders who have most of the trouble integrating pip properly into their tools for package build automation.

python setup.py install

  1. Running it as root (sudo) has no chance of being secure enough. sudo pip3 install --upgrade ./*.whl should be secure enough, provided the wheel is not rogue and the build and install systems are properly tightened (which should be the long-term direction):
    a. a wheel can replace files of other packages, giving privilege escalation when anything calling those packages is executed as root; executing Python-based tools as root is quite common;
    b. a wheel can contain data files, and currently I don't know whether they end up in the dir they are intended to go into;
    c. fortunately, wheels are zip-based, so they cannot contain symlinks and device nodes.
  2. If I remember right, setup.py install is a bit different from installing a prebuilt wheel using pip.

You do realize that "manually" running python setup.py bdist_wheel && python -m installer dist/*.whl is not any more "manual" than python -m build -nx && python -m installer dist/*.whl and that both are manual and fiddly compared to python setup.py install?

  1. First of all, the discussion about simplicity of use has nothing to do with the discussion about the need to upgrade build tools; setup.py could have been removed long ago.
  2. Yes, I realize that from the user's PoV it is easier to just call sudo pip install --upgrade PGPy and not worry about which code pip will pull out of PyPI and git repos in the form of setup.py.
  3. sudo python3 ./setup.py install has no chance of being secure. A two-stage process, where building is done as a non-privileged user and installation is done as a privileged one, is more secure.
  4. Declarative config without any code execution from the package itself can make the wheel build process more secure, if the build tools are hardened to prevent execution of that code. Currently people can specify their own build backend and its plugins within pyproject.toml (see the sketch below), and they will be downloaded (from PyPI) and run automatically. That is bad; instead, PEP 517 frontends should use the packages already installed on the system (by installing a package with the build tool, the operator of the system where a package is built explicitly consents to run the build tool specified in pyproject.toml).
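To illustrate point 4: a minimal sketch (the backend and plugin names are hypothetical placeholders, not real packages) of how pyproject.toml lets a project declare an arbitrary build backend that an isolated PEP 517 frontend fetches from PyPI and executes:

```toml
[build-system]
# With build isolation enabled, a frontend such as pip or `python -m build`
# downloads whatever is listed here from PyPI and runs it at build time.
# "some_custom_backend" and "some_backend_plugin" are hypothetical names.
requires = ["some_custom_backend>=1.0", "some_backend_plugin"]
build-backend = "some_custom_backend.api"
```

Running the frontend without isolation (e.g. `python -m build -n`) skips installing these requirements and uses whatever is already installed on the system instead.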

Build isolation is a (disputed) feature, not a requirement.

That's why I have to add -n to the build command line arguments: I don't want anything to be pulled out of PyPI, and I want build to use the versions of the tools already installed on my systems, no matter what is written in pyproject.toml. (There are too many devs who like to use <, == and ~= conditions out of fear: "what if there is a breaking change in the API? I won't let them break my package, I'd better break all its future versions (which will likely be compatible) myself!" I even had to create a tool undoing this kind of sabotage.)
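A small sketch of the pinning style being complained about, next to a looser alternative; the package names here are generic examples, not PGPy's dependencies:

```toml
[project]
name = "example-package"   # generic example, not PGPy
version = "0.0.0"
# Over-constrained: == and ~= block future, likely-compatible releases.
dependencies = ["somelib~=2.4", "otherlib==1.3.5"]
# A looser alternative that only states the known minimum:
# dependencies = ["somelib>=2.4", "otherlib>=1.3"]
```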

The thing is, that PyPA have made several good arguments in favor of PEP 517/518 (and several bad ones!) but none of this really has to do with "unique recipes" and the avoidance thereof. So I'm not sure where you got this from.

The last time I tried to integrate installation of wheels into my metabuild system for native packages (BTW, my interest in this lib was basically driven by the need to replace gpgme in the component of it responsible for checking source archive integrity), there were no packages like installer, and pip had no argument to specify the dir where a wheel should be installed, so I didn't implement building of Python packages in it at that time. By now it seems that has changed.

@Commod0re (Contributor)

We actually host several package builds for PGPy in this repo.

If you take a look at these branches:

  • archlinux
  • gentoo
  • debian/master

you’ll find that pip should not be needed to build PGPy into a distribution package at all anyway. Library dependencies at the OS level should be packaged separately and individually so there’s no need to resolve dependencies using pip.

@Commod0re (Contributor) commented Nov 11, 2022

Outside of that case, most of the other issues you raise here can be mitigated by using a virtualenv instead of mixing pip and OS-managed packages. Using virtualenvs avoids a host of potential issues, and at least as of right now it is the recommended way to install packages with pip, including this one.

@KOLANICH (Contributor, Author)

you’ll find that pip should not be needed to build PGPy into a distribution package at all anyway.

I looked into them and found that this is part of what I meant by "non-automated". For example, the list of python- dependencies for an Arch package is hardcoded, when instead they could be fetched from a wheel file using distlib. python setup.py install is a legacy way that should not be used, for example because the build backend may not be setuptools or because there may be no setup.py at all. The right way should be to use the installer package (maybe in conjunction with pip as a temporary workaround for pypa/installer#145).

Library dependencies at the OS level should be packaged separately and individually so there’s no need to resolve dependencies using pip.

I'd argue that it is the current approach, but IMHO it is not the correct one in the long term, so IMHO the verb "should" is incorrect here. IMHO the right way is to integrate package managers with each other, allowing them to specify dependent packages in other managers. Because currently, for example on Debian, there are 2 sources of truth for Python packages: apt & dpkg, and pip. One can install a package from pip, thereby overwriting the packages installed with dpkg, which will cause debsums to go invalid. And again, Debian is against shiny new shit; they prefer the old fossilized kind. So we have to reinstall new versions of packages via pip.

But I recognize that it is the current approach, so some tooling should be created for it. The source of truth, though, should not be the recipes for building packages for package managers, but the native package of the Python ecosystem: a wheel.

can be mitigated by using a virtualenv

venvs are prescription drugs, and using them where they are not really needed (I know of only one use case where venvs are justified: testing software in controlled environments without uninstalling/downgrading packages in the main system) is just hiding dirt and garbage under the carpet, and it can become addictive to "solve" problems this way. The same goes for using so-called "self-sufficient containers" for distributing apps: snap, flatpak and sometimes Docker. I'm not the only person who thinks like this: for example, https://habr.com/ru/post/433052/ (it's in Russian, but you can try machine translation tools; though the points written there are obvious, it is nice that @amarao has written this, so a link can be given) makes arguments against self-sufficient containers.

@eli-schwartz (Contributor)

I looked into them and found that this is part of what I meant by "non-automated". For example, the list of python- dependencies for an Arch package is hardcoded, when instead they could be fetched from a wheel file using distlib.

As a matter of curiosity, what experience, if any, do you have in distro tooling?

I ask because it's generally accepted among all distros that you need to know the dependencies in order to set up the build, before you download the source code.

You can't have one build dependency that gets run as a program to figure out the other build dependencies.

That doesn't mean there aren't tools to automatically transcribe the build dependencies from a pyproject.toml into a debian control file or Gentoo ebuild or Arch PKGBUILD or void template or rpm spec or what have you -- because those tools definitely do exist, and you may even be unknowingly looking at the output of one.

Think of it as a lockfile in a different format.

@KOLANICH (Contributor, Author)

As a matter of curiosity, what experience, if any, do you have in distro tooling?

I stay away from it as much as possible. I prefer using universal tooling like CPack.

It's generally accepted among all distros that you need to know the dependencies in order to set up the build, before you download the source code.

It feels just incorrect. It is not the one who makes a package who decides which deps a package must have. It is the one who writes the software who decides which deps his software uses and which workarounds he can apply to support legacy versions of deps, whether at runtime or at compile time. If the deps are not in the distro, there are 3 options: find a repo with them and fetch them from it (for example dl.winehq.org, apt.llvm.org, apt.kitware.com, packages.microsoft.com, download.mono-project.com) and instruct end users to do the same; package them and set up one's own repo; or vendor them. Vendoring is hiding dirt under the carpet...

You can't have one build dependency that gets run as a program to figure out the other build dependencies.

If the build tools used in a distro are a highly inconvenient, unusable piece of s...oftware, this means we have to use our own build tools (which doesn't exclude using distro-specific build tools; we can wrap them). This is the idea behind how prebuilder should work: ideally one should just specify an upstream git repo, a build system and maybe some metadata, and get a proper package. Maybe without man files, if the upstream doesn't ship them. Maybe without distro-specific fixes (though it is possible to provide patch files, I think distro-specific fixes for major distros should be maintained within upstream, not within a set of patch files that will turn into useless garbage because of merge conflicts, the only working way to fix which is manually rewriting the changes from scratch taking the new code structure into account).

Think of it as a lockfile in a different format.

== and < locks are yet another example of hiding dirt under the carpet; they shouldn't be used.

@Commod0re (Contributor)

I'd argue that it is the current approach, but IMHO it is not the correct one in the long term

Conventional usage trumps feelings. Conventions are not conventional for no reason.

Because currently, for example on Debian, there are 2 sources of truth for Python packages: apt & dpkg, and pip.

It seems like you have some fundamental misunderstandings here. pip is not a source of truth for system level packages and it never should have been. The sole source of truth for system level packages in a debian-derived system is dpkg and dpkg alone. pip is therefore an outsider and shouldn't be touching anything at the system level.

It feels just incorrect.

To you.

It is not the one who makes a package who decides which deps a package must have.

Package maintainers don't "decide" anything here: they either understand how to make it work, i.e. supply the correct dependencies specified by the software maintainers, or they don't, and they build a broken package.

I think distro-specific fixes for major distros should be maintained within upstream

Why should it be my responsibility to ensure that my software runs on some random distro I neither use nor intentionally support? The reason distribution package maintainers are responsible for maintaining distro-specific packages is because the vast majority of distro-specific incompatibilities are caused by idiosyncrasies created by those very same maintainers. It's their responsibility to manage that, not mine.

Do these patch stacks get misused? Does it cause problems occasionally? Yes. But most packages don't need that kind of work, and honestly it's a shitty reason to think you know better than the debian repo maintainers, especially when you freely admit that not only have you never done that type of work, but you actively avoid it. If you never do something, your opinions on its intricacies are therefore necessarily uninformed and not valuable.

But that's all beside the point: distro-specific patches aren't needed for this package and chances are they never will be, so railing against that of all things is highly suspect as a strawman argument.

Moved all the "requirements*.txt" into `pyproject.toml` under extras.
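A minimal sketch of how requirements*.txt files map onto extras under PEP 621's optional-dependencies table, using hypothetical group and package names (the actual groups in this commit may differ):

```toml
[project.optional-dependencies]
# Hypothetical groups; the real names and pins come from the original
# requirements-*.txt files.
docs = ["sphinx"]
test = ["pytest", "pytest-cov"]
```

Installing with an extra, e.g. `pip install .[test]`, then pulls in that group on top of the core dependencies.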
@KOLANICH (Contributor, Author)

Why should it be my responsibility to ensure that my software runs on some random distro I neither use nor intentionally support? The reason distribution package maintainers are responsible for maintaining distro-specific packages is because the vast majority of distro-specific incompatibilities are caused by idiosyncrasies created by those very same maintainers. It's their responsibility to manage that, not mine.

There are 3 options that result in working packages of the latest versions:

  1. The idiosyncrasies are needed, unavoidable and expected. In this case one creates, within one's own package, extension points for the distro, and the distro maintains the plugins for those extension points itself.
  2. The idiosyncrasies have no point and are got rid of.
  3. The upstream bears the burden of the idiosyncrasies.

There is another option, which results in broken software: the upstream says it is not his responsibility; the distro maintainers say that they don't want to do their job of updating "shiny new shit", that they don't owe anyone anything, and that the idiosyncrasies are needed and they won't get rid of them. No one is responsible for the software working (and a non-latest version counts as "unworking" and "broken", because a non-latest version doesn't have the latest features, and not having a feature is one of the cases of a broken feature).

they either understand how to make it work, i.e. supply the correct dependencies specified by the software maintainers, or they don't, and they build a broken package.

This info is encoded in metadata files shipped by upstream. If they don't understand how to make use of these files properly, they can build a broken package.

Why should it be my responsibility to ensure that my software runs on some random distro I neither use nor intentionally support?

Because if your software doesn't support a target platform, this means your software can be unusable there. Of course one can ignore any platform one wants. Even a big one. Even Micro$oft Windows. Even Googlag Chrome (I do it in my WebExtensions; if one uses Chrome, he is worthless). But not supporting a platform means that the software likely won't be used there. Usually one wants to support as many platforms as possible. And as many distros as possible. It is usually not the distro who wants to get your software as a package. It is you who wants the distro to have a package for your software. Distros usually don't care about most software and most of the "shiny new shit". If they care, it is usually because they need it, for example to deliver updates for the distro itself. If they need a feature from a new version of a lib, then there will be a package. If they don't, they can choose to save man-machine-hours and do nothing.

The only escape from this nasty situation is hardcore automation of build processes, in order to keep people out of the process as much as possible. Ideally the machinery should be capable of providing packages for working latest versions of software for decades without any human modification of recipes on the distro side.

@KOLANICH (Contributor, Author)

But that's all beside the point: distro-specific patches aren't needed for this package and chances are they never will be

It is good.

is highly suspect as a strawman argument.

The argument was:

that would make it more complex to use in Linux distros that don't yet have that version of setuptools packaged.

It is the distro's responsibility to keep packages within the distro updated to the latest versions. It shouldn't be a burden for them to update them even from each commit, if they automate the process properly. If they don't, they should blame themselves.

@eli-schwartz (Contributor)

I'm slightly confused why this has now become a discussion about how "hardcore" the project developers are?

This info is encoded in metadata files shipped by upstream. If they don't understand how to make use of these files properly, they can build a broken package.

Why do you keep assuming that people don't know how to make use of these files?

These files are format-shifted from a python API function call into the native format of the meta-build system. The process works, is automatable, and doesn't have any of the problems you seem to imply.

@eli-schwartz (Contributor) commented Nov 24, 2022

It is the distro's responsibility to keep packages within the distro updated to the latest versions. It shouldn't be a burden for them to update them even from each commit, if they automate the process properly. If they don't, they should blame themselves.

Automation without human oversight is indescribably foolish. We're discussing setuptools, so surely setuptools can serve as the best example here -- well, guess what, setuptools changes a lot these days, and also keeps breaking, then getting fixed. Have you been watching recent discussions around the future of distutils? Know anything about numpy.distutils? The scientific ecosystem relies rather heavily on that, and setuptools 60+ is outright unsupported and won't be getting support. It's in deep maintenance mode, while alternative PEP 517 build backends are explored.

So yeah, various groups -- maybe distros, but not only distros -- have maximum versions of setuptools pinned, because automatically using the latest version is actually proven to be known broken. This is also why stable distros like Debian exist, and don't upgrade anything, not even pgpy -- but users might upgrade pgpy while still using the system setuptools.

I do not understand why you're so dismissive of this entire line of thought that you build mysterious strawman arguments blaming everything imaginable for a multitude of shortcomings, many of which weren't mentioned before.

Be that as it may, the purpose of a strawman argument is to set up a fake argument that you can then tear down, but the fake argument you made up turns out to be right. I'm not sure this strawman argument is fulfilling its goal.

@KOLANICH (Contributor, Author)

Automation without human oversight is indescribably foolish.

Maybe. What can happen? A lot of things: for example incompatible API changes, a backdoor, a logic bomb and so on. But the reality is that maintainers cannot really properly audit every version; even if the packages are small, there are a lot of them. So the maximum the maintainers usually have the capacity to do is to read the changelog, build the new version and check whether it works for them (often running the test suite takes too much time).

well, guess what, setuptools changes a lot these days, and also keeps breaking, then getting fixed.

I know, and it is good. PEP 621 was the most awaited change in setuptools for me. The ability to reduce the mess is worth a lot.

Know anything about numpy.distutils?

I know that C extensions have always been an untidy mess full of custom-written setup.py. I have no idea how to clean this mess up properly other than by creating a proper build system with declarative configs and a library of plugins and batteries matching the capabilities of CMake. Unfortunately it is an extremely large and complex task (even designing it properly is a large and complex task), but it has to be done to solve the issue properly. Google started with its Blueprint framework and Soong built on top of it, but it is far from being a match for CMake, and Soong is focused on the needs of Android.

The scientific ecosystem relies rather heavily on that, and setuptools 60+ is outright unsupported and won't be getting support.

They depend on setuptools and they don't update their projects' build scripts to work with the latest version of setuptools. So they can only blame themselves. Everything locked in the past has no future, and that's why it is garbage. I'm sorry for being harsh; I'm also locked in the past in some respects (for example I use Ubuntu 21.10 because Canonical went crazy and stopped producing native Firefox packages in order to force people to use snap, in order to give snap and snapcraft more momentum; I know I have no choice but to migrate to another distro, but for now I'm too busy for that. For example I use the 5.11 kernel because every newer version of the kernel has a bug that causes the amdgpu driver to malfunction on my laptop. The only fix is to either fix the driver, or to throw away the laptop and buy and use something that is used by Linus Torvalds or some other influential person who decides which changes to land, because it seems that the only really supported hardware configurations for Linux are the ones used by its decision-makers who are dogfooding), and I know it is unjust to demand that people upgrade the deps, but life is inherently unjust; if someone cannot deal with it somehow, he will be deprived of essential resources (not only software updates and Internet access) and will degrade and eventually die.

So yeah, various groups -- maybe distros, but not only distros -- have maximum versions of setuptools pinned, because automatically using the latest version is actually proven to be known broken.

Instead of pinning the versions, they should fix their software to work with the latest versions. Or, if it is setuptools that is broken, fix setuptools. The broken component should be fixed, not everyone forced to use rotten old shit only because some dev is lazy or out of capacity to fix his software. If he cannot do it, it just means his software is unmaintained.

This is also why stable distros like Debian exist, and don't upgrade anything, not even pgpy -- but users might upgrade pgpy while still using the system setuptools.

Stable (in that sense) software is useless. If a user needs the software with new features, sometimes he only has to build that one piece of software from source. But the new version often requires new versions of libs, and new versions of those libs can require new versions of other libs, and to build those libs the user may have to install new versions of build tools ... So either the user finds a side repo with new packages, or makes his own repo for the missing ones, or he has to change the distro to a more practical one. Maybe not as "stable", but at least one actually solving the user's problem instead of making excuses about why it cannot.

@eli-schwartz (Contributor)

I know that C extensions have always been an untidy mess full of custom-written setup.py. I have no idea how to clean this mess up properly other than by creating a proper build system with declarative configs and a library of plugins and batteries matching the capabilities of CMake. Unfortunately it is an extremely large and complex task (even designing it properly is a large and complex task), but it has to be done to solve the issue properly.

They depend on setuptools and they don't update their projects' build scripts to work with the latest version of setuptools. So they can only blame themselves. Everything locked in the past has no future, and that's why it is garbage. I'm sorry for being harsh

Well no, I already said, they are migrating to non-setuptools build systems, which is to say, Meson.

I don't understand why you think that the numpy ecosystem is to blame however. They didn't break a stable API. They are migrating to something with a robust long-term future. It's not their fault that they are stuck between a rock and a hard place, and it's not their fault that the only maintenance fix is to pin their dependencies while building out new infrastructure from scratch in the development branch. That doesn't mean it didn't happen regardless of "fault".

And here you are talking about how no matter what, it's always the fault of anyone that doesn't use the latest version of everything, because if they don't use the latest version then their project is "garbage". And this is supposed to be the life advice for why projects should make changes such as this PR that don't add any new functionality, only drop support for old versions?

Is that it, then? If pgpy supports versions of setuptools older than the very latest release, that inherently makes pgpy "garbage" because it isn't "hardcore" enough?

@KOLANICH (Contributor, Author)

Well no, I already said, they are migrating to non-setuptools build systems, which is to say, Meson.

It is the right thing to do (though I prefer CMake for its rich box of batteries).

I don't understand why you think that the numpy ecosystem is to blame however. They didn't break a stable API.

stable => not developing => in most cases stagnating.

They are migrating to something with a robust long-term future.

It's good.

It's not their fault that they are stuck between a rock and a hard place, and it's not their fault that the only maintenance fix is to pin their dependencies while building out new infrastructure from scratch in the development branch.

It is their fault they have not addressed it already. We all have faults. But for them I guess the quickest possible fix is just forking setuptools and reverting the breaking commits, keeping the rest of the improvements, and using their own fork as a build backend. It'd immediately allow vanilla setuptools and the modified one to coexist, and their package builds would be temporarily unbroken.
Then they could modify their own fork to get rid of the copied setuptools code, reusing the code from the original setuptools and, where modifications are 100% needed, modifying it via monkey-patching. Copying the code is the worst measure and should be used only where all others have failed.

And here you are talking about how no matter what, it's always the fault of anyone that doesn't use the latest version of everything, because if they don't use the latest version then their project is "garbage".

It is one's fault if one is incompatible with the latest version. Currently PGPy is incompatible with the latest packaging practices within the Python ecosystem; this PR fixes that.

And this is supposed to be the life advice for why projects should make changes such as this PR that don't add any new functionality, only drop support for old versions?

The new functionality is not for this project itself. The functionality a project provides is always for third parties. I consider "can be parsed by tools supporting only PEP 621 rather than the zoo of formats of various build systems" to be a feature. And I would prefer the rest of the formats for specifying metadata of Python packages in sdists to go extinct. If only dead and unmaintained packages use formats that are not PEP 621, then I can drop all the formats that are not PEP 621 in my tools and state that in that case they should be converted to PEP 621.

If pgpy supports versions of setuptools older than the very latest release, that inherently makes pgpy "garbage" because it isn't "hardcore" enough?

It is not about versions of setuptools, TBH. If someone made a plugin for old setuptools that reads PEP 621, that would be OK too. But I guess it makes no sense; interested parties prefer to just upgrade setuptools.
