Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

buildPythonPackage, buildPythonApplication: support __structuredAttrs = true #347194

Open
wants to merge 14 commits into
base: staging
Choose a base branch
from

Conversation

ShamrockLee
Copy link
Contributor

@ShamrockLee ShamrockLee commented Oct 8, 2024

This PR refactors the setup hooks and wrap.sh that construct the two build helpers, making buildPythonPackage and buildPythonApplication support both __structuredAttrs = true and structuredAttrs = false.

The representation of the flags attributes as shell/environment variables for most Python building setup hooks are now the same as stdenv.mkDerivation and other build helpers -- they are space-separated environment variables when __structuredAttrs = false and Bash arrays when __structuredAttrs = true, and are concatenated to the command without Bash-evaluation. The following behaviour changes are introduced during the conversion:

  • The following flags are no longer Bash-expanded before concatenated to the command:

    • disabledTests and disabledTestPaths for pytestCheckHook. (disabledTestPaths used to be expanded twice before concatenation.)
    • setupPyBuildFlags and setupPyGlobalFlags for setuptoolsBuildHook.
  • pytestFlags and unittestFlags replaces pytestFlagsArray and unittestFlagsArray and become the new and conforming interface.

  • pytestFlagsArray and unittestFlagsArray are kept for compatibility purposes. They continue to be Bash-expanded before concatenated. Such compatibility layer would get removed in future releases.

Almost all the refactored scripts are linted with ShellCheck, except wrap.sh, which I don't know how to lint correctly.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.11 Release Notes (or backporting 23.11 and 24.05 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@github-actions github-actions bot added 6.topic: python 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: documentation This PR adds or changes documentation 8.has: changelog labels Oct 8, 2024
@ShamrockLee ShamrockLee force-pushed the build-python-structured-attrs branch 3 times, most recently from f3bee93 to 46f8c8a Compare October 9, 2024 11:53
@ShamrockLee ShamrockLee marked this pull request as ready for review October 9, 2024 12:43
@ofborg ofborg bot added the 11.by: package-maintainer This PR was created by the maintainer of the package it changes label Oct 9, 2024
@ShamrockLee
Copy link
Contributor Author

Questions:

When passing makeWrapperArgs to buildPython* as an argument (attribute), it will be passed as an environment variable (containing a space-separated string) to the build script. However, people assume it would be a Bash array when changing it inside preFixup.

  • Should we adapt to it and transparently convert the values from the Nix derivation attribute to a Bash array as well, or should we fix the Nix expressions that treats them as arrays?
  • How should we fix the expressions mentioned above? Should we add __structuredAttrs = true to make makeWrapperArgs a Bash array, or should we use appendToVar to append elements to it? appendToVar inherently can't handle space-containing arguments. I'm unsure if adding __strucutredAttrs = true to a package expression would be considered backward-incompatible.

@wolfgangwalther wolfgangwalther self-requested a review October 9, 2024 15:22
@wolfgangwalther
Copy link
Contributor

Thanks for working on this! I intend to review this PR, but will not have much time the next two weeks.

@ShamrockLee ShamrockLee force-pushed the build-python-structured-attrs branch from 46f8c8a to 38943f3 Compare October 9, 2024 20:50
@ShamrockLee
Copy link
Contributor Author

@ofborg build python3Packages.six
@ofborg build python311Packages.tensorflow

Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a full review, but it should be enough for a first iteration.

Most importantly I have not touched the question on how to deal with the compatibility things, yet, aside from giving a few more generic thoughts on the matter.

Maybe we can split some of the uncontroversial changes off (everything independent of compatHelpers) and merge separately?

@ShamrockLee ShamrockLee force-pushed the build-python-structured-attrs branch from 01ac9f7 to ae62db1 Compare October 27, 2024 17:57
@ShamrockLee
Copy link
Contributor Author

This confuses me. Why are we still using +=" ..." string concatenation for Phases at all? It seems like all hooks use xxxHooks+=(...) array syntax already, so maybe we should just do the same for Phases separately? Seems to be only preFixupPhases in ~8 files right now.

There used to be a lot of *Phase+=" somePhase" before #339117. #339117 also targets *Phases+=(somePhase), but some were not caught up as they are more challenging to discover using regular expression search.

It seems like all hooks use xxxHooks+=(...) array syntax already, so maybe we should just do the same for Phases separately?

When __structuredAttrs = false, The preFixupPhases = [ "foo" "bar" ] will pass to the build script as an environment variable preFixupPhases="foo bar". When someone does preFixupPhases+=("baz"), it will be transparently converted to preFixupPhases+=("foo bar" "baz"). Such value is semantically wrong but tolerated by the setup.sh implementation as it is accessed as "${preFixupPhases[*]}".

We should not rely on such implementation detail of setup.sh, but handle the *Phases correctly with prependToVar and appendToVar.

@ShamrockLee
Copy link
Contributor Author

This is not a full review, but it should be enough for a first iteration.

Thank you for reviewing!

Most importantly I have not touched the question on how to deal with the compatibility things, yet, aside from giving a few more generic thoughts on the matter.

Maybe we can split some of the uncontroversial changes off (everything independent of compatHelpers) and merge separately?

I'll split the easier part into another PR after we reach a consensus about how to add elements to *Phases. Or should we discuss it in the new PR?

@wolfgangwalther
Copy link
Contributor

There used to be a lot of *Phase+=" somePhase" before #339117. #339117 also targets *Phases+=(somePhase), but some were not caught up as they are more challenging to discover using regular expression search.

I wasn't aware of that PR. I see that doing the same for the last remaining *Phases here makes sense. But in that case, this should be a commit doing all of the remaining ones together? (grep -r 'Phases+=' pkgs still shows me ~8 more on your branch)

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Oct 27, 2024

(grep -r 'Phases+=' pkgs still shows me ~8 more on your branch)

Oops! My bad.

I see that doing the same for the last remaining *Phases here makes sense. But in that case, this should be a commit doing all of the remaining ones together? (grep -r 'Phases+=' pkgs still shows me ~8 more on your branch)

I'll gather them into the same commit.

Update: Applied.

@ShamrockLee ShamrockLee force-pushed the build-python-structured-attrs branch 2 times, most recently from 21c6d04 to 9f8fa39 Compare October 27, 2024 19:05
@wolfgangwalther
Copy link
Contributor

wolfgangwalther commented Dec 22, 2024

The current pytestCheckHook implementation turns out to chek for the existence of each path inside disabledTestPaths and throw build-time errors if they doesn't exist, causing the globbed paths to fail.

should we drop the path existence check or add a new flag for the globs? If the latter, how should we name the new attribute?

Ah, interesting. How about we change the existence check to only throw an error if no glob is present?

So globbed paths go through unchecked, while unglobbed paths are checked.

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Dec 22, 2024

How about we change the existence check to only throw an error if no glob is present?

So globbed paths go through unchecked, while unglobbed paths are checked.

Sounds great!

However, it's challenging to differentiate globbed paths from unglobbed ones. Formally speaking, we are trying to detect if a glob matches and only matchs a path identical to itself when an infinite set of paths is tried.

@wolfgangwalther
Copy link
Contributor

However, it's challenging to differentiate globbed paths from unglobbed ones. Formally speaking, we are trying to detect if a glob matches and only matchs a path identical to itself when an infinite set of paths is tried.

Uhm, I thought we'd just check whether a literal * is present in the test path? If yes, then pytest will glob for it. If no, then we check existence.

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Dec 22, 2024

Searching across the PyTest source code, I found that it uses the fnmatch module from the Python standard library to match the globs against a path PyTest collects in former steps.

If we would like to, we could use the glob module from the Python standard library to check with Python if each glob matches some paths in CWD.

@ShamrockLee
Copy link
Contributor Author

ShamrockLee commented Dec 22, 2024

Uhm, I thought we'd just check whether a literal * is present in the test path? If yes, then pytest will glob for it. If no, then we check existence.

This implementation would assume:

  1. Users would only use * in the globs.
  2. The paths to ignore must not contain *, ?, [ or ], or the user will be forced to escape it with [<char>] and fail the path existence check.

@wolfgangwalther
Copy link
Contributor

If we would like to, we could use the glob module from the Python standard library to check with Python if each glob matches some paths in CWD.

That would be best, of course.

Add flag pytestFlags as the new, conforming interface
replacing pytestFlagsArray.

Stop Bash-expanding disabledTests and disabledTestPaths.

Handle disabledTestPaths with `pytest --ignore-glob <path>`
to keep globbing support.
Check if each path glob matches at least one path
using the `glob` module from the Python standard library.

Also make buildPythonPackage and buildPythonApplication
stop escaping the elements of disabledTests and disabledTestPaths.
Handle flags with appendToVar and concatTo.

Stop Bash-expanding elements of
setupPyGlobalFlags and setupPyBuildFlags.
…stically

Take unittestFlags as the new and conforming interface.

Keep unittestFlagsArray as is.
@ShamrockLee ShamrockLee force-pushed the build-python-structured-attrs branch from b213332 to d19e36e Compare December 22, 2024 19:23
@ShamrockLee
Copy link
Contributor Author

@ofborg build python3Packages.conda
@ofborg build python3Packages.tensorflow

@ShamrockLee ShamrockLee mentioned this pull request Dec 23, 2024
13 tasks
@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Dec 31, 2024
Copy link
Contributor

@wolfgangwalther wolfgangwalther left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did the following:

  • went through the PR line-by-line, except for a few nits the changes LGTM.
  • found no remaining uses of pytestFlagsArray or unittestFlagsArray in doc/*.
  • found no other setupPyGlobalFlags or setupPyBuildFlags depending on bash eval or passing spaces.
  • build some random python packages to confirm nothing is obviously broken.

- `disabledTests` and `disabledTestPaths` for `pytestCheckHook`. (`disabledTestPaths` used to be expanded twice before concatenation.)
- `setupPyBuildFlags` and `setupPyGlobalFlags` for `setuptoolsBuildHook`.

- `pytestFlags` and `unittestFlags` replaces `pytestFlagsArray` and `unittestFlagsArray` and become the new and conforming interface.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `pytestFlags` and `unittestFlags` replaces `pytestFlagsArray` and `unittestFlagsArray` and become the new and conforming interface.
- `pytestFlags` and `unittestFlags` replace `pytestFlagsArray` and `unittestFlagsArray` and become the new and conforming interface.


- `pytestFlags` and `unittestFlags` replaces `pytestFlagsArray` and `unittestFlagsArray` and become the new and conforming interface.

- `pytestFlagsArray` and `unittestFlagsArray` are kept for compatibility purposes. They continue to be Bash-expanded before concatenated. Such compatibility layer would get removed in future releases.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- `pytestFlagsArray` and `unittestFlagsArray` are kept for compatibility purposes. They continue to be Bash-expanded before concatenated. Such compatibility layer would get removed in future releases.
- `pytestFlagsArray` and `unittestFlagsArray` are kept for compatibility purposes. They continue to be Bash-expanded before concatenated. This compatibility layer will be removed in future releases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merge conflict somewhere here on that file.

Comment on lines +143 to +144
- `python3Packages.conda` now uses `__structuredAttrs = true`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we don't expect any user-facing changes, I don't think this needs to be mentioned.

Comment on lines +145 to +146
- `python3Packages.tensorflow` now uses `__structuredAttrs = true`.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might break user-side overriding via overrideAttrs. I'm unsure if we need to keep such interface stable anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any change to a derivation could do so, depending on what you do. I'd say breaking override deserves a note / deprecation etc, but breaking overrideAttrs does not.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tensorflow points at the binary package now, but might point at the source-based one by release. Might be better to pick another test package. Agreed that it doesn't need release note.

@wegank wegank added 12.approvals: 1 This PR was reviewed and approved by one reputable person and removed 12.approvals: 1 This PR was reviewed and approved by one reputable person labels Jan 6, 2025
@wolfgangwalther
Copy link
Contributor

FTR: I built conda and tensorflow successfully on x86_64-linux, when applying this set of patches on master. Let's fix the nits / docs / merge-conflict and merge this, I'd say.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 6.topic: python 8.has: changelog 8.has: documentation This PR adds or changes documentation 10.rebuild-darwin: 501+ 10.rebuild-darwin: 5001+ 10.rebuild-darwin-stdenv This PR causes stdenv to rebuild 10.rebuild-linux: 501+ 10.rebuild-linux: 5001+ 11.by: package-maintainer This PR was created by the maintainer of the package it changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants