-
Notifications
You must be signed in to change notification settings - Fork 544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
prefer system install of UCX in pip devcontainers, update outdated RAPIDS references #6149
prefer system install of UCX in pip devcontainers, update outdated RAPIDS references #6149
Conversation
Contributes to rapidsai/build-planning#118 Proposes the following changes for pip devcontainers: * prefer system installation of ucx to the one provided by the `libucx-cu{11,12}` wheels (ref: rapidsai/devcontainers#421 (comment)) And some other related changes noticed while doing that: * update lingering `24.*` references to `25.02` ## Notes for Reviewers ### How I tested this Relying on CI for most things. Double-checked that `update-version.sh` would have caught the one lingering `24.12` reference like this: ```shell ./ci/release/update-version.sh '25.02.00' git grep -E '24\.' ``` Similar to rapidsai/cuml#6149 Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) URL: #501
I read rapidsai/build-planning#118 to find out why preferring the system library over the one from a wheel is "a good idea". But that issue doesn't really explain it, except it mentions that for most users who don't care about packaging the plan is to use the library from the wheel. Which makes me think "shouldn't we be doing that as devs as well?" - kinda dogfooding, kinda making sure this is the most comfortable path, etc I think the change itself is uncontroversial (the magic of using the env variable once it is set is presumably hidden away somewhere and tested?), so for me it is about understanding "why make this change". |
It looks like the failing tests are all related to dask. I know Dante and friends were talking about there being some issue with dask, so maybe we wait and see? Or someone who knows can chime in :D |
Thanks for the thorough review @betatim . Let me try to address these.
The main goal of rapidsai/build-planning#118 is to change our library-loading logic across RAPIDS such that if you installed wheels, at runtime by default you get the wheel-provided library, NOT any system-provided one that you might have laying around). Doing this makes it safer to use In RAPIDS devcontainers, we want to prefer the system installation because we're producing specific images for different UCX versions. See the
And for other reasons related to how the devcontainers are designed (see next response below).
Yes there would be some benefits to just using the But that design discussion is not specific just to UCX, and is bigger than this series of PRs. That longer-term discussion is happening here: rapidsai/build-planning#119
Yes. For the specific case of UCX (
|
Thanks a lot for taking the time to explain. Sounds reasonable to me. @dantegd any thoughts on the failing (dask) tests? Should we wait for that to get resolved before merging this? |
I just merged latest All tests and the docs builds are failing with this runtime error:
full stacktrace (click me)
Those jobs are getting
Seeing
That should be resolved by merging #6157, which is blocked by the previously-mentioned Dask issues. |
@jameslamb unfortunately there's still a devcontainer pip issue that we are trying to solve in #6170, but the cuml dask failures should be green now though |
/merge |
Thanks for coming back and updating this @bdice ! |
Contributes to rapidsai/build-planning#118
Proposes the following changes for pip devcontainers:
libucx-cu{11,12}
wheels (ref: prefer system installs of UCX libraries when using RAPIDS wheels devcontainers#421 (comment))And some other related changes noticed while doing that:
24.*
references to25.02
Notes for Reviewers
How I tested this
Relying on CI for most things. Double-checked that
update-version.sh
would have caught the one lingering24.12
reference like this: