CondaPkg + PythonCall does not behave nicely on read-only filesystems #142

Open
kleinschmidt opened this issue Aug 13, 2024 · 3 comments

kleinschmidt commented Aug 13, 2024

We use read-only filesystems in our Docker containers deployed in k8s as a security measure (required by internal policies). I think this would probably be fine on its own (we bake our Python dependencies in at build time), but when combined with PythonCall we run into trouble. I think the root of it is that PythonCall calls envdir to get the location of the CondaPkg-managed executable, here:

https://github.com/JuliaPy/PythonCall.jl/blob/379f16c43933b5a7eed505adcdb70138a09c6b34/src/C/context.jl#L53-L71

That in turn calls resolve (not mentioned in the docstring!), which creates a pidlock file with no way to disable it or control where it gets written (other than always including CondaPkg as a top-level dependency in every project we containerize this way, even if the only thing that needs it is a PythonCall-using package many layers deep in the stack):

CondaPkg.jl/src/resolve.jl

Lines 527 to 532 in 0c84aac

lock = try
    Pidfile.mkpidlock(lock_file; wait = false)
catch
    @info "CondaPkg: Waiting for lock to be freed. You may delete this file if no other process is resolving." lock_file
    Pidfile.mkpidlock(lock_file; wait = true)
end

I'd hoped that setting offline mode would disable this kind of thing, but no dice: that check isn't reached until after the lockfile has been acquired:

dry_run |= offline()

I'm not totally sure this is an issue with CondaPkg per se, but I can think of a few things that CondaPkg might be able to do to play more nicely with read-only filesystems.

  1. Allow the meta_dir location to be controlled by a preference (then you could use a writeable volume mount in k8s)
  2. Disable the pidlock file in offline mode if no writes are going to take place
  3. Refactor envdir + resolve so that envdir does not call resolve directly but instead updates the environment information/state itself.

I'll also note that activate! (which PythonCall also calls when using the CondaPkg backend) calls envdir again.

EDIT: I just noticed the STATE.frozen check, which provides another way to bail out. I think that might help too, but envdir would still need some mechanism for auto-detecting the environment... #115

Collaborator

cjdoris commented Aug 13, 2024

You can already do something similar to Suggestion 1 with the env preference, which controls where the Conda environment gets put. However, it does not change where the meta_dir itself is put; we could add a similar option for that too.
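
As a minimal sketch of the existing option: the env preference can be supplied through the JULIA_CONDAPKG_ENV environment variable before CondaPkg is loaded. The path below is a hypothetical writable mount, and offline mode is included only because it came up earlier in the thread. Neither setting changes where meta_dir (and so the lock file) is written.

```julia
# Set these before CondaPkg / PythonCall are loaded.
# "/mnt/condapkg/env" is a hypothetical writable volume mount, not a default.
ENV["JULIA_CONDAPKG_ENV"] = "/mnt/condapkg/env"

# Offline mode, as discussed above; this does not disable the pidlock.
ENV["JULIA_CONDAPKG_OFFLINE"] = "yes"

# using CondaPkg  # would pick up the settings above
```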

I'd like to extend both options to treat # specially, similar to LOAD_PATH, e.g. so that JULIA_CONDAPKG_ENV=@# would expand to ~/.julia/conda_environments/{hash}, where {hash} is a unique hash of the full path of the Julia project we're in (top_env in the code). This would mean we still get separate Conda envs for each Julia project without having to store them within the project itself. (pypoetry does a very similar thing when creating virtual environments.)
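
A rough sketch of what that expansion could look like, purely for illustration. The function name expand_env_spec, the depot keyword, and the choice of SHA-1 are assumptions, not anything CondaPkg currently implements.

```julia
using SHA: sha1  # standard library

# Hypothetical helper: expand the proposed "@#" spec into a per-project
# directory under the depot; any other spec is returned unchanged.
function expand_env_spec(spec::AbstractString, top_env::AbstractString;
                         depot::AbstractString = joinpath(homedir(), ".julia"))
    spec == "@#" || return String(spec)
    # Hash the full project path so each Julia project gets its own env.
    digest = bytes2hex(sha1(abspath(top_env)))
    return joinpath(depot, "conda_environments", digest)
end
```

Calling expand_env_spec("@#", top_env) for two different projects gives two different directories, while the same project always maps to the same one.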

Collaborator

cjdoris commented Aug 13, 2024

For Suggestion 2 - I'm reluctant to disable the pidlock in offline mode, because some other process might be running in online mode and write to it. However, we could add an option to explicitly disable it that you could use in tandem with offline mode.
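
One way such an opt-out might be shaped, as a sketch only. The helper with_resolve_lock and its use_lock keyword are hypothetical, and the sketch uses the FileWatching.Pidfile standard library (Julia 1.9+) rather than the Pidfile package that CondaPkg calls.

```julia
using FileWatching.Pidfile: mkpidlock  # stdlib submodule, Julia 1.9+

# Hypothetical helper, not part of CondaPkg: run `f` while holding a pidlock,
# unless locking has been explicitly disabled by the caller.
function with_resolve_lock(f, lock_file::AbstractString; use_lock::Bool = true)
    # With use_lock = false nothing is written, so a read-only filesystem is fine.
    use_lock || return f()
    # The do-block form of mkpidlock releases the lock when `f` returns.
    return mkpidlock(f, String(lock_file))
end
```

With use_lock = false the lock file is never created, which is the behaviour a read-only deployment would want when it already knows no other process is resolving.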

Collaborator

cjdoris commented Aug 13, 2024

For Suggestion 3 - envdir needs to call resolve because it guarantees that the env is resolved when it returns. However, I think the other suggestions should be sufficient to solve your issue.
