Vendoring Python Wheels as Artifacts #1439
Comments
This is very interesting and could be extremely useful in ensuring that we have reproducible scientific code when using Python. Have you considered using conda in addition to wheels? There are numerous conda packages that do not have a wheel equivalent. Historically at least, conda has better supported scientific computing packages compared to PyPI, though wheels have made this a lot better for users in recent years. Conda appears to have a similar API as well, but I'm not sure if you would need to bundle a conda binary as well.

Also, I'm not sure how exactly wheels deal with non-Python dependencies. My understanding is that they vendor all dependencies in the wheel itself. So in a situation where, say, one wanted to interface from Julia with a Python package that uses the C API to interface with native libraries that depend on Boost or ZeroMQ or other non-Python libraries: are you suggesting building those non-Python dependencies as separate artifacts?
Note that a subset of the "System design constraints" is already possible with a combination of existing tooling. For PyCall and its downstreams, a fundamental building block we need is package options to configure which Python installation gets used.
Why not use Conda.jl? Also, I think one step missing is resolution of Python package dependencies. Dependency resolution for Python packages is a hard problem because (IIRC) there is no central repository recording the entire dependency graph; you have to download the package to figure out its dependencies. Re-implementing this sounds like a lot of duplicated effort. I think installing Python packages with existing tools like pip, and then snapshotting the result, would avoid duplicating that work.
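The "download before you can resolve" point can be made concrete: a wheel's dependency list lives in the `METADATA` file inside the archive, so there is no way to read it without fetching the package first. A minimal sketch (the `METADATA` text below is a made-up example, not from any real package):

```python
import email

# A wheel's dependencies are declared as Requires-Dist lines in its
# METADATA file (an RFC 822 style document inside the .whl archive),
# so resolution requires downloading the archive first. This METADATA
# text is fabricated for illustration.
metadata_text = """\
Metadata-Version: 2.1
Name: example-pkg
Version: 1.0
Requires-Dist: numpy (>=1.16)
Requires-Dist: requests ; extra == 'http'
"""

msg = email.message_from_string(metadata_text)
deps = msg.get_all("Requires-Dist")
print(deps)
```

In a real resolver, each of those requirement strings triggers another download-and-parse step, which is exactly the recursive work pip already implements.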
We have a pretty good python interop story, but it lacks many of the reproducibility guarantees that Pkg3 has; in particular, when using python packages from the system, or even from Conda.jl, the python dependencies are managed separately from the Julia dependencies, so it is possible for Julia and python packages to get out of sync and break. To resolve this, I propose a technique for creating python "virtual environments" for more fully controlled python installations.

System design constraints:

- `pkg> instantiate` should "just work". No matter how much time has passed, you must be able to get back the same python packages as you had before, so that PyCall, IJulia, etc. can all "just work" far off into the future, no matter how much breaking progress the python ecosystem experiences.
- Upgrading python packages should be simple. Not via `pkg> upgrade`, but through some relatively simple mechanism.
- Isolation from the system. System python packages should not interfere with, or aid, these packages at all.
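The isolation constraint can be prototyped with interpreter flags alone. A sketch (the helper name and the vendored path are hypothetical), using `-s` to drop the user's site-packages while `PYTHONPATH` points only at our vendored directory:

```python
import subprocess, sys

def run_vendored(code, vendored_dir):
    # Hypothetical helper: run `code` in a subprocess whose only extra
    # package source is the vendored artifacts directory. -s disables the
    # user's site-packages, and the stripped-down environment keeps stray
    # system configuration out, approximating "isolation from the system".
    return subprocess.run(
        [sys.executable, "-s", "-c", code],
        env={"PYTHONPATH": vendored_dir},
        capture_output=True, text=True,
    )

# /tmp/vendored is a made-up path standing in for an artifact directory.
res = run_vendored('import sys; print("/tmp/vendored" in sys.path)', "/tmp/vendored")
print(res.stdout.strip())
```

Full isolation (ignoring system site-packages entirely) would additionally need a self-contained interpreter, which is what the `Python_jll` idea below provides.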
Reading this list of design constraints, you might think that this sounds an awful lot like what I've been working towards with JLL packages/Pkg Artifacts, and you would be correct. At least I'm consistent in the kinds of ideas I come up with. Since Artifacts are the 'marteau du jour', as it were, let's recklessly apply them here and see what kind of a system we can create:
- Bundle a python interpreter as an artifact, e.g. `Python_jll`. Not too difficult.
- Translate python packages into artifacts. Something like `translate_py_pkg(name::String, version = nothing)` would hit PyPI's JSON API for a listing of versions, then generate an `Artifacts.toml` entry for that python package by downloading, extracting and tree-hashing it. (We may need `.zip` support for this....)
- Once python packages are being downloaded as artifacts, we set `PYTHONPATH` appropriately before loading `libPython` or invoking `python`, so that these packages are found properly.
- Future invocations of the Julia package manager will see these binary blobs that are attached to the current project, and will properly re-instantiate them from PyPI.
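A rough sketch of the translate step (in Python rather than Julia, for concreteness): the function name and the sample release metadata are invented, and the real `git-tree-sha1` would only be known after downloading and extracting the wheel, since PyPI's JSON API reports the archive's sha256, not a tree hash.

```python
def artifact_entry(name, release_files):
    # Given the "urls" list from PyPI's JSON API
    # (https://pypi.org/pypi/<name>/<version>/json), pick a wheel and emit
    # an Artifacts.toml-style entry. The tree hash is left as a placeholder:
    # computing it requires downloading and extracting the archive.
    wheel = next(f for f in release_files if f["filename"].endswith(".whl"))
    return (
        f'[{name}]\n'
        f'git-tree-sha1 = "<fill in after download + extract>"\n'
        f'\n'
        f'    [[{name}.download]]\n'
        f'    url = "{wheel["url"]}"\n'
        f'    sha256 = "{wheel["digests"]["sha256"]}"\n'
    )

# Fabricated release metadata, shaped like a PyPI JSON API response.
files = [{
    "filename": "toolz-0.10.0-py3-none-any.whl",
    "url": "https://example.invalid/toolz-0.10.0-py3-none-any.whl",
    "digests": {"sha256": "deadbeef"},
}]
print(artifact_entry("toolz", files))
```

Selecting *which* wheel (platform tags, ABI tags) is glossed over here; the real implementation would need to match tags the same way pip does.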
There's some subtlety here related to the implicit Python compiler ABI. In particular, on Windows, Python assumes usage of MSVC, which is fine, except when you start compiling C++ code. It's highly unlikely that Python wheels that contain C++ code will link properly to Julia. This has never worked and probably never will, though, so we don't lose much here. C and FORTRAN code should work together just fine, so we should be okay for 95% of what we want to do, and if you want to do something more complicated, you can always just spin up a properly compiled Python interpreter and communicate over a socket.
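That escape hatch amounts to out-of-process communication; a minimal sketch, where stdin/stdout pipes stand in for the socket and the tiny eval-based protocol is invented purely for illustration:

```python
import json, subprocess, sys

# Spawn a separate interpreter (in the real scenario, one compiled with
# the right ABI) and exchange JSON requests/replies with it, instead of
# loading libPython in-process. Pipes stand in for a socket here.
worker = subprocess.Popen(
    [sys.executable, "-c",
     "import sys, json\n"
     "for line in sys.stdin:\n"
     "    req = json.loads(line)\n"
     "    print(json.dumps({'result': eval(req['expr'])}), flush=True)\n"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True,
)
worker.stdin.write(json.dumps({"expr": "2 + 3"}) + "\n")
worker.stdin.flush()
reply = json.loads(worker.stdout.readline())
worker.stdin.close()
worker.wait()
print(reply)
```

The cost is serialization overhead on every call, which is why in-process `libPython` loading remains the preferred path for the common C/FORTRAN case.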