Conda support #18

Closed
cnuernber opened this issue Dec 11, 2019 · 11 comments
Labels: harder (Harder issue), help wanted (Extra attention is needed)
Milestone: environments

Comments

@cnuernber
Collaborator

We need help with this. It is possible; it is a matter of figuring out the changes that conda makes to the system and which environment variables the conda python shim sets up.

This is very important for the newcomer experience, so if you know anyone who knows how conda works, or you want to spend some time working with conda and libpython-clj, please jump right in.

@cnuernber added the "help wanted" (Extra attention is needed) and "harder" (Harder issue) labels Dec 11, 2019
@cnuernber added this to the environments milestone Dec 11, 2019
@cnuernber
Collaborator Author

cnuernber commented Dec 11, 2019

The best workaround I have so far:

  1. Activate your environment - conda activate envtest
  2. Export python home to the config prefix - export PYTHONHOME=$(python3-config --prefix)

3a. Export JVM_OPTS so leiningen picks them up:

JVM_OPTS="-Djava.library.path=$(python3-config --prefix)/lib" lein repl

3b. Update your shared library path in a similar way to 3a:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:"$(python3-config --prefix)/lib"

3c. Explicitly load the library:

user> (require '[libpython-clj.python :as py])
nil
user> (require '[clojure.java.shell :as sh])
nil
user> (sh/sh "python3" "--version")
{:exit 0, :out "", :err "Python 3.6.9 :: Anaconda, Inc.\n"}
user> (System/getenv "PYTHONPATH")
nil
user> (System/getenv "PYTHONHOME")
"/home/chrisn/.conda/envs/envtest"
user> (py/initialize! :library-path (format "%s/lib/libpython3.6m.so" 
                                            (System/getenv "PYTHONHOME")))
:ok
user> (py/import-module "numpy")
<module 'numpy' from '/home/chrisn/.conda/envs/envtest/lib/python3.6/site-packages/numpy/__init__.py'>
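
The values set by hand in the workaround above (the prefix, its lib directory, and the full path to the libpython shared object) can also be read from the activated interpreter itself. The following is only a sketch of that idea, not how libpython-clj does it: `describe_python` is a hypothetical helper, and it assumes a POSIX build where `sysconfig` reports `LIBDIR` and `LDLIBRARY` (on some platforms these are missing, or `LDLIBRARY` names a static archive).

```python
import os
import sys
import sysconfig


def describe_python(prefix=None, libdir=None, ldlibrary=None):
    """Return the values the workaround exports by hand.

    Arguments default to what the running interpreter reports; they are
    parameters only so the function can be exercised with fixed values.
    """
    prefix = prefix or sys.prefix
    # LIBDIR / LDLIBRARY come from the interpreter's build configuration and
    # may be None on some platforms, so fall back to <prefix>/lib.
    libdir = libdir or sysconfig.get_config_var("LIBDIR") or os.path.join(prefix, "lib")
    ldlibrary = ldlibrary or sysconfig.get_config_var("LDLIBRARY")
    library_path = os.path.join(libdir, ldlibrary) if ldlibrary else None
    return {"python_home": prefix, "lib_dir": libdir, "library_path": library_path}


if __name__ == "__main__":
    # Run with the conda environment's python3 to see that environment's values.
    for key, value in describe_python().items():
        print(f"{key}={value}")
```

Run with the environment's own `python3`, the `library_path` value is a candidate for the `:library-path` argument shown above, and `python_home` plays the role of `PYTHONHOME`.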

@cnuernber
Collaborator Author

Conda now works out of the box in a test, with none of the environment variable settings above. Leaving the issue open for a bit to get more feedback.

@klausharbo

klausharbo commented Dec 12, 2019

I added support for conda-installing Clojupyter earlier this year. I'm not sure exactly what you'll be trying to accomplish, but some of it might be useful. Even though I only use a few of the environment variables, I added access to all of them in env.clj.

-Klaus.

@cnuernber
Collaborator Author

@klausharbo - Great to hear from you!

What we needed to do was find the Python shared libraries when conda is activated and initialize them with enough information that we can load the same Python modules as you would be able to load were you to activate the environment and run the Python REPL.

I am having a bit of trouble fully comprehending the file you linked to. Are these variables that you expect to find in your environment when conda installs Clojupyter, I guess as part of the install process?

@klausharbo

klausharbo commented Dec 13, 2019

I was probably too fast in posting - sorry for being obtuse. :-) I did struggle quite a bit in getting to grips with Conda, and I'm not sure I understand it very well even now, but I can share what I found.

First off, I found the Conda Deep Dive video helpful in explaining the fundamental concepts. There's quite a bit of Conda documentation, but I have found it hard to understand - it simply doesn't always seem to address the questions I have. It may very well be because I don't really know Python and so the finer points are lost on me.

My notes recapping some of the early slides in the video:

Package

  • index.json
    • The source of truth for things like package name and version.
    • Without this, it is not a conda package. Used to build
      repodata.json.
  • info/paths.json
    • Per-path meta-data
  • info/link.json
    • Install-time meta-data. Optional.
    • Special instructions on how to install package
    • Description of whole package, not per-path
  • info/repodata_record.json
    • Not part of package
    • Created immediately after conda extracts the package in the
      package cache
    • Looks a lot like index.json but has extra information:
      channel, MD5 of tar, origin URL

Channel

  • Example: https://repo.anaconda.org/pkgs/main
  • A channel is a collection of packages
  • Channels have 'subdirs'
  • Groups of platform-specific packages: linux-64, osx-64, win-64
  • noarch is the universal and common subdir among all platforms
    • Used to detect that a channel is in fact a channel
  • Each subdir has a repodata.json file: a manifest of the packages in
    the subdir
  • New type of file: channeldata.json
    • Sits in channel directory, describes channel
  • Channel URL
    • scheme, auth, location, token, name, subdir, package_filename
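
To make the channel layout concrete: each subdir's manifest sits at a predictable place under the channel URL, and noarch is consulted alongside the platform subdir. The sketch below only builds those URLs from the pieces named in the notes; `repodata_url` and `repodata_urls` are hypothetical names, and nothing is fetched.

```python
def repodata_url(channel_url, subdir):
    """URL of the repodata.json manifest for one subdir of a channel."""
    return f"{channel_url.rstrip('/')}/{subdir}/repodata.json"


def repodata_urls(channel_url, platform_subdir):
    """Manifests relevant to one platform: its own subdir plus noarch."""
    return [repodata_url(channel_url, s) for s in (platform_subdir, "noarch")]
```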

Environment

  • Based on (uses) UNIX FHS (Filesystem Hierarchy Standard)
  • conda-meta directory: Defines that a directory is a conda environment
  • conda-meta/history lists all operations performed in the environment since its creation
  • JSON file for every package in the environment, contains all metadata about the package
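
The conda-meta convention makes it easy to test whether a directory is an environment and to see what is installed in it. A minimal sketch, assuming only what the notes above state (a conda-meta directory with one JSON file per package); the function names are hypothetical.

```python
from pathlib import Path


def is_conda_environment(prefix):
    """A directory counts as a conda environment if it has a conda-meta directory."""
    return (Path(prefix) / "conda-meta").is_dir()


def package_record_names(prefix):
    """Names of the per-package JSON records, without the .json suffix."""
    meta = Path(prefix) / "conda-meta"
    return sorted(p.stem for p in meta.glob("*.json"))
```

Note that conda-meta/history is not a JSON file, so the glob skips it.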

The obscure env.clj I linked to above is based on conda-build Environment Variables. On second thought I don't think it's very relevant here: Conda uses environment variables to control what's going on, but these have to do with building a package (like Clojupyter), whereas what I think you guys are trying to do is use Python in a Conda-controlled context.

For that I suspect the key environment variable is CONDA_PREFIX, which is set to point to the active environment.

In my setup right now (I recently switched to Miniconda since I don't need all things included in Anaconda):

(base) locke:~/lab/clojupyter
08:54 > conda info --envs
# conda environments:
#
                         /Users/klaush/anaconda3
base                  *  /Users/klaush/miniconda3
env1                     /Users/klaush/miniconda3/envs/env1
env2                     /Users/klaush/miniconda3/envs/env2

(base) locke:~/lab/clojupyter
08:54 >

where we have

(base) locke:~/lab/clojupyter
08:54 > echo $CONDA_PREFIX
/Users/klaush/miniconda3
(base) locke:~/lab/clojupyter
08:55 >

but doing

08:55 > conda activate env1
(env1) locke:~/lab/clojupyter
08:56 >

means we get

08:56 > echo $CONDA_PREFIX
/Users/klaush/miniconda3/envs/env1
(env1) locke:~/lab/clojupyter
08:56 >
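
The same check can be made from a program instead of the shell. A minimal sketch, assuming CONDA_PREFIX is the only signal used; `active_conda_prefix` is a hypothetical helper that takes the environment as a mapping so it can be tried with fixed values.

```python
import os


def active_conda_prefix(environ=None):
    """Return the active environment's prefix, or None if none is activated."""
    environ = os.environ if environ is None else environ
    # Treat an empty string the same as an unset variable.
    return environ.get("CONDA_PREFIX") or None


if __name__ == "__main__":
    print(active_conda_prefix())
```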

Depending on how much detail you need, I suspect $CONDA_PREFIX/conda-meta/index.json could be interesting (mentioned under Package above). (Edited: strikethrough)

@klausharbo

It doesn't look like my last comment about $CONDA_PREFIX/conda-meta/index.json is correct. index.json is a package concept, not an environment one :-(.

@cnuernber
Collaborator Author

Your perspective is correct; we are working with the runtime environment, not attempting to build a package.

@cnuernber
Collaborator Author

Next problem:

:ok
libpython-clj.python> (def mxnet (import-module "mxnet"))
Execution error at libpython-clj.python.interpreter/check-error-throw (interpreter.clj:394).
Traceback (most recent call last):
  File "/home/chrisn/.conda/envs/pyclj/lib/python3.6/site-packages/mxnet/__init__.py", line 24, in <module>
    from .context import Context, current_context, cpu, gpu, cpu_pinned
  File "/home/chrisn/.conda/envs/pyclj/lib/python3.6/site-packages/mxnet/context.py", line 24, in <module>
    from .base import classproperty, with_metaclass, _MXClassPropertyMetaClass
  File "/home/chrisn/.conda/envs/pyclj/lib/python3.6/site-packages/mxnet/base.py", line 213, in <module>
    _LIB = _load_lib()
  File "/home/chrisn/.conda/envs/pyclj/lib/python3.6/site-packages/mxnet/base.py", line 204, in _load_lib
    lib = ctypes.CDLL(lib_path[0], ctypes.RTLD_LOCAL)
  File "/home/chrisn/.conda/envs/pyclj/lib/python3.6/ctypes/__init__.py", line 348, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: libmkldnn.so.0: cannot open shared object file: No such file or directory   

@cnuernber
Collaborator Author

Solved by using LD_LIBRARY_PATH in the lein repl startup script.
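
For illustration, a launcher could compute that setting before starting the REPL. This is a sketch under assumptions, not the actual startup script: `with_conda_lib_path` is a hypothetical helper, it assumes the missing library lives in `$CONDA_PREFIX/lib`, and it appends that directory as in step 3b of the earlier workaround.

```python
import os


def with_conda_lib_path(environ):
    """Return a copy of environ with $CONDA_PREFIX/lib on LD_LIBRARY_PATH."""
    env = dict(environ)
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return env  # no active conda environment, nothing to add
    lib_dir = os.path.join(prefix, "lib")
    # Drop empty entries: the dynamic loader treats them as the current directory.
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    if lib_dir not in entries:
        entries.append(lib_dir)
    env["LD_LIBRARY_PATH"] = os.pathsep.join(entries)
    return env


# Usage (not executed here):
#   import subprocess
#   subprocess.run(["lein", "repl"], env=with_conda_lib_path(os.environ))
```

The variable has to be set before the JVM starts, because the dynamic loader reads it at process startup; that is why it belongs in the launch script rather than in the REPL session.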

@cnuernber
Collaborator Author

This (LD_LIBRARY_PATH) is as good as we can get:

For a working pathway, use the conda docker container.


2 participants