Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

import inside py_capture_output breaks delay_load #1565

Closed
gdurif opened this issue Mar 28, 2024 · 6 comments
Closed

import inside py_capture_output breaks delay_load #1565

gdurif opened this issue Mar 28, 2024 · 6 comments

Comments

@gdurif
Copy link

gdurif commented Mar 28, 2024

I am using reticulate::import() with delay_load=TRUE to load a Python package in the .onLoad() function of my R package (so that user can setup their Python session, e.g. using a virtual environment, afterward).

Said Python package is printing to stdout when being imported, which I want to avoid because it is good practice that R package verbosity on loading could be disabled by the user (for quiet loading).

So I tried to wrap my reticulate::import() call inside a reticulate::py_capture_output() call, however the delay loading is not working anymore. Below is a minimum working example:

  • SciPy is not available, direct import fails as expected:
library(reticulate)
scipy <- import("scipy")
## Error in py_module_import(module, convert = convert) : 
##  ModuleNotFoundError: No module named 'scipy'
## Run `reticulate::py_last_error()` for details.
  • SciPy is not available yet but import with delay loading works as expected (e.g. I may call reticulate::use_virtualenv() to setup a Python virtual environment where SciPy is installed afterward):
library(reticulate)
scipy <- import("scipy", delay_load=TRUE)
  • Import with delay loading inside a py_capture_output() call does not work:
library(reticulate)
msg <- py_capture_output({scipy <- import("scipy", delay_load=TRUE)})
## Error in py_module_import(module, convert = convert) : 
##  ModuleNotFoundError: No module named 'scipy'
## Run `reticulate::py_last_error()` for details.

Note: this behavior is maybe expected since reticulate may need to initiate the Python session to call py_capture_output(). In that case, how could I avoid verbosity printing to stdout when importing a Python package during my R package loading ?

Thanks

Edit1: this is a minimum working example, SciPy import produces no output that I would need to capture. My R package is importing another Python package that produces output at import, namely pykeops.

Edit2: this is my R session info:

R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Arch Linux

Matrix products: default
BLAS:   /usr/lib/libblas.so.3.12.0 
LAPACK: /usr/lib/liblapack.so.3.12.0

locale:
 [1] LC_CTYPE=en_IE.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_IE.UTF-8        LC_COLLATE=en_IE.UTF-8    
 [5] LC_MONETARY=en_IE.UTF-8    LC_MESSAGES=en_IE.UTF-8   
 [7] LC_PAPER=en_IE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reticulate_1.35.0

loaded via a namespace (and not attached):
[1] compiler_4.3.2 Matrix_1.6-1.1 cli_3.6.2      Rcpp_1.0.12    grid_4.3.2    
[6] jsonlite_1.8.8 rlang_1.1.3    png_0.1-8      lattice_0.21-9
@t-kalinowski
Copy link
Member

What is the output you are trying to suppress? (import("scipy") produces no output for me).

@gdurif
Copy link
Author

gdurif commented Mar 28, 2024

@t-kalinowski This is a minimum working example, I am not importing SciPy in my package but another Python package. Sorry it was not clear, I edited my issue to specify it.

@t-kalinowski
Copy link
Member

t-kalinowski commented Mar 28, 2024

We don't currently have an official API for this. Thinking though how this could be implemented, an R package could probably do something like this (completely untested code):

.onLoad <- function(...) {
  if (reticulate::py_available())
    reticulate::py_capture_output({
      module <<- reticulate::import("module")
    })
  else {
    py_output_context_manager <- NULL
    module <<- import("module", delay_load = list(
      before_load = function() {
        reticulate::py_available(TRUE) # initialize python
        
        # get output tools helper functions
        output_tools <- import("rpytools.output")
        py_output_context_manager <<-
          output_tools$OutputCaptureContext(capture_stdout = TRUE, 
                                            capture_stderr = TRUE)
        
        py_output_context_manager$`__enter__`()
      }
      on_load = function() {
        py_output_context_manager$`__exit__`()
        py_output_context_manager <<- NULL
      },
      on_error = function(error) {
        py_output_context_manager$`__exit__`()
        py_colected_output <- py_output_context_manager$collect_output()
        py_output_context_manager <<- NULL
        
        error$message <- paste0(c(error$message, py_colected_output), sep = "\n")
        stop(error)
      }
    ))
  }
}

However, taking a look at 'pykeops' specifically, seems like this is all that's needed to suppress output:

Sys.setenv("PYKEOPS_VERBOSE" = "0")

@gdurif
Copy link
Author

gdurif commented Mar 29, 2024

Thanks a lot @t-kalinowski! I'll work on that.

Regarding the environment variable setup, I want to catch the output from Python and print it with packageStartupMessage() in R (so that it can be disabled or not using library(xxx, quietly = TRUE/FALSE)), and not just totally disable it.

@t-kalinowski
Copy link
Member

Worth mentioning if you're going to be setting env variables: if Python is initialized already, changes to environment variables made with Sys.setenv() don't automatically propagate to the Python session. (#1339)

So you'll want to do something like this:

Sys.setenv("PYKEOPS_VERBOSE" = "0")
if(py_available()) 
  import("os")$environ$update(list("PYKEOPS_VERBOSE" = "0"))

I might be mistaken, but I don't think library(xxx, quietly = TRUE/FALSE) suppresses output from packageStartupMessage(), users would need suppressPackageStartupMessages() for that.


With import(delay_load = TRUE), Python will not initialize during your packages .onLoad() call, meaning, you won't be able to capture output from python initializing and pass it packageStartupMessage() before .onLoad() had returned. I believe conventionally packageStartupMessage() is only called from .onLoad(). I don't know if that's just convention or if CRAN enforces it, but either way, I'd recommend a different approach.


I just pushed a small updated to OutputCaptureContext(), making it a little more ergonomic to use. (#1569)

@gdurif
Copy link
Author

gdurif commented Apr 2, 2024

@t-kalinowski thank you very much again.

You are right regarding suppressPackageStartupMessages() and I did not think about delay import that will not work with packageStartupMessage(). I will set the environment variable as you suggested.

@gdurif gdurif closed this as completed Apr 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants