Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Call the kaleido binary directly (fixes permission errors on Windows and Julia 1.10) #17

Merged
merged 10 commits into from
Mar 3, 2024

Conversation

disberd
Copy link
Member

@disberd disberd commented Feb 28, 2024

Since 1.10 there have been various issue with PlotlyKaleido on windows.
One of the encountered issues seems to be related to artifacts permissions.

I had some tests on my windows machine with julia 1.10.1 where Kaleido fails to start due to a permission error:

julia> using Kaleido_jll

julia> run(`$(joinpath(Kaleido_jll.artifact_dir, "kaleido.cmd")) plotly`)
Access is denied.
The system cannot find the path specified.
ERROR: failed process: Process(`'C:\Users\Alberto.Mengali\.julia\artifacts\7914a56da888d6a06d00c87f97e873c60e97acc7\kaleido.cmd' plotly`, ProcessExited(1)) [1]

Stacktrace:
 [1] pipeline_error
   @ .\process.jl:565 [inlined]
 [2] run(::Cmd; wait::Bool)
   @ Base .\process.jl:480
 [3] run(::Cmd)
   @ Base .\process.jl:477
 [4] top-level scope
   @ REPL[11]:1

I noticed that I only get permission denied if trying to execute the kaleido.cmd script, and not the kaleido executable directly.

This is the content of kaleido.cmd on windows:

@echo off
setlocal
chdir /d "%~dp0"
.\bin\kaleido.exe %*

which does not seem to do anything really useful for our use case where we can point directly to the path of the executable.

This PR changes the path of the bin used by PlotlyKaleido to point to kaleido.exe directly instead of going through the script file above, hoping it will fix this error with permission on windows

@disberd
Copy link
Member Author

disberd commented Feb 28, 2024

So the weirdness of kaleido on windows persists.

We have two independent problems which do not always appear when trying to use Kaleido on windows:

  1. The first is a problem with permission denied while trying to execute the kaleido.cmd. This is most likely caused by artifacts being read-only in Windows since 1.10 (Permissions issues with artifacts on Windows in 1.10-rc1 (and betas) - Access is denied JuliaLang/julia#52272).
  2. The second problem is that on some windows machine, version 0.2.1 of kaleido.exe just hangs making it effectively unusable, as the package will simply wait forever when reaching this line:
    res = readline(P.stdout) # {"code": 0, "message": "Success", "result": null, "version": "0.2.1"}
    This problem has multiple issues both on the Kaleido library itself and in discourse/other packages (see Image generation hangs on windows 10 plotly/Kaleido#110 for the most relevant issue in the main Kaleido repo).
    An often suggested solution to fix this problem is to downgrade Kaleido to v0.1.0

This PR tries to fix the first problem, but unfortunately effectively fails to do so because the kaleido executable looks for files using relative paths and assuming that the executable is ran from the kaleido folder (specifically, it looks for ./js/kaleido_scopes.js).

For this reason the current PR does not fix the problem as the library (kaleido.exe) called directly will search for this file in the folder where the process is ran.

I tried changing to the artifact folder before running the process but unfortunately I can't even cd to the artifact folder with the new read-only permissions of artifacts :(

@montyvesselinov
Copy link

The problem is that windows does not allow you to chdir to the cmd folder because it is read only. There should be a way to avoid doing chdir in to read-only folder.

@disberd
Copy link
Member Author

disberd commented Feb 29, 2024

I did try just creating a temporary folder to cd to where I only copy the contents of artifact_dir/js/* (which is only one file), and the version file, which I assume is only used to parse the current version of the library, but I had problems on the tests (probably also because I had to revert to [email protected] as 0.2.1 has the other issue of hanging).

I can try pushing a version that creates this temporary folder that contains the required local dependency (two very small files). It is an ugly workaround but hopefully the permissions of artifact in 1.10 will get fixed down the line and it seems to be the only workable solution to make windows work in the meantime

@disberd
Copy link
Member Author

disberd commented Mar 1, 2024

A positive update on the matter.

I realized that the Cmd objects in julia also have a documented field dir for specifying from which directory the command should be ran.
Based on the guidelines on appropriate ways to execute artifacts binaries in this comment, I modified the way kaleido is called (on all systems) to just using the binary command provided by the kaleido() function exported by Kaleido_jll and execute it explicitly from the artifact directory.

This seems to work fine in all systems even with read-only permission in 1.10 and seems to me like the cleanest solution.

I also encountered issues with the @test_nowarn PlotlyKaleido.start() on Windows so I removed that (reverting to only PlotlyKaleido.start()) on windows. Not sure what was that test aimed at.

Finally I updated the readme to hopefully make people more aware of possible the issue of the kaleido binary hanging on windows for version 0.2.1 of the kaleido library.

Edit: There were also other errors yesterday caused by an issue on PlotlyLight that was promptly fixed so now everything is finally green :D

@disberd disberd changed the title point to kaleido.exe directly on windows Call the kaleido binary directly (fixes permission errors on Windows and Julia 1.10) Mar 1, 2024
@disberd
Copy link
Member Author

disberd commented Mar 1, 2024

I think this is now ready for review @BeastyBlacksmith

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was the error message here?

Copy link
Member Author

@disberd disberd Mar 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess you are referring to the removal of @test_nowarn, you can check the error in the latest CI on windows here but I will also paste it here:

Error
Start: Error During Test at D:\a\PlotlyKaleido.jl\PlotlyKaleido.jl\test\runtests.jl:9
  Got exception outside of a @test
  IOError: unlink("C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\jl_WJuNMdSL23"): resource busy or locked (EBUSY)
  Stacktrace:
    [1] uv_error
      @ .\libuv.jl:100 [inlined]
    [2] unlink(p::String)
      @ Base.Filesystem .\file.jl:978
    [3] rm(path::String; force::Bool, recursive::Bool)
      @ Base.Filesystem .\file.jl:283
    [4] macro expansion
      @ C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Test\src\Test.jl:906 [inlined]
    [5] macro expansion
      @ D:\a\PlotlyKaleido.jl\PlotlyKaleido.jl\test\runtests.jl:13 [inlined]
    [6] macro expansion
      @ C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Test\src\Test.jl:1577 [inlined]
    [7] top-level scope
      @ D:\a\PlotlyKaleido.jl\PlotlyKaleido.jl\test\runtests.jl:13
    [8] include(fname::String)
      @ Base.MainInclude .\client.jl:489
    [9] top-level scope
      @ none:6
   [10] eval
      @ .\boot.jl:385 [inlined]
   [11] exec_options(opts::Base.JLOptions)
      @ Base .\client.jl:291
   [12] _start()
      @ Base .\client.jl:552
Test Summary: | Pass  Error  Total  Time
Start         |    1      1      2  6.9s
ERROR: LoadError: Some tests did not pass: 1 passed, 0 failed, 1 errored, 0 broken.
in expression starting at D:\a\PlotlyKaleido.jl\PlotlyKaleido.jl\test\runtests.jl:9
┌ Warning: Failed to clean up temporary path "C:\\Users\\RUNNER~1\\AppData\\Local\\Temp\\jl_WJuNMdSL23"
│ Base.IOError("unlink(\"C:\\\\Users\\\\RUNNER~1\\\\AppData\\\\Local\\\\Temp\\\\jl_WJuNMdSL23\"): resource busy or locked (EBUSY)", -4082)
└ @ Base.Filesystem file.jl:549
ERROR: LoadError: Package PlotlyKaleido errored during testing
Stacktrace:
 [1] pkgerror(msg::String)
   @ Pkg.Types C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\Types.jl:70
 [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
   @ Pkg.Operations C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\Operations.jl:2018
 [3] test
   @ C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\Operations.jl:1899 [inlined]
 [4] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Vector{String}, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::@Kwargs{io::IOContext{Base.PipeEndpoint}})
   @ Pkg.API C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\API.jl:444
 [5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{Base.PipeEndpoint}, kwargs::@Kwargs{coverage::Bool, julia_args::Vector{String}, force_latest_compatible_version::Bool})
   @ Pkg.API C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\API.jl:159
 [6] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::@Kwargs{coverage::Bool, julia_args::Vector{String}, force_latest_compatible_version::Bool})
   @ Pkg.API C:\hostedtoolcache\windows\julia\1.10.2\x64\share\julia\stdlib\v1.10\Pkg\src\API.jl:174
 [7] top-level scope
   @ D:\a\_actions\julia-actions\julia-runtest\v1\test_harness.jl:15
 [8] include(fname::String)
   @ Base.MainInclude .\client.jl:489
 [9] top-level scope
   @ none:1
in expression starting at D:\a\_actions\julia-actions\julia-runtest\v1\test_harness.jl:7
Error: Process completed with exit code 1.

I was also quite baffled by this error and I kept getting it also locally. But calling PlotlyKaleido.start() without @test_nowarn does not produce the warning triggering the error..

At some point I thought it might be some quirk of downgrading kaleido_jll as part of the runtest file but at least locally I tried setting the Project compat on Kaleido_jll to version 0.1 but I still kept getting that error with @test_nowarn

@disberd
Copy link
Member Author

disberd commented Mar 2, 2024

I also added a small function to avoid blocking the process forever when finding the situation in #13 and throwing an error with some suggested fix if the process seems to hang more than 5 seconds.

This should actually fix #13

The PR itself should also fix #10

@BeastyBlacksmith
Copy link
Member

Thanks for all the detective work! ❤️
Would you be interested in becoming a maintainer?

@BeastyBlacksmith BeastyBlacksmith merged commit 8a36dab into JuliaPlots:main Mar 3, 2024
6 checks passed
@disberd
Copy link
Member Author

disberd commented Mar 3, 2024

Thanks for all the detective work! ❤️
Would you be interested in becoming a maintainer?

Thanks @BeastyBlacksmith, yeah I'd be happy to give a hand as maintainer when possible

@montyvesselinov
Copy link

There still seems to be a problem. Do we still have this error message?!

julia> import PlotlyJS
ERROR: InitError: It looks like the kaleido process is hanging.
If you are on windows this might be caused by known problems with Kaleido v0.2 on windows.
You might want to try forcing a downgrade of the kaleido library to 0.1.
Check the Package Readme at https://github.com/JuliaPlots/PlotlyKaleido.jl/tree/main#windows-note for more details
Stacktrace:
  [1] error(s::String)
    @ Base .\error.jl:35
  [2] readline_noblock
    @ C:\Users\monty\.julia\packages\PlotlyKaleido\aujjS\src\PlotlyKaleido.jl:52
  [3] #start#6
    @ C:\Users\monty\.julia\packages\PlotlyKaleido\aujjS\src\PlotlyKaleido.jl:128
  [4] start
    @ C:\Users\monty\.julia\packages\PlotlyKaleido\aujjS\src\PlotlyKaleido.jl:59 [inlined]
  [5] __init__()
    @ PlotlyJS C:\Users\monty\.julia\packages\PlotlyJS\b6MbQ\src\PlotlyJS.jl:104
  [6] run_module_init(mod::Module, i::Int64)
    @ Base .\loading.jl:1134
  [7] register_restored_modules(sv::Core.SimpleVector, pkg::Base.PkgId, path::String)
    @ Base .\loading.jl:1122
  [8] _include_from_serialized(pkg::Base.PkgId, path::String, ocachepath::String, depmods::Vector{Any})
    @ Base .\loading.jl:1067
  [9] _require_search_from_serialized(pkg::Base.PkgId, sourcepath::String, build_id::UInt128)
    @ Base .\loading.jl:1581
 [10] _require(pkg::Base.PkgId, env::String)
    @ Base .\loading.jl:1938
 [11] __require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1812
 [12] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [13] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [14] _require_prelocked(uuidkey::Base.PkgId, env::String)
    @ Base .\loading.jl:1803
 [15] macro expansion
    @ .\loading.jl:1790 [inlined]
 [16] macro expansion
    @ .\lock.jl:267 [inlined]
 [17] __require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1753
 [18] #invoke_in_world#3
    @ .\essentials.jl:926 [inlined]
 [19] invoke_in_world
    @ .\essentials.jl:923 [inlined]
 [20] require(into::Module, mod::Symbol)
    @ Base .\loading.jl:1746
during initialization of module PlotlyJS

@disberd
Copy link
Member Author

disberd commented Mar 3, 2024

Is this with Kaleido_jll version 0.1?

What I didn't do in this PR is lower the Kaleido_jll compat to 0.1 as that would downgrade version also for Linux and Mac where it is not necessary

@montyvesselinov
Copy link

montyvesselinov commented Mar 4, 2024 via email

@disberd
Copy link
Member Author

disberd commented Mar 4, 2024

No. I am using 0.2. Should I switch to 0.1?! I think the goal was to make
0.2 to work.

No on windows (where it has the hanging issue) unfortunately that is a problem that apparently requires fixing the kaleido C library itself, which does not seem to be feasible in the near term (compiling kaleido locally is not for the faint of heart based on the repo Readme and having a PR reviewed, merged and released doesn't seem to be possible in a short term given the recent history of the library)

What this PR did was fix permission errors (a different problem) and avoid stalling the process forever by throwing an error suggesting to downgrade to 0.1 on windows.

@montyvesselinov
Copy link

Ok. However, throwing an error in this case is a problem for me. We use features in PlotlyJS that are unrelated to Kaleido. We develop a commercial code. this error message kills compilation processes. It also creates confusion.

Can we suppress the error message? Can we replace it with a warning?

@disberd
Copy link
Member Author

disberd commented Mar 4, 2024

@montyvesselinov I see your point and while I do want your problem to be fixed, I do not think that not throwing an error here is the correct solution. I believe that Kaleido should not be started by default in PlotlyJS if it's not being used by the user (as it's your case).
But maybe there are sensible reasons why this is the case now.
I opened an issue on PlotlyJS directly and hopefully we'll be able to converge to a solution soon.

Notice that you may momentarily fix the precompilation error by forcing v0.1 of the Kaleido_jll library in the project you are using PlotlyJS in. If you are not gonna use it anyhow it won't matter for the results, and it won't even impact your loading TTFX time as the Kaleido_jll is loaded anyhow when starting PlotlyJS at the moment

@montyvesselinov
Copy link

Thank you! PlotlyKaleido should not cause PlotlyJS to fail. Or there should be an option to disable PlotlyKaleido if PlotlyKaleido fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PlotlyKaleido.start() sometimes hangs on windows Kaleido startup failed
3 participants