julia PyPlot support missing data #593

Open
deszoeke opened this issue Jul 12, 2024 · 2 comments · May be fixed by #596

Comments

@deszoeke

Passing missing, a first-class value in Julia Base, to matplotlib should simply not plot the missing data. It should handle missing just as it presently handles NaN.

However, passing a missing to matplotlib results in long and unhelpful errors from PyPlot and matplotlib:

plot([0,missing,2])

ERROR: PyError ($(Expr(:escape, :(ccall(#= ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'TypeError'>
TypeError("float() argument must be a string or a real number, not 'PyCall.jlwrap'")
  File "mypath/python3.12/site-packages/matplotlib/pyplot.py", line 3794, in plot
    return gca().plot(
           ^^^^^^^^^^^
  File "/mypath/python3.12/site-packages/matplotlib/axes/_axes.py", line 1781, in plot
    self.add_line(line)
  File "/mypath/python3.12/site-packages/matplotlib/axes/_base.py", line 2339, in add_line
    self._update_line_limits(line)
  File "/mypath/python3.12/site-packages/matplotlib/axes/_base.py", line 2362, in _update_line_limits
    path = line.get_path()
           ^^^^^^^^^^^^^^^
  File "/mypath/python3.12/site-packages/matplotlib/lines.py", line 1037, in get_path
    self.recache()
  File "/mypath/python3.12/site-packages/matplotlib/lines.py", line 679, in recache
    y = _to_unmasked_float_array(yconv).ravel()
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mypath/python3.12/site-packages/matplotlib/cbook.py", line 1398, in _to_unmasked_float_array
    return np.asarray(x, float)
           ^^^^^^^^^^^^^^^^^^^^

Stacktrace:
  [1] pyerr_check
    @ ~/.julia/packages/PyCall/1gn3u/src/exception.jl:75 [inlined]
  [2] pyerr_check
    @ ~/.julia/packages/PyCall/1gn3u/src/exception.jl:79 [inlined]
  [3] _handle_error(msg::String)
    @ PyCall ~/.julia/packages/PyCall/1gn3u/src/exception.jl:96
  [4] macro expansion
    @ ~/.julia/packages/PyCall/1gn3u/src/exception.jl:110 [inlined]
  [5] #107
    @ ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:43 [inlined]
  [6] disable_sigint
    @ ./c.jl:473 [inlined]
  [7] __pycall!
    @ ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:42 [inlined]
  [8] _pycall!(ret::PyObject, o::PyObject, args::Tuple{Vector{Union{Missing, Int64}}}, nargs::Int64, kw::Ptr{Nothing})
    @ PyCall ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:29
  [9] _pycall!
    @ ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:11 [inlined]
 [10] pycall
    @ ~/.julia/packages/PyCall/1gn3u/src/pyfncall.jl:83 [inlined]
 [11] plot(args::Vector{Union{Missing, Int64}}; kws::@Kwargs{})
    @ PyPlot ~/.julia/packages/PyPlot/rWSdf/src/PyPlot.jl:194
 [12] plot(args::Vector{Union{Missing, Int64}})
    @ PyPlot ~/.julia/packages/PyPlot/rWSdf/src/PyPlot.jl:190
 [13] top-level scope
    @ REPL[56]:1

For years I have wrapped my inputs to matplotlib in the helper function

m2n(x) = ismissing(x) ? NaN : x

It is awkward.
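
A minimal sketch of a whole-array variant of this workaround, in plain Julia with no PyPlot dependency. The name missing_to_nan is hypothetical and is not part of PyPlot or PyCall. Unlike m2n, which leaves non-missing integers unconverted, it always returns a Float64 array:

```julia
# Hypothetical helper (not part of PyPlot or PyCall): replace each `missing`
# with NaN and convert every other entry to Float64.
missing_to_nan(x::AbstractArray) = Float64[ismissing(v) ? NaN : v for v in x]

y = missing_to_nan([0, missing, 2])
# y is a Vector{Float64}; its second entry is NaN
```

The result can then be passed to plot in place of the original vector.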

I propose, then, that

  1. missing values are simply not plotted, just like NaNs.
  2. a brief warning alerts the user to missing values in the data.
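
The two points above could be sketched as a preprocessing step like the following. This is only an illustration under stated assumptions: nan_for_missing is a hypothetical name, the warning text is made up, and nothing here reflects how PyPlot is actually implemented.

```julia
# Hypothetical preprocessing step (an assumption, not PyPlot's actual behavior):
# warn once if the data contain `missing`, then return a Float64 array with NaN
# in place of each `missing`.
function nan_for_missing(x::AbstractArray)
    n = count(ismissing, x)
    if n > 0
        @warn "Data contain $n missing value(s); they will not be plotted."
    end
    return Float64[ismissing(v) ? NaN : v for v in x]
end

cleaned = nan_for_missing([1, 2, missing, 2, 3])
```

Data without any missing values passes through without a warning, so existing calls would be unaffected apart from the conversion to Float64.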

The proposed change will make PyPlot more compatible with Julia and Plots.jl, cf.
JuliaPlots/Plots.jl#1706

I expect this will not break existing code in most cases, because missing is simply a newer convention than NaN for representing missing data.

@Alexander-Barth

This would be nice to have. See also this issue here:
JuliaPy/PyCall.jl#616

One can also use masked arrays, as shown in this example, if one wants to avoid promoting integers to floats.
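
To illustrate why the masked-array route avoids the promotion, here is a rough sketch in plain Julia of the data/mask pair that a masked array is built from. The helper name split_missing is hypothetical, and the sketch assumes the array has at least one non-missing element type (an array of only missing would fail at zero(T)):

```julia
# Hypothetical helper: split an array with missings into (data, mask).
# `data` keeps the non-missing element type; `mask` is true where a value
# was missing.
function split_missing(a::AbstractArray)
    T = nonmissingtype(eltype(a))
    data = coalesce.(a, zero(T))   # placeholder zero where values are missing
    mask = ismissing.(a)
    return data, mask
end

data, mask = split_missing([1, 2, missing, 2, 3])
# data stays an integer vector; mask marks the third entry
```

Because the placeholder is zero(T) rather than NaN, integer data stays integer, and the mask alone records which entries to skip.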

@deszoeke

deszoeke commented Dec 9, 2024

Since I see this problem (JuliaPy/PyCall.jl) only when using PyPlot, adding the PyObject method belongs in a PyPlot pull request.

@Alexander-Barth's solution works inside the PyPlot module:

using PyCall
using PyCall: PyObject

# extend PyObject to maskedarray to allow for plotting with missing values
function PyObject(a::Array{Union{T,Missing},N}) where {T,N}
    numpy_ma = PyCall.pyimport("numpy").ma
    pycall(numpy_ma.array, Any, coalesce.(a,zero(T)), mask=ismissing.(a))
end
# test missing support
using PyPlot
x = [missing, 1, 2, 3, 4] 
y = [1, 2, missing, 2, 3]
plot(x, y, marker=".")

@deszoeke deszoeke linked a pull request Dec 10, 2024 that will close this issue
deszoeke added a commit to deszoeke/PyPlot.jl that referenced this issue Dec 10, 2024