-
Notifications
You must be signed in to change notification settings - Fork 65
support datalimits and respect area in violin #730
Conversation
New example (included in the docs) using GLMakie
N = 1000
xs = rand(["a", "b", "c"], N)
dodge = rand(1:2, N)
side = rand([:left, :right], N)
colors = map(side) do s
return s == :left ? "orange" : "teal"
end
ys = map(side) do s
return s == :left ? randn() : rand()
end
violin(xs, ys, dodge = dodge, side = side, color = colors, datalimits = extrema) |
src/stats/violin.jl
Outdated
return (rgba.r, rgba.g, rgba.b, rgba.alpha) | ||
end | ||
|
||
sa = StructArray((x = x̂, side = sides, color = colors)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if one should also group by color
here, it is a bit of an odd interface to have to pass a vector of colors that is the same length as a vector of sides. Maybe there could be attributes leftcolor
and rightcolor
instead?
This is mergeable already, apart from an API decision. That is, given that it is better to pass both sides of the violin at the same time (for proper normalization), how do we pass colors for the two sides? It seems to me that this can be done in two possible ways:
Currently this implements 1., but it should be straightforward to port it to 2. (which would be my current preference, but I'm not 100% sure). |
Maybe a tuple of colors for the sides? I'd like to restrict color passing to the |
Thing is, one could pass either a unique color (the same for both sides) or separate colors, so tuples do create some ambiguity (we already use it for transparency). I would be OK with a |
Yeah... and then: color = (left = [:red, ...] , right = [:green, ...]) ? |
Ah, if you want each of the different left violins to be of a different color? Yes, that should work automatically, because the value of |
9905137
to
217c5d2
Compare
xs = rand(1:3, N) | ||
dodge = rand(1:2, N) | ||
side = rand([:left, :right], N) | ||
color = Observable((left = :orange, right = :teal)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Passing directly the named tuple, ie color = (left = :orange, right = :teal)
causes a bug with the theme merging mechanism, that is:
julia> violin(xs, ys, side = side, color = (left=:orange, right=:teal))
ERROR: Type missmatch while merging plot attributes with theme for key: color.
Found Observable{Any} with 0 listeners. Value:
:black in theme, while attributes contains: Attributes with 2 entries:
left => orange
right => teal
I imagine that should be fixed independently in AbstractPlotting?
Can you check how many points the violins have? I remember dialing down density because it was a very high number, leading to CairoMakie slowness. But maybe there's also something slow in the CairoMakie code |
Should be good to merge if CI passes, other than the concern in https://github.com/JuliaPlots/AbstractPlotting.jl/pull/730/files#r628403782, ( |
I also brought down julia> @time display(violin(rand(1:2, 100), rand(100), color = :red));
0.170620 seconds (289.41 k allocations: 14.615 MiB, 63.42% compilation time)
julia> @time display(violin(rand(1:2, 100), rand(100), color = [:red, :blue]));
4.379037 seconds (423.27 k allocations: 18.984 MiB, 2.51% compilation time) In some cases I get over two order of magnitudes difference, I suspect it is a performance bug somewhere. |
then it's probably hitting a mesh path, where it should hit a specialized poly path. i'll check it out |
ok yeah it's definitely hitting mesh, but it seems the time is spent in Cairo.finish, and png is still relatively fast. so an svg thing at its core, although we should avoid the meshes anyway |
oof that was actually a pretty bad bug I think, it probably drew multiple meshes for each poly (which is why the edges were jagged, because of overdrawing antialiasing.. I thought this was just a mesh vs poly artifact) So anyway, with this PR JuliaPlots/CairoMakie.jl#159 the time for this snippet: polys = [decompose(Point2f0,
Circle(Point2f0(i, 0), 0.3)) for i in 1:10]
f, ax, p = poly(polys, color = rand(RGBf0, 10))
@time display(f) goes from |
About the colors, I think I'd prefer the vector of sides, vector of colors approach. That way you can color each density however you want and violin is not the only recipe with a weird named tuple exception for attributes. It could be complicated to treat named tuples in a special way for attribute conversion, as we pass attributes to Axis and Figure that way right now, and there are similar cases all over I think. In the simples case the user passes only one side and one color anyway because most users don't touch array attributes in the beginning. And once they understand the array attributes, vector of sides, vector of colors would not be that difficult. Maybe we could also just call |
Marked as draft cause it seems like we still need to decide the API.
Violin does not yet support horizontal mode, but it definitely should (in which case one no longer really needs
As for how to pass the colors. I have the following doubt. If one has to pass a vector of colors of the same length as the vector of sides, what happens if they don't "match"? That is, if the user passes: violin([1, 1, 1, 1], rand(4), side = [:left, :left, :right, :right], color = [:red, :blue, :red, :blue]) Should one group by the three variables ( Or should the length of the |
I think it's completely fine for them not to match. Why should a user not be able to plot violins of arbitrary sides and colors? They wouldn't in the typical case but that doesn't mean it should be impossible. For example, someone might want to highlight one particular violin. So I think colors and sides should be scalar or the same length as x/y. We can also easily support all three sides types at once and you're right, that kind of makes density obsolete or redundant. We can think of just rerouting density to violin then maybe? Although the default for violin is double sided and vertical for me, while density is one sided and horizontal. |
You are probably right, in that in practice when you're doing a violin, your data will be in a table and it's easy to create these vectors with
I also think that's perfectly fine, what I meant is that, if one goes for "colors and sides should be scalar or the same length as x/y", there is the option that the user may give different colors for the same value of The only ways to do something sensible IMO if the user passes a vector of colors that is not a function of
Option 1. was the initial implementation, so if we are happy with that I can just revert the last commit, otherwise I can revert the last commit and add an extra check.
I feel we'll need to consolidate the API a bit. We have That definitely belong to a separate PR though, as this one is mostly about fixing the rescaling. |
Went for the conservative API (option 2 above), I think it makes the most sense. I've also added an example where all violins have different colors, to show that it is possible. Should be good to go if docs build correctly! |
Looks like docs built fine. Link to new violin page: https://makie.juliaplots.org/previews/PR730/plotting_functions/violin.html |
Nice, I think the conservative choice is good |
Thanks a lot :) |
From a discussion on discourse (https://discourse.julialang.org/t/side-by-side-violin-plots-with-vegalite-jl/60523/2?u=piever) as well as a conversation on slack (copied below to avoid losing it), it seems like user expects violin plots to be renormalized together, in that all densities are rescaled by the same factor to fit in the allocated width.
This PR allows to pass both sides together in order to normalize correctly, and allows "trimming options" (as in
datalimits=(0, Inf)
ordatalimits=extrema
).Main issue is that this way both sides of the violin have to be plotted in the same call (to be able to figure out the correct rescaling factor).
Main issues so far:
poly!
with a vector of colors (as many as there are polygons) seems very slow.GLMakie somehow has trouble drawing the(ups, had the wrong AbstractPlotting checked out, things actually work fine in GLMakie)poly!
with different colors (one per polygon), and draws them transparent, really not sure whycc: @sethaxen @yakir12
Slack chat for posterity