-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import and extend PosteriorStats #431
base: master
Are you sure you want to change the base?
Conversation
# Compute the change rates. | ||
changerates, mvchangerate = changerate(chains) | ||
|
||
# Summarize the results in a named tuple. | ||
nt = (; zip(names_of_params, changerates)..., multivariate = mvchangerate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lacking a parameter
column meant the show
method was broken. But since there is a changerate for every parameter, it makes more sense to do the same thing as gelmandiag_multivariate
and return a SummaryStats
for the marginal values and return the multivariate changerate separately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Genuine question: what is the change-rate in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From inspecting the code, it's, for each parameter and chain, the fraction of draws that are different from the previous draw. I suppose it's similar to "acceptance rate."
@devmotion @torfjelde This is ready for final review. |
Currently PosteriorStats has no |
Oh, I missed #431 (comment), probably due to the Christmas break. I added the PR to my todo list 🙂 |
@@ -64,7 +64,7 @@ const supportedplots = push!(collect(keys(translationdict)), :mixeddensity, :cor | |||
lags = 0:(maxlag === nothing ? round(Int, 10 * log10(length(range(c)))) : maxlag) | |||
# Chains are already appended in `c` if desired, hence we use `append_chains=false` | |||
ac = autocor(c; sections = nothing, lags = lags, append_chains=false) | |||
ac_mat = convert(Array, ac) | |||
ac_mat = stack(map(stack ∘ Base.Fix2(Iterators.drop, 1), ac)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit inefficient (particular the use of stack(map(stack
) - maybe we should preallocate the desired array and copy directly? Possibly a utility function could be added to PosteriorStats?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may not be the most efficient, but compared to the actual autocor
call, it should not be the computational bottleneck, and not using stack
requires substantially more code.
It might make sense to add to PosteriorStats a function that converts a SummaryStats
to a matrix, but this would effectively do the same thing as the inner stack
. What kind of utility function do you have in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What kind of utility function do you have in mind?
Basically a convert
, Array
/Matrix
constructor, or other function that performs the same thing (if the first two alternatives are not desirable). So that downstream packages (such as MCMCChains) don't have to deal with internals of SummaryStats
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not using any internals of SummaryStats
, whose docstring explains it is an OrderedDict-like Table that can be iterated/indexed by its column names.
I think the methods we're missing in PosteriorStats are
function Base.getindex(stats::SummaryStats, cols::Union{Colon,AbstractVector{Int},AbstractVector{Symbol}})
cols isa Colon && return Tables.matrix(stats)
return stack(Tables.getcolumn(stats, k) for k in cols)
end
Base.firstindex(s::SummaryStats) = 1
Base.lastindex(s::SummaryStats) = length(s)
Then this would be
stack(ac_i[2:end] for ac_i in ac)
but this is no more efficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line is not using any internals of SummaryStats, whose docstring explains it is an OrderedDict-like Table that can be iterated/indexed by its column names.
I was referring to the 2:end
which seems a bit special. I missed that the docstring mentions that the first column is reserved for parameter names.
src/plot.jl
Outdated
ordered = false | ||
) | ||
|
||
chain_dic = Dict(zip(quantile(chains)[:,1], quantile(chains)[:,4])) | ||
chain_dic = Dict(zip(quantile(chains)[2], quantile(chains)[5])) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a good opportunity to make the code a bit more efficient:
chain_dic = Dict(zip(quantile(chains)[2], quantile(chains)[5])) | |
quantile_chains = quantile(chains) | |
chain_dic = Dict(zip(quantile_chains[2], quantile_chains[5])) |
(it would also be nice if it would be possible to use something more descriptive than 2
and 5
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, yeah this and the code immediately below are a bit of a mess. Will rework to use the new methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it was able to be simplified a lot. But also, for n parameters, it was computing the median n^2 times. This has now been fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a few comments!
Am very excited about this though; bloody great stuff @sethaxen ❤️
|
||
Return the highest posterior density interval representing `1-alpha` probability mass. | ||
Return the unimodal highest density interval (HDI) representing `prob` probability mass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the meaning of "unimodal" here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default estimator used by hdi
assumes that the distribution is unimodal. The alternative (not yet in PosteriorStats but doable with HighestDensityRegions.jl) for multimodal distributions first fits a KDE and then partitions into one or more intervals by density. For fast summary statistics, the unimodal version is more useful.
# Compute the change rates. | ||
changerates, mvchangerate = changerate(chains) | ||
|
||
# Summarize the results in a named tuple. | ||
nt = (; zip(names_of_params, changerates)..., multivariate = mvchangerate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Genuine question: what is the change-rate in this context?
Co-authored-by: David Widmann <[email protected]> Co-authored-by: Tor Erlend Fjelde <[email protected]>
…ains.jl into posteriorstats
This is ready for another review. Also, it would be nice to get input on arviz-devs/PosteriorStats.jl#25, since that's relevant for this PR. |
@penelopeysm can you help review this PR, and get it merged? |
I thought arviz-devs/PosteriorStats.jl#25 holds back this PR? |
@devmotion is right, I'd like to merge that breaking change and also some breaking changes to make the After a long break I'm back to working on arviz PRs so I should be able to wrap that up in the next couple of weeks. |
This PR makes the following replacements everywhere:
hpd
->PosteriorStats.hdi
(hpd
retains its old behavior and is now deprecated)summarize
->PosteriorStats.summarize
ChainDataFrame
->PosteriorStats.SummaryStats
The replacement of
ChainDataFrame
and the slight change in API and behavior of the methods makes this a breaking change.Implements #430
e.g.