-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add mean et al. for truncated log normal #1874
base: master
Are you sure you want to change the base?
Conversation
My worry (that was also expressed in issues such as #968) is that generally numerical integration is challenging and a fallback might lead to silently incorrect results. It seems such a fallback would be wrong (or at least problematic) e.g. if the moments are not finite (such as e.g. for So my general feeling is that numerical integration should maybe be restricted to a smaller subset of distributions, or maybe even only be available as a separate function. In case we want to use it more broadly, I think it would also be safer to error if the integration error estimate is too large, to reduce the probability of silently incorrect results. |
@devmotion Is there any numerical integration in this code? |
@devmotion, perhaps you're thinking of #1875? As @PaulSoderlind noted, there's no integration here. |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1874 +/- ##
==========================================
- Coverage 85.99% 85.71% -0.29%
==========================================
Files 144 145 +1
Lines 8666 8706 +40
==========================================
+ Hits 7452 7462 +10
- Misses 1214 1244 +30 ☔ View full report in Codecov by Sentry. |
Oh yes, indeed, it seems I commented on the wrong PR. |
CI failures on nightly are unrelated. |
src/truncated/lognormal.jl
Outdated
a = d.lower === nothing ? nothing : log(T(minimum(d))) | ||
b = d.upper === nothing ? nothing : log(T(maximum(d))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to use minimum
/maximum
instead of d.lower
/d.upper
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like maybe I had a reason but who knows what it was
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I remember now: it was to handle truncation points outside of the support, e.g. truncated(LogNormal(); lower=-1)
. In that case, d.lower == -1
but minimum(d) == 0
. I should add a test for that.
src/truncated/normal.jl
Outdated
@@ -118,6 +118,36 @@ function entropy(d::Truncated{<:Normal{<:Real},Continuous}) | |||
0.5 * (log2π + 1.) + log(σ * z) + (aφa - bφb) / (2.0 * z) | |||
end | |||
|
|||
function mgf(d::Truncated{<:Normal{<:Real},Continuous}, t::Real) | |||
T = promote_type(partype(d), typeof(t)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess
T = promote_type(partype(d), typeof(t)) | |
T = float(promote_type(partype(d), typeof(t))) |
would be more correct.
src/truncated/lognormal.jl
Outdated
return @horner(m1, m4, -4m3, 6m2, 0, -3) / v^2 - 3 | ||
end | ||
|
||
median(d::Truncated{<:LogNormal}) = exp(median(_truncnorm(d))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This suggests we should add a similar definition for quantile
as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh good call, it turns out that this relation holds in general, not just for the median.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, defining median
/quantile
is unnecessary as there's already a fallback definition for computing quantiles for truncated distributions that would be more efficient than the additional layer of indirection involved with constructing a truncated normal.
function var(d::Truncated{<:LogNormal}) | ||
tn = _truncnorm(d) | ||
# Ensure the variance doesn't end up negative, which can occur due to numerical issues | ||
return max(mgf(tn, 2) - mgf(tn, 1)^2, 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Repeated evaluation of mgf
involves repeated calculations of the same (intermediate) quantities. But my fear is that optimizing this further will lead to less readable code...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I had the same thought and wasn't sure what to do about it so I just... didn't address it, haha
9d417c7
to
036a24d
Compare
Ugh, type inference issue on 1.3 and I can't test on 1.3 locally 😑 |
a = d.lower === nothing || d.lower <= 0 ? nothing : log(T(d.lower)) | ||
b = d.upper === nothing || isinf(d.upper) ? nothing : log(T(d.upper)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests on Julia 1.3 pass locally when changing this to
a = d.lower === nothing || d.lower <= 0 ? nothing : log(T(d.lower)) | |
b = d.upper === nothing || isinf(d.upper) ? nothing : log(T(d.upper)) | |
a = d.lower === nothing ? nothing : log(T(max(d.lower, 0))) | |
b = d.upper === nothing ? nothing : log(T(d.upper)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That isn't functionally equivalent though; IIRC, I was relying on a
and/or b
being nothing
in those cases so that truncated
would handle it a particular way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But arguably this optimization (d.lower === nothing
or d.upper === nothing
) should already have happened, either by a user or internally, when constructing the d = truncated(LogNormal(...))
? Maybe one shouldn't expect that unoptimized inputs lead to optimized algorithms. AFAICT the optimization of truncated(Normal(...), ...)
is also only exploited in the case where this returns a Normal
(a = b = nothing
); the code for Truncated{<:Normal}
does not seem to use the fact that a bound might be nothing
. In the Normal
case, arguably the LogNormal
shouldn't be truncated in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently sick and can only barely comprehend that message but if you'd prefer to go with your suggested change then feel free to apply it, I trust your judgement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did we have this conversation a few months ago? My brain similarly can't comprehend whether this is the same discussion: #1874 (comment)
I should probably just log off and go sleep
This PR adds
mgf
for truncated normal and uses that to implementmean
,var
,skewness
, andkurtosis
for truncated log normal based on this observation.median
is also implemented for truncated log normal.Fixes #709