
DiscreteDistribution point masses to be a private attribute #623

Closed
sbenthall opened this issue Apr 14, 2020 · 8 comments

@sbenthall
Contributor

X is not a good attribute name.

It would be better to use something that reflects the mathematical concept this attribute represents.

@sbenthall
Contributor Author

This field is currently named X because that was the parameter name in the old function from which this code was derived.

I think improving this may involve untangling and revising more of the HARK code.

For comparison, it looks like neither Mathematica nor PyMC3 exposes the point values of a discrete distribution. It's possible that with better-designed code, exposing them is not so necessary:
https://reference.wolfram.com/language/ref/DiscreteUniformDistribution.html
https://docs.pymc.io/api/distributions/discrete.html

If I'm reading things right, in the original HARK design, combineIndepDstns was used to bundle the transitory and permanent income shocks into one list of arrays, presumably for convenience.

The design here is a bit baffling. In ConsIndShockModel, which I suppose is paradigmatic, the IncomeDstn contains all the information about the transitory and permanent income shocks, even though these are independent of each other. Yet the method also returns PermShkDstn and TransShkDstn separately. From the looks of it, this is entirely redundant and makes the code more complex.
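For illustration, here is a minimal standalone sketch of what that bundling amounts to; this is not HARK's actual implementation, and the function name is only suggestive of combineIndepDstns:

import numpy as np

def combine_indep_dstns(pmf_a, X_a, pmf_b, X_b):
    # Joint probability of each (a, b) pair is the product of the marginals
    joint_pmf = np.outer(pmf_a, pmf_b).flatten()
    # The joint support is stored as a list of arrays, one per variable
    A, B = np.meshgrid(X_a, X_b, indexing="ij")
    return joint_pmf, [A.flatten(), B.flatten()]

# Two independent two-point shocks
pmf, X = combine_indep_dstns(np.array([0.5, 0.5]), np.array([0.9, 1.1]),
                             np.array([0.25, 0.75]), np.array([0.0, 1.0]))
# pmf and each of X[0], X[1] have 4 entries: the marginal information is
# carried redundantly inside the joint object.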

I'd love to simplify this into a cleaner design. Given the comments on #538, it seems like this is an area that hasn't been given much thought since it was originally written, and a refactor is welcome in principle.

It doesn't look like it's possible to do this cleanly with a dedicated, generic distribution object, such as Mathematica's LogMultinormalDistribution:
https://reference.wolfram.com/language/ref/LogMultinormalDistribution.html

This would be inappropriate for two reasons:

  • the point of a LogMultinormal distribution is to deal with correlations between the multivariate outputs; here the distributions are independent by construction
  • the underlying shock distributions are not truly LogNormal because they have an additional point value added for unemployment

Later, when the shock distributions are actually sampled, it looks like only the IncomeDstn is used, so it's not clear why the PermShkDstn and TransShkDstn objects were created and stored separately in the first place:

IncomeDstnNow = self.IncomeDstn[t - 1]  # set current income distribution
PermGroFacNow = self.PermGroFac[t - 1]  # and permanent growth factor
# Get random draws of income shocks from the discrete distribution
EventDraws = IncomeDstnNow.draw_events(N, seed=self.RNG.randint(0, 2 ** 31 - 1))
# X[0] holds the permanent-shock values, X[1] the transitory-shock values
PermShkNow[these] = IncomeDstnNow.X[0][EventDraws] * PermGroFacNow  # permanent "shock" includes expected growth
TranShkNow[these] = IncomeDstnNow.X[1][EventDraws]

Long story short: the X attribute here seems mainly to be an artifact of poor code style. There's no reason for IncomeDstn to be stored as a multivariate distribution, because its component distributions are independent. We can simplify the Distribution code by restricting the interface to creating the distribution and sampling from it.
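Concretely, a restricted interface might look something like the following sketch; the class and method names are hypothetical, not HARK's API:

import numpy as np

class DiscreteDstn:
    # Discrete distribution with private point masses and a public
    # sampling interface (hypothetical sketch).
    def __init__(self, pmf, points, seed=0):
        self._pmf = np.asarray(pmf)        # probabilities (private)
        self._points = np.asarray(points)  # point values (private)
        self._rng = np.random.default_rng(seed)

    def draw(self, n):
        # Sample n values; callers never touch the point masses directly
        idx = self._rng.choice(len(self._pmf), size=n, p=self._pmf)
        return self._points[idx]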

This change will ripple through the codebase and to downstream code, but seems desirable.

@sbenthall
Contributor Author

Notes from today's meeting:

  • The idea is to have flexibility for a multivariate distribution where the variables are not independent. This pattern is used elsewhere in HARK.
  • Ideally, there should be a multivariate distribution object from which it is possible to extract the marginal distributions, with various options for methods of discretization and approximation (a sketch follows below).
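A sketch of what that object could look like; again, the names are hypothetical and this is not existing HARK code:

import numpy as np

class MultivariateDiscreteDstn:
    def __init__(self, pmf, points):
        self._pmf = np.asarray(pmf)                     # joint probabilities
        self._points = [np.asarray(p) for p in points]  # one array per variable

    def marginal(self, i):
        # Collapse the joint pmf onto variable i's distinct point values
        values, inverse = np.unique(self._points[i], return_inverse=True)
        pmf_i = np.bincount(inverse, weights=self._pmf)
        return pmf_i, values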

@sbenthall
Contributor Author

#730

@sbenthall
Contributor Author

Reopening this. Dolo IID Process integration is now slated for a later milestone (#611).
This can be solved easily by the 1.0 release, and it is directly implicated in the now-manageable solution code simplifications (#625).

@sbenthall sbenthall reopened this Jan 13, 2021
@sbenthall sbenthall self-assigned this Jan 13, 2021
@sbenthall
Contributor Author

This issue is awkwardly named; I see now that this simple "name change" implicates several other issues, including #120 and #625.

It may require further review to determine the right short-term solution.

One option: with calcExpectations (#625) in place, there should be no reason for the point masses to be a public attribute.
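For instance, an expectations helper along these lines would let client code compute moments without ever reading the point masses; this is only a sketch, and the actual calcExpectations signature in #625 may differ:

import numpy as np

def calc_expectation(dstn, func=lambda x: x):
    # Expectation of func over a discrete distribution, computed inside
    # the distribution module so the point masses stay encapsulated
    return np.dot(dstn._pmf, func(dstn._points))

# mean = calc_expectation(dstn)
# second_moment = calc_expectation(dstn, lambda x: x ** 2)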

@sbenthall sbenthall changed the title DiscreteDistribution X is not a good name for the attribute DiscreteDistribution point masses to be a private attribute Jan 19, 2021
@llorracc
Collaborator

The way to think about this is to treat the discretized distributions generically as approximations to idealized continuous distributions. So, for example, we might want to say that the setup is one in which the "truth" is that the permanent shocks to idiosyncratic permanent income and to aggregate stock returns are jointly lognormally distributed with a covariance coefficient of 0.1. But in practice any numerical solution will have to approximate that with some particular discretization of the relationship (a matrix of points and probabilities).

Among competing discretizations, we want an unambiguous standard for "which one is closer to being right," in case the models differ in some important prediction. The unambiguous answer should be: whichever one is closer to the answer you get as you increase the fidelity of the discrete approximation to the continuous assumption. A 1000x1000 approximation is going to give a closer approximation than a 3x3 approximation, so whichever discretization gives an answer closer to the more accurate approximation is right.
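As a concrete illustration of the fidelity point (a standalone sketch, not HARK code): an equiprobable N-point discretization of a lognormal recovers its mean more accurately as N grows.

import numpy as np
from scipy.stats import norm

def discretize_lognormal(mu, sigma, n):
    # Equiprobable n-point approximation: one point per probability bin,
    # placed at the midpoint quantile of each bin
    quantiles = (np.arange(n) + 0.5) / n
    points = np.exp(norm.ppf(quantiles, loc=mu, scale=sigma))
    return np.full(n, 1.0 / n), points

for n in (3, 10, 1000):
    pmf, pts = discretize_lognormal(0.0, 0.1, n)
    print(n, np.dot(pmf, pts))  # approaches exp(0.1**2 / 2) ≈ 1.00501 as n grows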

So, it is not worthwhile to devote any effort to trying to efficiently meld the unemployment shock with the transitory shock. The interesting questions will all have to do with either (a) correlations between the transitory and permanent shocks; or (b) correlations between the unemployment shock and either transitory or permanent shocks (conditional on not becoming unemployed).

PS. I'd really, really like to get into HARK 1.0 the default of our distributions being ones with point masses at the min and max values of some standard deviation, per the discussion elsewhere. Matt and I strongly agree on this subject, for deep and powerful reasons. We both wish I'd done it a long time ago (like, 20 years ago). It's clearly the right way to do things.

@sbenthall
Contributor Author

> PS. I'd really, really like to get into HARK 1.0 the default of our distributions being ones with point masses at the min and max values of some standard deviation, per the discussion elsewhere. Matt and I strongly agree on this subject, for deep and powerful reasons. We both wish I'd done it a long time ago (like, 20 years ago). It's clearly the right way to do things.

Is there an existing issue for this? If not, could you make one describing what you mean?

@sbenthall
Contributor Author

> The way to think about this is to treat the discretized distributions generically as approximations to idealized continuous distributions.

I suppose this will be fixed as part of #640 then.

Looks like this ticket can be closed, as its scope is covered in more detail in other issues.
