More general grids for decision rules #188

Open
albop opened this issue Mar 3, 2020 · 22 comments

Comments

@albop
Member

albop commented Mar 3, 2020

This issue stems from two discussions:

  • with @llorracc-git about the need to have d.r. as a function of continuous and/or discrete exogenous states and continuous and/or discrete endogenous states
  • with @sbenthall who was arguing that the distinction between endogenous and exogenous grid is somewhat artificial.
    It is also related to the new implementation of d.r. in dolark which I just moved to dolo.

The long-term proposal would be to define a d.r. object as a function of a single grid with an exogenous and an endogenous component. The basic case would be a Cartesian product of two grids and would correspond to what we currently do, but it could then be extended to non-Cartesian products (for instance, ranges of endogenous variables as a function of exogenous values) or other products. This is unproblematic, I think.
To accommodate discrete vs. continuous variables, the idea would be to introduce a 'locator' function that would be defined by the grid and that we would treat in the same way whether it's discrete or not. That part is problematic and needs to be thought of in conjunction with potential discrete choices to be dealt with in the future.
This is a long-term issue. In the short term, my proposal would be to make sure we use endo_grid and exo_grid everywhere instead of grid, which will eventually denote the product of them. One question is what to do with the grid option in the yaml file: it refers to the endogenous grid. It's not a big deal since it's not an actual section, like domain.
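
A minimal sketch of the two grid shapes discussed above, for illustration only. The class names `CartesianProductGrid` and `DependentRangeGrid` and the method `endo_at` are hypothetical and are not part of dolo; the only assumption is that a grid can report its endogenous nodes for a given exogenous node.

```python
import numpy as np


class CartesianProductGrid:
    """Hypothetical: every exogenous node shares the same endogenous nodes."""

    def __init__(self, exo_nodes, endo_nodes):
        self.exo_nodes = np.asarray(exo_nodes, dtype=float)
        self.endo_nodes = np.asarray(endo_nodes, dtype=float)

    def endo_at(self, i):
        # Endogenous nodes attached to the i-th exogenous node.
        return self.endo_nodes

    def nodes(self):
        # All (exogenous, endogenous) pairs of the combined grid.
        return [(m, s) for i, m in enumerate(self.exo_nodes) for s in self.endo_at(i)]


class DependentRangeGrid(CartesianProductGrid):
    """Hypothetical: the endogenous range depends on the exogenous value."""

    def __init__(self, exo_nodes, bounds, n):
        self.exo_nodes = np.asarray(exo_nodes, dtype=float)
        self.bounds = bounds  # callable: exogenous value -> (low, high)
        self.n = n

    def endo_at(self, i):
        low, high = self.bounds(self.exo_nodes[i])
        return np.linspace(low, high, self.n)


cart = CartesianProductGrid([0.9, 1.1], [0.0, 1.0, 2.0])
dep = DependentRangeGrid([0.9, 1.1], lambda m: (0.0, 2.0 * m), n=3)
```

Both objects expose the same `nodes()` view, so code that only iterates over the combined grid would not need to know which product structure is underneath.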

@llorracc

llorracc commented Mar 9, 2020

@sbenthall is right about the same variable being a state vs a control during different stages. For example, we solve the joint saving-and-portfolio-share model by constructing a grid of possible values of assets with which you might end the period, calculating the optimal portfolio share for a person ending the period with that $a$, and then constructing marginal utility conditional on optimal portfolio choice, which yields optimized marginal value for ending the period with that $a$. From the perspective of the beginning of the period, however, the consumer's problem can be written as choosing the optimal $a$. So, it's a control at the first stage of the problem within the period, and a state at the next stage.
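
A rough sketch of the portfolio-share stage described above, under strong simplifying assumptions that are not in the original comment: the continuation value is replaced by CRRA utility of next-period wealth, the risky return takes finitely many values, and the share is found by brute-force search. The function name `optimal_share` and all parameter values are hypothetical; this is not the HARK implementation.

```python
import numpy as np


def optimal_share(a, risky_returns, probs, risk_free=1.02, gamma=2.0,
                  income=1.0, n_shares=101):
    """Best risky share for someone ending the period with assets a (toy version)."""
    shares = np.linspace(0.0, 1.0, n_shares)
    risky_returns = np.asarray(risky_returns, dtype=float)
    probs = np.asarray(probs, dtype=float)
    # Portfolio return for each candidate share (rows) and return state (columns).
    portfolio_return = risk_free + shares[:, None] * (risky_returns[None, :] - risk_free)
    wealth_next = a * portfolio_return + income
    # Assumption: CRRA utility of next-period wealth stands in for the value function.
    utility = wealth_next ** (1.0 - gamma) / (1.0 - gamma)
    expected = utility @ probs
    return shares[np.argmax(expected)]


# Here $a$ is a state: the share is computed on a grid of end-of-period assets.
a_grid = np.linspace(0.5, 10.0, 5)
share_on_grid = np.array([optimal_share(a, [0.9, 1.2], [0.5, 0.5]) for a in a_grid])
```

The earlier consumption stage would then treat $a$ as its control and use only the optimized (marginal) value implied by `share_on_grid`.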

At present in HARK we solve this problem with hand-crafted tools. In the portfolio share tool, $a$ is hard-wired as a state; but in the consumption problem (which knows nothing about the portfolio choice tool except its output, $v'(a)$), it is a control.

So, your proposal is to find all of the variables that are either states or controls at any stage of the problem and make a grid that incorporates all of them.

I'm trying to understand what you mean by a "locator" variable by thinking of an example. So, if one of the exogenous variables was your employment state, we might want to retrieve an object that would tell us the range of values of consumption that are feasible for unemployed people, and a different range for employed people? This would be an intermediate product in the multistage problem, and could be fed back to an earlier stage that would conduct its search within the appropriate range of values contingent on state?

On the last point, about using endo_grid and exo_grid (I'd really prefer exog_grid so they are the same length) everywhere, do you mean that we would allow for a different endo_grid and exog_grid at each "stage" of the solution? Otherwise I don't see how this deals with the problem that at some stages a variable might be a state and at other stages a control.

@albop
Member Author

albop commented Mar 9, 2020 via email

@sbenthall
Contributor

The way I think about it is this:

  • The dependency relationships between, for each time step, the state variables, the control variables, and the exogenous shocks variables can be arranged as a directed acyclic graph. What we call an "exogenous shock" is a random variable with no parents in this graph.
  • The graphical structure can be used, in a general way, to determine the most efficient solution to the problem. This is essentially backwards induction, but applied to a DAG.
  • It is better to determine the programming interface to the decision rule separately from how it is implemented.
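
As a sketch of the DAG idea in the list above, the dependency structure of one period can be written as a mapping from each variable to its parents and ordered with the standard-library `graphlib` module (Python 3.9+). The variable names and the particular graph are assumptions chosen to resemble a portfolio-choice period; they are not taken from any existing model file.

```python
from graphlib import TopologicalSorter

# Hypothetical within-period graph: each variable maps to the list of its parents.
parents = {
    "eta": [],                        # exogenous shock: no parents
    "m": [],                          # resources inherited from the previous period
    "c": ["m"],                       # control: consumption
    "a": ["m", "c"],                  # end-of-period assets
    "alpha": ["a"],                   # control: risky share
    "m_next": ["a", "alpha", "eta"],  # next-period resources
}

# Parents always come before children: an order in which the period can be simulated.
forward_order = list(TopologicalSorter(parents).static_order())

# Within a single period, parentless nodes are the shocks and the inherited state.
roots = [name for name, deps in parents.items() if not deps]

# Backward induction visits the variables in the reverse order.
backward_order = forward_order[::-1]
```

In this toy graph `alpha` does not have `eta` as a parent, which encodes the assumption that the share is chosen before the return is known.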

@llorracc

llorracc commented Mar 9, 2020

> It is better to determine the programming interface to the decision rule separately from how it is implemented.

I don't think the programming interface can be separated from careful thinking about implementation, because it is not useful to have a programming interface that you later realize is impossible (or unnecessarily difficult) to implement, when "if only we had thought about implementation, we could have achieved the goal just as well with a different syntax, but one that can be implemented straightforwardly." But "careful thinking" about implementation is not the same as implementation itself. These discussions seem to me to be at exactly the right level: we are thinking about fundamental questions about how the models will be represented, and then thinking about the grammar to efficiently capture that.

My idea is that, as we work on a grammar for these kinds of models, we should be doing "mock implementations" of various models in HARK that cannot run yet, but where writing the models in the grammar tests whether the grammar is actually capable of expressing all the things we need it to.

@sbenthall
Contributor

A perhaps complementary approach would be to create some examples of ideal input/output pairs for the system you are trying to build, bracketing off the question of implementation.

This would allow you to express concrete, difficult requirements you might have for the software.

@sbenthall
Contributor

Just to clarify what I mean by the implicit DAG structure in the dependency relations between the model variables, I've added graphical representations to this notebook where I'm representing consumption problems in MDP form:

https://github.com/sbenthall/sketches/blob/master/economics/PortfolioConsumptionMath.ipynb

Squares are control variables. The diamond is the 'reward' for each period. I'm using 'prime' notation rather than subscripting with 't+1'.

I believe that:

  • If this dependency structure were explicit in the code, it would be possible to write a universal solver that captures many of the problem-specific insights of Carroll and others that benefit from decomposing a timestep into multiple stages.
  • I think this also helps clarify the problem definition.

For example, note the section in that notebook on Portfolio Choice.

In it, there is an exogenous shock eta representing the period's return on the risky asset.

An important question for the model, which may not be obvious in a different representation, is whether the consumer knows eta at the time when they pick their investment ratio alpha. It seems that in the spirit of the model, they must not (otherwise, it's not really a risky asset).

@albop
Member Author

albop commented Mar 10, 2020

I'm afraid this thread has drifted a bit away from the semi-simple idea I was trying to describe. There are two conceptually distinct problems:

  • how to represent concretely a solved decision rule.
    • This is essentially a function of a given space. In the current paradigm, the representation is made of 1/ a discrete set of points from a multidimensional space (aka grid), and 2/ the values on these points.
    • For algorithmic considerations, the grid is currently a Cartesian product of two grids, one of them representing a completely exogenous process and the other representing states. There has been some ongoing work to allow for more generality in the grids that are accepted (like irregular grids).
      • In particular, to implement the endogenous grid point method in 1d, the current objects are sufficient: in each iteration, define a new d.r. with a new grid on a. (I guess one could implement a resampling function to keep the same object, but that is another matter.)
    • This proposal is about allowing for more generality by allowing for more complex structures with possibly non-cartesian products or more than two grids. It doesn't say anything about the solution process or the way these grids are defined.
  • how to solve for a decision rule :
    • In a given iterative algorithmic problem there might be a way to decompose each step (each update at time t) into a series of successive choices (more generally a DAG). This could be by design (think seasonal component) or derive from a model property (like the post-state for wealth in the portfolio example).
    • There are several approaches here to devise this solution strategy
      • analyze a graph of variable incidences in the equations to construct a solution strategy (possibly in the way the calibration block is solved recursively). We must probably exclude any equation rewrite (would probably lead to an NP problem)
      • provide some hints to the solver (say: solve for a in equation x, then use the result to solve for b in equation y).
      • change the language so that the timing of choices is part of the model (this also has the drawback of making it harder to compare solution methods).
    • Once the solution strategy is devised one still needs to adapt the solution algorithm intelligently.
      The closest I've come to the second issue was to think about ways to decompose the exogenous process (in the simplest case: solve last date then backwards).

@sbenthall
Contributor

I see.

What I am saying is that the interface to a concretely solved decision rule could be more general even than its implementation, which uses grids.

That interface could be selected based on what traffics well between well-defined use cases.

In any case, I believe:
(a) the solver (of any of the types listed) must output a decision rule
(b) the decision rule and the original model can be used to simulate the system

At this level of abstraction, I am not seeing any constraint specifying that the decision rule needs to be implemented as a set of grids, cartesian or not. But maybe I'm missing something.
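
To make points (a) and (b) concrete, here is a minimal sketch in which the simulator only requires the decision rule to be callable. The names `simulate`, `rule_closed_form` and `rule_on_grid`, and the toy savings model itself, are assumptions made for illustration.

```python
import numpy as np


def simulate(transition, decision_rule, s0, shocks):
    """Iterate s' = transition(s, x, e) with x = decision_rule(s)."""
    states, controls = [s0], []
    for e in shocks:
        x = decision_rule(states[-1])
        controls.append(x)
        states.append(transition(states[-1], x, e))
    return np.array(states), np.array(controls)


# Toy model (assumption): resources m, consumption c, gross return 1.02, income e.
def transition(m, c, e):
    return 1.02 * (m - c) + e


# The same rule, once in closed form and once as linear interpolation on a grid.
def rule_closed_form(m):
    return 0.5 * m


grid = np.linspace(0.0, 10.0, 11)


def rule_on_grid(m):
    return np.interp(m, grid, 0.5 * grid)


shocks = np.ones(5)
path_a, _ = simulate(transition, rule_closed_form, 2.0, shocks)
path_b, _ = simulate(transition, rule_on_grid, 2.0, shocks)
```

Because `simulate` never inspects how the rule is stored, the grid-based and closed-form versions are interchangeable at this level.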

@albop
Member Author

albop commented Mar 10, 2020

Sticking to the decision rule / grid question, I've got two more comments:

  • maybe a domain object should be attached to the d.r., distinct from the grid. This would fit cases where one regresses a d.r. instead of fitting it on a grid. In practical terms it wouldn't be immediately useful right now (some d.r. objects can do .fit_values instead of .set_values, and the Julia version actually has a concept of random grid), but it would bring some mathematical closure. Another way to look at it is that currently the domain is essentially R^n. Changing it would probably require an update to the dolo model specification, so it would be post 0.5.
  • about the "locator" I mentioned, current decision rules support some form of it. If you do dr(x, y), where x (resp. y) is a continuous value for the exogenous (resp. endogenous) state, you will trigger a smooth interpolation method for the two arguments. On the other hand, if you do dr(i, y), where i is an integer, then you will be using the i-th value of the exogenous grid and interpolate only w.r.t. y.
    • Sometimes both of them make sense (like for a Cartesian exogenous grid), sometimes only one of them is well defined (currently). This is a joint property of the grids and the locator. In the new proposal, both would be treated symmetrically, so that dr(i,j) would be allowed if the natural representation of the state is discrete. But if dr is defined on a product of Smolyak states, then dr(i,j) doesn't make sense, although we can define dr(i).
    • When you have a product of more than two grids, you could have dr(x,y,z) or dr(x,i,y) with a predictable behaviour based on type dispatch.
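
The locator behaviour described in the list above can be sketched with a toy rule on a Cartesian product of two one-dimensional grids. The class `ToyRule` is hypothetical and is not dolo's decision rule; since Python has no built-in multiple dispatch, the sketch emulates type dispatch with `isinstance` checks and uses plain linear interpolation.

```python
import numpy as np


class ToyRule:
    """Hypothetical d.r. on a product of 1-d exogenous and endogenous grids."""

    def __init__(self, exo_grid, endo_grid, values):
        self.exo_grid = np.asarray(exo_grid, dtype=float)
        self.endo_grid = np.asarray(endo_grid, dtype=float)
        self.values = np.asarray(values, dtype=float)  # shape (n_exo, n_endo)

    def _along_endo(self, row, y):
        # An integer is read as a node index, anything else as a continuous value.
        if isinstance(y, (int, np.integer)):
            return row[y]
        return np.interp(y, self.endo_grid, row)

    def __call__(self, x, y):
        if isinstance(x, (int, np.integer)):
            # Discrete locator: use the x-th exogenous node, no interpolation in x.
            return self._along_endo(self.values[x], y)
        # Continuous locator: interpolate along the exogenous dimension first.
        row = np.array([np.interp(x, self.exo_grid, col) for col in self.values.T])
        return self._along_endo(row, y)


dr = ToyRule([-1.0, 1.0], [0.0, 1.0, 2.0], [[0, 1, 2], [10, 11, 12]])
at_node = dr(0, 0.5)       # i-th exogenous node, interpolate only w.r.t. y
at_value = dr(0.0, 0.5)    # interpolate w.r.t. both arguments
at_indices = dr(1, 2)      # both arguments read as node indices
```

Note that `dr(0, 0.5)` and `dr(0.0, 0.5)` differ: the integer selects the first exogenous node, while the float is a value lying between the two nodes.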

@sbenthall
Contributor

What you are describing:

  • an object
  • perhaps with a domain
  • with a grid interpolation-based method for resolving "inputs" into an "output"
  • with methods for inspecting the grid values used in the interpolation

sounds like a good way to implement a mathematical function that is an interpolation over multiple dimensions of grid points.

I think it would make the most sense to implement that functionality in the most general possible way, without complicating it with the specific semantics of being "a decision rule", or having its inputs be either endogenous or exogenous.

This implementation of a function could then be used to implement a "decision rule", which is an object with additional semantics: it is connected to a control variable in a model; its inputs are some subset of the other variables in the model; it can be a variable in the model which other variables depend on; it is the output of a solver.

@albop
Member Author

albop commented Mar 10, 2020

Oh, I completely agree with the comment that the interface to a concretely solved decision rule could be more general even than its implementation, which uses grids. Our comments have crossed: I discussed above d.r. without grids.
There is one more element missing: discrete choices. And maybe mixed strategies. But I'm leaving them for later since they are not included in the current dolo model specification. On the other hand, more general grids could be added incrementally.

@sbenthall
Contributor

I wonder if this interpolated-function object would ever be used in a situation where it is not a control variable.

For example, suppose one wanted to fit an exogenous distribution over an irregular empirical data set.

@albop
Member Author

albop commented Mar 10, 2020

Well, by definition, a decision rule is a policy function is a function (is a rose is a rose), so I'm not going to argue. Renaming it Function with class DecisionRule(Function): pass wouldn't bother me at all.
We're using a d.r. object for value function for instance. (Yes, parametrizing a distribution would be a good example too.)
There is indeed one element of semantics I would like to add to grids and decision rules which would be variable naming. So you could know by inspecting the d.r. object that its output is c. This is still very light metadata, fairly neutral in terms of computational cost, but would probably be quite useful. The spirit would be similar to R coding, where you carry variable names along (or xarray).

@sbenthall
Contributor

> We're using a d.r. object for value function for instance.

!!!! This reads like a mistake to me. Maybe I should make an issue for it.

> I would like to add to grids and decision rules which would be variable naming.

That does sound very nice.

I think there's a subtle question about when and how Python names and the names of model variables are used.

I imagine R and Julia have good solutions to this?

One thing I do not like about HARK is that all model variables are hard-coded into the objects. It is one thing that is going to make integration with Dolo difficult. I am trying to figure out a way to gracefully move HARK away from that.

@albop
Member Author

albop commented Mar 10, 2020

Actually, after reflecting on it for a while, I'm 65% sold on the medium-run usefulness of a separate "function" library (unless it is exactly the same as the interpolation library). At any rate, there are significant design issues to be solved, so I don't want to address it right now. (Even ignoring the fact that economists will probably want to keep special jargon like states/controls instead of input/output.)
In the short run, I'm pretty keen on renaming the class Function, unless there is a better name.

@albop
Member Author

albop commented Mar 10, 2020

I don't get the subtle question about variable naming. Nothing would be hardcoded.
To define a d.r. (fun.) object you would just do 'DecisionRule(grid_exo, grid_endo, names=controls)', where 'controls' would be a tuple of strings (for instance the same ones defined in 'symbols: controls'). Names of the inputs could also be specified, although they would be implicit if they are attached to the grids.
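
A sketch of what such a constructor could look like, assuming one-dimensional grids, linear interpolation along the endogenous grid, and evaluation at an exogenous node index. Only the call shape `DecisionRule(grid_exo, grid_endo, names=controls)` and the `Function` / `DecisionRule` naming come from this thread; the method bodies and the `set_values` signature are hypothetical and are not dolo's implementation.

```python
import numpy as np


class Function:
    """Hypothetical container: named outputs stored on an exo x endo grid."""

    def __init__(self, grid_exo, grid_endo, names=()):
        self.grid_exo = np.asarray(grid_exo, dtype=float)
        self.grid_endo = np.asarray(grid_endo, dtype=float)
        self.names = tuple(names)  # light metadata: one name per output
        shape = (len(self.grid_exo), len(self.grid_endo), len(self.names))
        self.values = np.zeros(shape)

    def set_values(self, values):
        values = np.asarray(values, dtype=float)
        if values.shape != self.values.shape:
            raise ValueError(f"expected shape {self.values.shape}, got {values.shape}")
        self.values = values

    def __call__(self, i, s):
        """Evaluate at exogenous node i and endogenous value s, keyed by name."""
        return {
            name: float(np.interp(s, self.grid_endo, self.values[i, :, k]))
            for k, name in enumerate(self.names)
        }


class DecisionRule(Function):
    pass


controls = ("c",)
dr = DecisionRule([0.9, 1.1], [0.0, 1.0, 2.0], names=controls)
dr.set_values([[[0.0], [0.5], [1.0]], [[0.0], [0.6], [1.2]]])
result = dr(1, 1.5)
```

Inspecting `dr.names` is enough to learn that the output is `c`, and nothing about the variable names is hard-coded in the class.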

@sbenthall
Contributor

I'm very glad that we're converging on design ideas!

I'm lukewarm on "Function" as the name of a class, because...well, it's so overloaded in programming anyway.

I see what you mean about having a 'names' section--yes, that's sensible.

I spent the afternoon coding this up as a mockup 'design' that sketches out a bit what I've been thinking. Maybe it will clarify some of what I've been getting at?

https://github.com/sbenthall/sketches/blob/master/economics/MDP%20Interfaces.ipynb

@albop
Member Author

albop commented Mar 10, 2020 via email

@llorracc

llorracc commented Mar 10, 2020 via email

@sbenthall
Contributor

I'm afraid we may be speaking past each other.
It was truly not clear what I was trying to communicate with that code, and that is my fault.

For the purpose of this discussion, I was trying to demonstrate that the representation of a mathematical function (done simply with lambdas, in that case) can be done separately from the representation of the semantics of the function's relationship to a variable (i.e., as a transition function for a state, as a decision rule as applied to a choice variable, as a value function, etc.)
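
The separation can be sketched as follows. This is not the code from the linked notebook; the `Role` class and its fields are hypothetical names. The mathematical content is a bare lambda, and the wrapper only records how that function relates to model variables.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Role:
    """Hypothetical wrapper: semantics around a plain mathematical function."""
    kind: str                 # e.g. "decision_rule", "transition", "value"
    output: str               # name of the variable this function determines
    inputs: Tuple[str, ...]   # names of the variables it depends on
    func: Callable[..., float]

    def __call__(self, **variables):
        # Pick the needed inputs by name; other variables are ignored.
        return self.func(*(variables[name] for name in self.inputs))


consume = Role("decision_rule", "c", ("m",), lambda m: 0.5 * m)
save = Role("transition", "a", ("m", "c"), lambda m, c: m - c)

env = {"m": 2.0}
env[consume.output] = consume(**env)
env[save.output] = save(**env)
```

Swapping the lambda for an interpolated function on a grid would change `func` but none of the surrounding semantics.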

@sbenthall
Contributor

sbenthall commented Mar 11, 2020

One more thought on this topic....

A discussion came up the other day in HARK about the representation of exogenous shock distributions. I see now that this was the conversation that was cut short by networking issues...
econ-ark/HARK#519

While considering class structures and interoperability of different implementations, I wonder if it's worth considering if exogenous distributions might be represented in ways besides a grid.

If this is all quite peripheral or a distraction, I apologize; of course, I'm not as embedded in the core roadmap of the Dolo project.

@albop
Member Author

albop commented Mar 11, 2020

This is actually part of the design. When you put an AR1 in a model, model.exogenous returns an AR1 object, that represents the actual stochastic process. Then model.exogenous.discretize() gives you a discretized object, which includes a grid and integration weights.
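
The split between a process object and its discretized counterpart can be illustrated with a toy version. The classes `ToyAR1` and `ToyDiscretizedAR1` are hypothetical and do not reproduce dolo's `AR1` or its `discretize()` output; the sketch assumes Gaussian innovations, an evenly spaced grid, and Gauss-Hermite quadrature for the integration weights.

```python
import numpy as np


class ToyDiscretizedAR1:
    """Hypothetical discretized process: a grid plus integration nodes and weights."""

    def __init__(self, rho, grid, eps, weights):
        self.rho, self.grid, self.eps, self.weights = rho, grid, eps, weights

    def expect(self, f, i):
        """Approximate E[f(m') | m = grid[i]] for m' = rho * m + eps."""
        m_next = self.rho * self.grid[i] + self.eps
        return float(np.sum(self.weights * f(m_next)))


class ToyAR1:
    """Hypothetical process m' = rho * m + eps with eps ~ N(0, sigma^2)."""

    def __init__(self, rho, sigma):
        self.rho, self.sigma = rho, sigma

    def discretize(self, n_grid=5, n_int=5, width=3.0):
        # Grid spanning +/- width unconditional standard deviations.
        std = self.sigma / np.sqrt(1.0 - self.rho ** 2)
        grid = np.linspace(-width * std, width * std, n_grid)
        # Gauss-Hermite nodes and weights, rescaled for a N(0, sigma^2) innovation.
        x, w = np.polynomial.hermite.hermgauss(n_int)
        eps = np.sqrt(2.0) * self.sigma * x
        weights = w / np.sqrt(np.pi)
        return ToyDiscretizedAR1(self.rho, grid, eps, weights)


process = ToyAR1(rho=0.9, sigma=0.1)
discrete = process.discretize()
mean_next = discrete.expect(lambda m: m, 0)
```

The continuous object keeps the parameters of the stochastic process, while everything grid-related lives in the object returned by `discretize()`.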
