Parameter tuning for RDD / lm_forest #1259
Hi @corydeburd, that MSE is a reasonable object to consider. Another approach is to treat the RDD forest purely as a data-driven algorithm for finding heterogeneous subgroups, i.e., split the data into training/evaluation sets.
Doing many runs of 1) with different tuning parameters is fair in the sense that inference should still be valid for the RDD coefficients you estimate in 2), since it is a tuned algorithm that discovers the subgroups on a held-out data set.
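A minimal sketch of one way to read this train/evaluation procedure, not necessarily the exact steps intended above. It assumes that step 1) fits `lm_forest()` on the training half and step 2) estimates a weighted local linear RD within subgroups of the evaluation half. The simulated data follows the RDD example in the `lm_forest` documentation; the fixed `bandwidth`, the median split into two groups, and the `min.node.size` value are illustrative assumptions.

```r
library(grf)

set.seed(1)
n <- 4000
p <- 5
X <- matrix(rnorm(n * p), n, p)
Z <- runif(n, -4, 4)              # running variable
cutoff <- 0
W <- as.numeric(Z >= cutoff)      # above-cutoff indicator
tau <- pmax(0.5 * X[, 1], 0)      # simulated heterogeneous jump
Y <- tau * W + 1 / (1 + exp(2 * Z)) + 0.2 * rnorm(n)

# Triangular kernel weights with a hand-picked bandwidth (an assumption;
# a data-driven bandwidth could be used instead).
bandwidth <- 2
dist <- abs((Z - cutoff) / bandwidth)
sample.weights <- (1 - dist) * (dist <= 1) / bandwidth

# Keep observations inside the bandwidth, then split them 50/50.
inside <- which(sample.weights > 0)
train <- sample(inside, floor(length(inside) / 2))
test <- setdiff(inside, train)

# Step 1 (assumed): fit the forest on the training half only. Tuning
# parameters such as min.node.size can be varied freely at this stage.
lmf <- lm_forest(X[train, ], Y[train], cbind(W, Z)[train, ],
                 sample.weights = sample.weights[train],
                 gradient.weights = c(1, 0),
                 min.node.size = 20)

# Predicted RD jump (the coefficient on W) for the held-out units.
tau.hat.test <- predict(lmf, X[test, ])$predictions[, 1, ]

# Step 2 (assumed): form subgroups from the predictions and estimate a
# weighted local linear RD within each group using held-out data only.
group <- ifelse(tau.hat.test > median(tau.hat.test), "high", "low")
rd.by.group <- sapply(c("low", "high"), function(g) {
  idx <- test[group == g]
  fit <- lm(Y[idx] ~ W[idx] + Z[idx], weights = sample.weights[idx])
  # Row 2 is the W coefficient; these are plain OLS standard errors,
  # heteroskedasticity-robust ones would be preferable in practice.
  coef(summary(fit))[2, 1:2]
})
print(rd.by.group)
```

Because the subgroups are defined using only the training half, re-running step 1 with other tuning parameters does not touch the observations used for the estimates in `rd.by.group`.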
Thanks, this is a great suggestion. Actually, I had adopted something like this approach, so it's good to know it comes recommended! Holding out is very important, as I think it's very easy to overfit in my situation (it's an RD, so observations near the cutoff have a lot of weight).
Dear @erikcs and @corydeburd, this thread is very helpful. Thank you.
Hi @yusukematsuyama, there's no fixed rule for the train/test split; 50/50 and 70/30 are just some common choices. Forests are usually robust with respect to tuning parameters, so it's hard to say which range of parameters is reasonable.
Dear @erikcs, thank you for your advice. I will try that!
I wanted to check whether the solution for parameter tuning proposed here (#1195) would be valid for the regression discontinuity case, paired with lm_forest() as in the example below. As with the previous link, this method does not currently have a setting to automatically tune parameters.
https://grf-labs.github.io/grf/reference/lm_forest.html
Does the code / intuition in the original post still apply here? That is, with lm_forest() and non-binary "treatments" (i.e., the RD running variable slopes), is this MSE still the object to consider?