Enhancing the Flexibility of Linear Models in Leaf Nodes of Boosted Linear Trees
Motivation
Linear trees are a practical technique that can improve model performance, simplify model structure, and make models easier to interpret. When working with linear models, users often need to impose custom constraints to aid interpretability and to encode prior knowledge. Such constraints include restricting all regression coefficients to be positive, specifying the monotonic direction (coefficient sign) of each variable, and limiting the linear regression to a selected subset of features.
Description
As a regular user of this library, I am deeply grateful for the diligent efforts of all developers and maintainers, whose hard work has greatly facilitated our work.
After reviewing the documentation and the linear_tree_learner.cpp code (link: https://github.com/microsoft/LightGBM/blob/master/src/treelearner/linear_tree_learner.cpp), I found that, apart from the ridge regression parameters, the linear model component exposes no other options. In particular, it does not support the constraints mentioned above on the signs of regression coefficients, and it offers no way to restrict the linear regression to a subset of features.
References
I propose either using the linear model extensions in sklearn as a reference for this functionality, or providing an interface that lets users customize the linear models themselves. Either approach would make linear tree models more flexible and practical.
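To make the request concrete, here is a minimal NumPy sketch of what a constrained leaf fit could look like: a ridge regression restricted to a chosen feature subset, with an optional non-negativity constraint on the coefficients, solved by coordinate descent. This is not LightGBM code. The function name `fit_constrained_leaf` and its `features` and `nonnegative` arguments are hypothetical, the intercept is left unpenalized by assumption, and `linear_lambda` is only borrowed as the name of the ridge strength.

```python
import numpy as np


def fit_constrained_leaf(X, y, features, linear_lambda=0.0,
                         nonnegative=True, n_iter=200):
    """Hypothetical leaf fit: ridge regression on a feature subset.

    Minimizes 0.5*||y - Xs @ w - b||^2 + 0.5*linear_lambda*||w||^2,
    optionally subject to w >= 0, by cyclic coordinate descent.
    Assumes no selected column is all zeros when linear_lambda == 0.
    """
    Xs = np.asarray(X, dtype=float)[:, features]  # keep only chosen columns
    y = np.asarray(y, dtype=float)
    w = np.zeros(Xs.shape[1])
    b = y.mean()
    col_sq = (Xs ** 2).sum(axis=0) + linear_lambda
    for _ in range(n_iter):
        for j in range(Xs.shape[1]):
            # Residual with feature j's current contribution removed.
            r = y - b - Xs @ w + Xs[:, j] * w[j]
            w_j = Xs[:, j] @ r / col_sq[j]
            # Project onto w_j >= 0 when the sign constraint is active.
            w[j] = max(w_j, 0.0) if nonnegative else w_j
        b = (y - Xs @ w).mean()  # unpenalized intercept update
    return w, b


# Synthetic leaf data: column 1 has a negative effect, column 2 is unused.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 3))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 1.0

# Fit on features 0 and 1 only, with non-negative coefficients.
w_pos, b_pos = fit_constrained_leaf(X_demo, y_demo, features=[0, 1])
print(w_pos, b_pos)
```

With the constraint active, the coefficient of the feature whose true effect is negative is clipped to zero rather than fitted with the "wrong" sign, which is the behavior the request asks for. A real implementation inside the tree learner would also have to account for the gradient and Hessian weighting used in boosting, which this sketch ignores.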
Related to this, I think adding the option to include some predictors in all linear models, in addition to the predictors used in the splits to reach the leaf, is important.
I have datasets containing data from several population segments, and I am not interested in including the variables that define the segments in the model itself. However, I would like to include an adjustment in the prediction using the segment flags in the linear model fitted to each leaf.
My leaves have more than 20K observations, so including this segment adjustment does not pose an overfitting problem.
This option could be set through a parameter, 'features_forced_to_leaf_linear_model', as an array of feature indices or feature names.
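As a rough illustration of the idea (not of how LightGBM is implemented), the sketch below builds a leaf's regressor list as the union of the features split on along the branch and a list of forced features, then fits a ridge model on that list. The helper names `leaf_feature_set` and `fit_leaf_ridge` are hypothetical, `forced_features` stands in for the proposed 'features_forced_to_leaf_linear_model' parameter, and leaving the intercept unpenalized is an assumption of this sketch.

```python
import numpy as np


def leaf_feature_set(path_features, forced_features):
    """Union of branch split features and forced features, order preserved."""
    selected = []
    for f in list(path_features) + list(forced_features):
        if f not in selected:
            selected.append(f)
    return selected


def fit_leaf_ridge(X, y, features, linear_lambda=0.0):
    """Ridge fit on the selected columns plus an unpenalized intercept."""
    Xs = np.asarray(X, dtype=float)[:, features]
    A = np.column_stack([Xs, np.ones(len(Xs))])  # last column = intercept
    penalty = linear_lambda * np.eye(A.shape[1])
    penalty[-1, -1] = 0.0  # do not shrink the intercept
    coef = np.linalg.solve(A.T @ A + penalty, A.T @ np.asarray(y, dtype=float))
    return coef[:-1], coef[-1]


# Synthetic leaf: column 0 was split on, column 2 is a segment flag.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
unused = rng.normal(size=200)
segment_flag = (rng.random(200) < 0.5).astype(float)
X_leaf = np.column_stack([x, unused, segment_flag])
y_leaf = 2.0 * x + 5.0 * segment_flag + 1.0

leaf_features = leaf_feature_set(path_features=[0], forced_features=[2])
coef, intercept = fit_leaf_ridge(X_leaf, y_leaf, leaf_features)
print(leaf_features, coef, intercept)
```

Here the segment flag never appears in a split, yet its adjustment is still estimated in the leaf model because it is forced into the regressor list.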
I think this wouldn't be complex to implement, but I don't have the necessary C++ skills to do it.