Feature importance #3862
-
I've trained two versions of a model: a lighter version with only 2 input columns, and another with all the columns available in my dataset. I've seen that the lighter model performed better (based on MSE), so I'd like to understand the feature importance from the model. Is there a way to extract/calculate this?
Replies: 1 comment
-
Hi @ImranA10, sorry for the late response. We do support explanations; unfortunately, this hasn't been documented in the Ludwig docs yet.

The best starting point is this part of the repository: https://github.com/ludwig-ai/ludwig/tree/master/ludwig/explain

Depending on your model, you can use the IntegratedGradients explainer (for the ECD architecture) or the GBM explainer for tree-based models.

For the IntegratedGradients explainer, you can do something like this:

from ludwig.explain.captum import IntegratedGradientsExplainer
# model: a trained LudwigModel
# inputs_df: pd.DataFrame of input rows to explain
# sample_df: pd.DataFrame sample of the ground truth data
# target: str, name of the output feature to explain
explainer = IntegratedGradientsExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()

Note that integrated gradients is a fairly memory-heavy process, so it may take some time to run and you may hit out-of-memory errors, depending on how much compute you have on the machine you intend to run explanations on. Hope this helps!
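If memory is a concern, one simple workaround is to explain only a small subset of rows and to pass a modest sample as the background data. Below is a minimal sketch continuing from the example above; full_df is a hypothetical pandas DataFrame holding your dataset, and the 100-row sample sizes are arbitrary, not Ludwig recommendations:

from ludwig.explain.captum import IntegratedGradientsExplainer

# full_df: hypothetical pd.DataFrame with your dataset (assumed already loaded)
# Keep both the rows to explain and the sample small to limit memory use.
inputs_df = full_df.sample(n=100, random_state=42)
sample_df = full_df.sample(n=100, random_state=0)

explainer = IntegratedGradientsExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()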
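For the tree-based case, the reply points to the GBM explainer in the same package but shows no code. A minimal sketch, assuming the GBM explainer follows the same constructor signature as the IntegratedGradients example; the exact module and class name should be verified against ludwig/explain in the linked repository:

from ludwig.explain.gbm import GBMExplainer  # assumed import path; check ludwig/explain in the repo

# model: a trained GBM LudwigModel
# inputs_df, sample_df: pd.DataFrames as in the example above
# target: str, name of the output feature to explain
explainer = GBMExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()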