Feature importance #3862
-
I've trained two versions of a model: a lighter version with only 2 input columns, and another with all the columns available in my dataset. I've seen that the lighter model performed better (based on MSE), so I'd like to understand the feature importance from the model. Is there a way to extract/calculate this?
Replies: 1 comment
-
Hi @ImranA10, sorry for the late response. We do support explanations; unfortunately, this hasn't been documented in the Ludwig docs yet.

The best starting point is this part of the repository: https://github.com/ludwig-ai/ludwig/tree/master/ludwig/explain

Depending on your model, you can use the IntegratedGradients explainer (for the ECD architecture) or the GBM explainer for tree-based models.

For the IntegratedGradients explainer, you can do something like this:

from ludwig.explain.captum import IntegratedGradientsExplainer
# model: a trained LudwigModel
# inputs_df: pd.DataFrame of input rows to explain
# sample_df: pd.DataFrame sample of the ground truth data
# target: str, name of the output feature to explain
explainer = IntegratedGradientsExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()

Note that integrated gradients is a fairly memory-heavy process, so it may take some time to run and you may hit out-of-memory errors, depending on how much compute you have on the machine you intend to run explanations on. Hope this helps!
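If memory is a concern, one simple workaround is to explain only a small subset of rows and to pass a modest sample as the background data. Below is a minimal sketch continuing from the example above; full_df is a hypothetical pandas DataFrame holding your dataset, and the 100-row sample sizes are arbitrary, not Ludwig recommendations:

from ludwig.explain.captum import IntegratedGradientsExplainer

# full_df: hypothetical pd.DataFrame with your dataset (assumed already loaded)
# Keep both the rows to explain and the sample small to limit memory use.
inputs_df = full_df.sample(n=100, random_state=42)
sample_df = full_df.sample(n=100, random_state=0)

explainer = IntegratedGradientsExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()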
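For the tree-based case, the reply points to the GBM explainer in the same package but shows no code. A minimal sketch, assuming the GBM explainer follows the same constructor signature as the IntegratedGradients example; the exact module and class name should be verified against ludwig/explain in the linked repository:

from ludwig.explain.gbm import GBMExplainer  # assumed import path; check ludwig/explain in the repo

# model: a trained GBM LudwigModel
# inputs_df, sample_df: pd.DataFrames as in the example above
# target: str, name of the output feature to explain
explainer = GBMExplainer(model, inputs_df, sample_df, target)
explanation_results = explainer.explain()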