BERT interpretation with RF

This repository accompanies the thesis "An approach for interpreting the BERT for sequence classification model with the use of Random Forest" by Angela Puc.

Here you can find the Google Colab notebooks used for the thesis. The notebooks are ordered as follows:

  1. preprocessing-for-bert-yelp-dataset
  2. categories-yelp-dataset (categories_Yelp_KMeans is an experiment for this step)
  3. BERT_training (10epochs-training_BERT and BERT_epochs_analysis are complementary notebooks for this step)
  4. BERT_yelp_evaluation
  5. RF_input_creation
  6. RF_mimic_BERT_grid
  7. BERT_features_analysis (analysis_subsamples and feature_contributions_analysis are complementary notebooks for this step)
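
Steps 5 and 6 above suggest that a Random Forest is trained to reproduce BERT's predictions and tuned with a grid search. The snippet below is only a minimal sketch of that general surrogate-model idea, not the code from the notebooks: the data is synthetic, and the names `X`, `bert_pred`, and `param_grid`, as well as the grid values, are assumptions chosen for illustration. It uses scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions, not the thesis data):
# X plays the role of the features produced by RF_input_creation,
# bert_pred plays the role of the labels predicted by the fine-tuned BERT model.
X = rng.normal(size=(400, 10))
bert_pred = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, bert_pred, test_size=0.25, random_state=0
)

# Hypothetical hyperparameter grid; the values used in the thesis may differ.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

# Fidelity: how often the forest agrees with the (stand-in) BERT predictions
# on held-out examples.
fidelity = search.score(X_test, y_test)

# Impurity-based importances of the best forest, one value per input feature.
importances = search.best_estimator_.feature_importances_

print("best parameters:", search.best_params_)
print("fidelity on held-out data:", round(fidelity, 3))
print("most important feature index:", int(np.argmax(importances)))
```

The forest is fitted on BERT's predicted labels rather than on the ground-truth labels, so its held-out accuracy measures agreement with BERT, and its feature importances can then be read as an approximate description of what drives BERT's decisions.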

The data obtained for the analysis can be found here: https://drive.google.com/drive/folders/1BRTsgsweYI646d3bunB3DbLyVt-q4tpb?usp=sharing