This repository contains the code for the paper "Robust Self-Augmentation for Named Entity Recognition with Meta Reweighting" (NAACL 2022).
- Python >= 3.6
- Torch >= 1.3
- transformers >= 4.0
- higher
- Core idea: the costly computation of higher-order gradients is simplified to a first-order approximation (e.g., via a first-order Taylor expansion).
- Sample a partial training set (pass one of 0.05, 0.1, or 0.3):
python processing/sample.py 0.05|0.1|0.3
- Build the entity dictionary (the last argument selects the language, cn or en):
python processing/build_ner_dic.py train_data_file ent.dic cn|en
- Obtain word vectors trained on Wikipedia.
- Produce the pseudo-labeled training set (use the cn or en variant of the script to match the language):
python processing/cn|en_aug_util.py train_data_file aug_train_data_file ent.dic ratio aug_times
Note: The data must be in CoNLL format with BIOES tags. The processing/conll_util.py script provides the format conversion.
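
To make the BIOES tagging scheme mentioned in the note concrete, here is a minimal sketch of a BIO-to-BIOES conversion for a single sentence. It is an illustration only, not the code in processing/conll_util.py. The function name `bio_to_bioes`, the sample sentence, and the one-token-per-line "token tag" layout printed at the end are assumptions made for this example.

```python
from typing import List


def bio_to_bioes(tags: List[str]) -> List[str]:
    """Convert the BIO tags of one sentence to BIOES (hypothetical helper)."""
    bioes = []
    for i, tag in enumerate(tags):
        if tag == "O":
            bioes.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == f"I-{label}"
        if prefix == "B":
            # A B- tag with no continuation is a single-token entity (S-).
            bioes.append(f"B-{label}" if continues else f"S-{label}")
        else:
            # An I- tag with no continuation closes the entity (E-).
            bioes.append(f"I-{label}" if continues else f"E-{label}")
    return bioes


# Assumed sample sentence; one "token tag" pair per line.
tokens = ["John", "Smith", "visited", "Paris", "."]
bio = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for token, tag in zip(tokens, bio_to_bioes(bio)):
    print(f"{token} {tag}")
```

In BIOES, a multi-token entity is tagged B-, I-, ..., E-, while a single-token entity gets S-, so the span boundaries are explicit at both ends.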
- Learning to Reweight Examples for Robust Deep Learning
- Meta-Weight-Net: Learning an Explicit Mapping For Sample Weighting
- Distilling Effective Supervision from Severe Label Noise
- Meta Soft Label Generation for Noisy Labels
- Meta Label Correction for Noisy Label Learning
- Semi-Supervised Learning with Meta-Gradient
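
The first-order simplification named as the core idea can be illustrated with a toy reweighting step in the style of Learning to Reweight Examples. This is a sketch under stated assumptions, not this repository's implementation. It uses a tiny linear regression in place of an NER model, and every name (`first_order_weights`, `clean`, `train`, `lr`) is hypothetical. Expanding the clean-set loss to first order around the current parameters gives L(theta - lr * eps * g_i) ≈ L(theta) - lr * eps * (g_i · ∇L(theta)). The sensitivity to the example weight eps therefore reduces to a dot product of two first-order gradients, and no differentiation through the update is needed.

```python
from typing import List, Tuple

Vector = List[float]
Example = Tuple[Vector, float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def example_grad(theta: Vector, x: Vector, y: float) -> Vector:
    """Gradient of 0.5 * (theta . x - y)^2 with respect to theta."""
    residual = dot(theta, x) - y
    return [residual * xi for xi in x]


def mean_loss(theta: Vector, data: List[Example]) -> float:
    """Mean squared-error loss (with the 0.5 factor) over a dataset."""
    return sum(0.5 * (dot(theta, x) - y) ** 2 for x, y in data) / len(data)


def mean_grad(theta: Vector, data: List[Example]) -> Vector:
    """Gradient of mean_loss with respect to theta."""
    grads = [example_grad(theta, x, y) for x, y in data]
    return [sum(column) / len(data) for column in zip(*grads)]


def first_order_weights(
    theta: Vector, train: List[Example], clean: List[Example], lr: float
) -> List[float]:
    """Weight each training example by how well its gradient aligns
    with the clean-set gradient (clipped at zero, then normalized)."""
    g_clean = mean_grad(theta, clean)
    raw = [max(0.0, lr * dot(example_grad(theta, x, y), g_clean)) for x, y in train]
    total = sum(raw)
    return [r / total if total > 0 else 0.0 for r in raw]


# Assumed toy data: the clean set follows y = 2*x1 - x2.
clean = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
# The last training example carries a deliberately wrong label.
train = [([1.0, 1.0], 1.0), ([2.0, 0.0], 4.0), ([1.0, 0.0], -2.0)]
weights = first_order_weights([0.0, 0.0], train, clean, lr=0.1)
print(weights)
```

In this toy setup the mislabeled example gets weight zero, because its gradient points against the clean-set gradient. Whether noisy pseudo-labeled NER examples are suppressed as cleanly depends on the model and data.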