Graph-based dependency parsing

This repository replicates the graph-based dependency parser of Kiperwasser and Goldberg (2016). Datasets are taken from the Universal Dependencies project. Initial word embeddings were obtained with the GloVe algorithm and can be found in the embeddings subfolder of each language folder.

Usage

Models are trained with the NLP_training.py script. It takes the following positional arguments, in this order: hidden size of the arc-prediction MLP, learning rate, hidden size of the label-prediction MLP, language (must correspond to a language directory; 'en' and 'de' are currently available), number of epochs, and modelname (a model is saved after each epoch).

python NLP_training.py 25 0.0001 50 de 2 model1

Evaluation is done with the test.py script. Its first four arguments are the same as for the training script (although the learning rate is not used). They are followed by the dataset to evaluate on (train, dev or test) and the modelname.

python test.py 25 0.0001 50 de test model1_e16

The script prints the unlabeled attachment score (UAS) and the labeled attachment score (LAS). It also writes its output to a file named conllu, which can be evaluated with the official evaluation script.
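For readers unfamiliar with the two metrics, here is a minimal sketch of how they are commonly defined: UAS is the percentage of tokens whose predicted head is correct, and LAS additionally requires the dependency label to match. The function name `attachment_scores` and the `(head, label)` input format are assumptions made for illustration. They are not taken from test.py, and the official evaluation script may differ in details such as how particular tokens are treated.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages.

    gold and predicted are lists of (head_index, label) pairs,
    one pair per token (a hypothetical format for this sketch).
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of tokens")
    if not gold:
        raise ValueError("no tokens to score")

    # A token counts towards UAS if its predicted head is correct.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # A token counts towards LAS only if head and label are both correct.
    labeled_hits = sum(g == p for g, p in zip(gold, predicted))

    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * labeled_hits / total


# Toy four-token sentence; head index 0 marks the root.
gold = [(2, "det"), (3, "nsubj"), (0, "root"), (3, "obj")]
pred = [(2, "det"), (3, "obj"), (0, "root"), (2, "obj")]

uas, las = attachment_scores(gold, pred)
print(f"UAS {uas:.1f}%  LAS {las:.1f}%")  # prints "UAS 75.0%  LAS 50.0%"
```

In the toy example, three of the four heads are correct (UAS 75%), but only two tokens have both the correct head and the correct label (LAS 50%), so LAS can never exceed UAS.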

Some pretrained models that reach UAS scores of around 70% can be found in the models folder.