This repository contains code and models for replicating results from the following publication:
- Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
- Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi
- In EMNLP 2018
Part of the codebase is extended from lsgn and e2e-coref.
Requirements:

- Python 2.7
- TensorFlow 1.8.0
- pyhocon (for parsing the configurations)
- tensorflow_hub (for loading ELMo)
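In case it helps, a minimal way to install the Python dependencies (anything beyond the packages and versions listed above is an assumption about your environment):

```bash
# Assumes an existing Python 2.7 environment.
pip install tensorflow==1.8.0 pyhocon tensorflow-hub
```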
- Download GloVe embeddings and the data: `./scripts/fetch_required_data.sh`
- Build custom kernels: `./scripts/build_custom_kernels.sh` (please make adjustments to the script based on your OS/gcc version)
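For reference, custom TF 1.x kernels are typically compiled along these lines; a sketch, assuming the kernel source is `coref_kernels.cc` as in e2e-coref (depending on how your TensorFlow wheel was built, gcc 5 and newer may additionally need `-D_GLIBCXX_USE_CXX11_ABI=0`):

```bash
# Query compile/link flags from the installed TensorFlow, then build the shared op.
TF_CFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')
TF_LFLAGS=$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')
g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC $TF_CFLAGS $TF_LFLAGS -O2
```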
- Some of our models are trained with ELMo embeddings. We use the ELMo model loaded by `tensorflow_hub`.
- It is recommended to cache the ELMo embeddings for training and validation efficiency. Modify the corresponding filenames and run `python generate_elmo.py` to generate the ELMo embeddings.
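`generate_elmo.py` handles this in the repo; purely for orientation, here is a minimal sketch of caching ELMo activations with `tensorflow_hub` (the cache filename, HDF5 layout, and input sentences are hypothetical, not the repo's exact format):

```python
import h5py
import tensorflow as tf
import tensorflow_hub as hub

# Load ELMo from TF Hub (TF 1.x Module API).
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=False)

tokens_ph = tf.placeholder(tf.string, [None, None])
lengths_ph = tf.placeholder(tf.int32, [None])
ops = elmo(inputs={"tokens": tokens_ph, "sequence_len": lengths_ph},
           signature="tokens", as_dict=True)
# word_emb is 512-d; duplicate it to match the 1024-d LSTM layers, then
# stack the three layers into [batch, time, 1024, 3].
word_emb = tf.concat([ops["word_emb"], ops["word_emb"]], axis=-1)
layers = tf.stack([word_emb, ops["lstm_outputs1"], ops["lstm_outputs2"]], axis=-1)

sentences = [["ELMo", "embeddings", "are", "cached", "."]]  # hypothetical input
with tf.Session() as sess, h5py.File("elmo_cache.hdf5", "w") as fout:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    for i, sent in enumerate(sentences):
        emb = sess.run(layers, feed_dict={tokens_ph: [sent],
                                          lengths_ph: [len(sent)]})
        fout.create_dataset(str(i), data=emb[0])  # [time, 1024, 3]
```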
- Experiment configurations are found in `experiments.conf`.
- The parameter `main_metrics` can be selected from coref, ner, relation, or any combination of the three, such as `coref_ner_relation`, which indicates the averaged F1 score over coref, ner, and relation. The model is tuned and saved based on the resulting averaged F1 score.
- The parameters `ner_weight`, `coref_weight`, and `relation_weight` are the weights for the multi-task objective. If a weight is set to 0, the corresponding task is not trained (see the configuration sketch below).
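As an illustration, a hypothetical experiment block in `experiments.conf` might combine these parameters as follows (the parameter names come from the documentation above; the experiment name and values are made up):

```hocon
my_ner_relation_experiment {
  main_metrics = ner_relation  # tune/save on the averaged NER and relation F1
  ner_weight = 1.0
  relation_weight = 1.0
  coref_weight = 0             # coreference is not trained
}
```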
- Choose an experiment that you would like to run, e.g. `scientific_best_ner`.
- For a single-machine experiment, run the following two commands in parallel: `python singleton.py <experiment>` and `python evaluator.py <experiment>`.
- Results are stored in the `logs` directory and can be viewed via TensorBoard (see the command after this list).
- For final evaluation of the checkpoint with the maximum dev F1, run `python test_single.py <experiment>`.
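The usual TensorBoard invocation should work here (the port choice is arbitrary):

```bash
tensorboard --logdir logs --port 6006
```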
- The code does not use GPUs by default. Instead, it looks for the `GPU` environment variable, which it treats as shorthand for `CUDA_VISIBLE_DEVICES` (see the example after this list).
- The evaluator should not be run on GPUs, since evaluating full documents does not fit within GPU memory constraints.
- The training runs indefinitely and needs to be terminated manually.
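For example, to train `scientific_best_ner` on GPU 0 while the evaluator runs on the CPU (launching the two processes from separate shells is assumed):

```bash
# GPU is shorthand for CUDA_VISIBLE_DEVICES; the trainer uses GPU 0.
GPU=0 python singleton.py scientific_best_ner

# In a second shell: the evaluator stays on the CPU.
python evaluator.py scientific_best_ner
```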
- Define the output path in `experiments.conf` as `output_path`; the system will write the results for `eval_path` to `output_path`. The output file is also a json file with the same format as `eval_path`. Then run `python write_single.py <experiment>`.
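For orientation, one line of such a json file might look like the following; this is a hypothetical document in the SciERC-style format (spans are document-level token indices), and the exact keys are determined by your `eval_path` data:

```json
{"doc_key": "X96-1001",
 "sentences": [["MORPA", "is", "a", "fully", "implemented", "parser", "."]],
 "ner": [[[0, 0, "Method"], [5, 5, "Method"]]],
 "relations": [[[0, 0, 5, 5, "HYPONYM-OF"]]],
 "clusters": []}
```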
- The best models can be downloaded from here (Best NER Model, Best Coref Model, Best Relation Model); unzip and put the model under `./logs`. For making predictions or testing results, use the same commands as in the previous steps.