Dependency Forest for Medical Relation Extraction

Code corresponding to our paper "Leveraging Dependency Forest for Neural Medical Relation Extraction" at EMNLP 2019

Introduction

Folder "biaffine_forest" is a deep biaffine parser that supports producing dependency forests as introduced in our paper. It is obtained by adapting the original dozat-parser. To generate forests, use "--nbest" or "--cubesparse" instead of the traditional "--test" option when decoding. "--nbest" and "--cubesparse" correspond to our "KbestEisner" and "Edgewise" method, respectively. Parser training remains the same as the original system. As mentioned by the original system description, this parser only works on Python 2 and TF 0.1.2, a very old version.

Folder "re_forest_grn" is our main relation extraction (RE) system based on graph recurrent network (GRN) for consuming dependency trees/forests. There are several scripts within the "re_forest_grn/data" folder to help generating necessary data, such as word embeddings. Training and decoding shells are also provided. This model is based on Python 2 and TF 1.8.0.

Data

Since the orignal website for data obtaining is not available, you may download the data through CPR and PGR. I also attach my script for preprocessing. It uses several handcrafted rules to pre-tokenize some special entities, which cannot be tokenized by a standard tokenizer. But, we just realized recently that this could be done in a smarter way that simply pre-tokenizes with the already provided character-based positions. You may improve this tokenizer for better performances.

Cite

If our work helps your research/system, please cite our work with the following bibtex file.

@inproceedings{song-etal-2019-leveraging,
    title = "Leveraging Dependency Forest for Neural Medical Relation Extraction",
    author = "Song, Linfeng  and
      Zhang, Yue  and
      Gildea, Daniel  and
      Yu, Mo  and
      Wang, Zhiguo  and
      Su, Jinsong",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    year = "2019",
    doi = "10.18653/v1/D19-1020",
    pages = "208--218"
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Dependency Forest for Medical Relation Extraction

Introduction

Data

Cite

Files

README.md

Latest commit

History

README.md

File metadata and controls

Dependency Forest for Medical Relation Extraction

Introduction

Data

Cite