The UEE Dataset

Description

This directory contains the UEE dataset, which is described in our ACL 2023 Industry Track paper "Hunt for Buried Treasures: Extracting Unclaimed Embodiments from Patent Specifications". Refer to Section 3 "Dataset" of the paper for details.

Preparation

Each of the training, development, and test sets of the UEE dataset consists of an annotation part and a text part. Before using the UEE dataset, merge the two parts by running:

sh merge.sh

This will create the following files, which are the merged versions of the training, development, and test sets of the UEE dataset.

train_uee_221212.json
dev_uee_221212.json
test_uee_221212.json
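
As a quick sanity check after merging, the three output files can be opened and summarized. The following is a minimal sketch, not part of the dataset tooling: the helper names `load_merged` and `summarize` are hypothetical, and because this README does not specify the internal layout of the merged files, the loader assumes either a single JSON document or one JSON value per line and reports only the top-level type and size.

```python
import json
from pathlib import Path

# Output file names as listed in this README.
MERGED_FILES = {
    "train": "train_uee_221212.json",
    "dev": "dev_uee_221212.json",
    "test": "test_uee_221212.json",
}


def load_merged(path):
    """Load one merged file (hypothetical helper).

    Assumption: the file is either a single JSON document or JSON Lines.
    The README does not say which, so both are tried.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fall back to one JSON value per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def summarize(data_dir="."):
    """Return {split: (top-level type name, size)} or None if a file is missing."""
    summary = {}
    for split, name in MERGED_FILES.items():
        path = Path(data_dir) / name
        if not path.is_file():
            summary[split] = None
            continue
        data = load_merged(path)
        size = len(data) if isinstance(data, (list, dict)) else None
        summary[split] = (type(data).__name__, size)
    return summary


if __name__ == "__main__":
    for split, info in summarize().items():
        print(split, info)
```

A `None` entry means the corresponding merged file was not found, which usually indicates that merge.sh has not been run yet in that directory.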

Files

README.md: This file.
train_uee_221212_ann.json: The annotation part of the training set.
train_uee_221212_txt_{00,01,02,03}.json: The text part of the training set (split into 4 parts).
dev_uee_221212_ann.json: The annotation part of the development set.
dev_uee_221212_txt.json: The text part of the development set.
test_uee_221212_ann.json: The annotation part of the test set.
test_uee_221212_txt.json: The text part of the test set.
merge.sh: The shell script to merge the annotation and text parts of each set.
merge-ann-and-txt-uee.py: The Python script used by merge.sh.
LICENSE: The license terms for this dataset.
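
Before running merge.sh, it can help to confirm that every input file listed above is present, for example after a partial download. This is a minimal sketch under that assumption. The function name `missing_inputs` is hypothetical, and the check looks only at file presence, not at file contents.

```python
from pathlib import Path

# Input files named in the list above; the training text part is split
# into four files with suffixes 00 through 03.
EXPECTED_INPUTS = [
    "train_uee_221212_ann.json",
    *[f"train_uee_221212_txt_{i:02d}.json" for i in range(4)],
    "dev_uee_221212_ann.json",
    "dev_uee_221212_txt.json",
    "test_uee_221212_ann.json",
    "test_uee_221212_txt.json",
    "merge.sh",
    "merge-ann-and-txt-uee.py",
]


def missing_inputs(data_dir="."):
    """Return the expected input files that are absent from data_dir."""
    root = Path(data_dir)
    return [name for name in EXPECTED_INPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All input files are present.")
```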

License

See LICENSE.
