Annotation guideline #181

Shreyanand · 2022-07-14T15:59:35Z

Just a note that the readme file does mention the need for annotation files, but provides no guidance as to how to annotate for the given models. It would be extremely helpful to have links to documents about the dos and don'ts of annotation:

How much extra contextual information (paragraphs, sentences) is too much or too little?
Does the page reference indicate the start of the KPI to be extracted or the start of the paragraph providing context for the KPI?
Given that the page number is bracketed, are we to interpret that pages can be ranges, and that context spanning multiple pages should convey that whole range?
Can discontiguous KPIs be extracted, or must they be contiguous? For example, can "100 metric tonnes CO2e" be extracted from "How many tonnes of CO2e were emitted? Answer: 100."?
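
To make the contiguity question concrete, here is a minimal Python sketch. It is not taken from the repository; the function names `is_contiguous_span` and `missing_tokens` are hypothetical and exist only to illustrate the example above. It checks whether a candidate answer appears verbatim in the context, and which of its whitespace-separated tokens do not appear in the context at all.

```python
def is_contiguous_span(answer: str, context: str) -> bool:
    """Return True if `answer` occurs verbatim as one unbroken substring of `context`."""
    return answer in context


def missing_tokens(answer: str, context: str) -> list[str]:
    """Return the whitespace-separated tokens of `answer` that occur nowhere in `context`."""
    return [token for token in answer.split() if token not in context]


# The example sentence from the question above.
context = "How many tonnes of CO2e were emitted? Answer: 100."

# The bare number is a contiguous span of the context.
print(is_contiguous_span("100", context))

# The full KPI is not a contiguous span, and one of its words is not in the text at all.
print(is_contiguous_span("100 metric tonnes CO2e", context))
print(missing_tokens("100 metric tonnes CO2e", context))
```

Under these assumptions, the full KPI string is neither a contiguous span nor fully recoverable from the words of the context, so whether an annotator may record it is exactly what a guideline would need to decide.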

The last question is obviously domain-specific, but there are many general questions not answered with respect to the annotation process, which is critical to the actual demo.

Originally posted by @MichaelTiemannOSC in #176 (comment)

Shreyanand added the nlp-internal label (indicates that the issue exists to improve the internal NLP model and its code) on Jul 14, 2022
