Annotation guideline #181

Open
Shreyanand opened this issue Jul 14, 2022 · 0 comments
Labels
nlp-internal: Indicates that the issue exists to improve the internal NLP model and its code

Comments

@Shreyanand
Member

Just a note that the README mentions the need for annotation files but gives no guidance on how to annotate for the given models. It would be extremely helpful to have links to documents about the dos and don'ts of annotation:

  • How much extra contextual information (paragraphs, sentences) is too much or too little?
  • Does the page reference indicate the start of the KPI to be extracted or the start of the paragraph providing context for the KPI?
  • Given that the page number is bracketed, are we to interpret that pages can be ranges, and that context spanning multiple pages should convey the whole range?
  • Can discontiguous KPIs be extracted, or must they be contiguous? For example, can "100 metric tonnes CO2e" be extracted from "How many tonnes of CO2e were emitted? Answer: 100."?

The last question is obviously domain-specific, but many general questions about the annotation process remain unanswered, and that process is critical to the actual demo.

Originally posted by @MichaelTiemannOSC in #176 (comment)

@Shreyanand added the nlp-internal label Jul 14, 2022