diff --git a/README.md b/README.md
index f5ea3bad..d34f0770 100644
--- a/README.md
+++ b/README.md
@@ -2,25 +2,25 @@
 # In a Nutshell: Qanary Question Answering Components
-The [Qanary Framework](https://github.com/WDAqua/Qanary/) is dedicated to create Question Answering systems. Question Answering (QA) is a task requiring different fields leading to expensive / time consuming engineering tasks which might block research as it is too expensive. Typical problems/usecases that might occur while developing a question answering system are:
+The [Qanary Framework](https://github.com/WDAqua/Qanary/) is dedicated to creating Question Answering systems. Question Answering (QA) is a task requiring different fields leading to expensive/time-consuming engineering tasks that might block research as it is too expensive. Typical problems/use cases that might occur while developing a Question Answering system are:
- * an algorithm is require analyzing textual questions and annotating the found entities, relations, classes, etc.
- * it is time consuming as there are many services / algorithms / tools which need to compared
+ * an algorithm requires analyzing textual questions and annotating the found entities, relations, classes, etc.
+ * it is time-consuming as many services/algorithms/tools need to be compared
  * your QA process needs to be improved
  * following traditional development approaches requires additional efforts for testing and debugging of code to uncover possible flaws
- * the quality of components dedicated to a particular task need to be analyzed
+ * the quality of components dedicated to a particular task needs to be analyzed
  * it is expensive to integrate all of the particular components due to a missing generalized interface
-In this repository, the [components of the Qanary framework](https://github.com/WDAqua/Qanary-question-answering-components) are stored. All components are implemented in Java and provide a Docker container for lightweight maintaince.
+In this repository, the [components of the Qanary framework](https://github.com/WDAqua/Qanary-question-answering-components) are stored. All components are implemented in Java and provide a Docker container for lightweight maintenance.
 ## Build and run a *minimal* set of components
-To show the Qanary methodology and it's functionality a tiny template-based Question Answering system was designed. It is capable of answering questions for a *real name* of a superhero like "What is the real name of Captain America?". For this purpose just two components were used:
+To show the Qanary methodology and its functionality a tiny template-based Question Answering system was designed. It is capable of answering questions for the *real name* of a superhero like "What is the real name of Captain America?". For this purpose, just two components were used:
 a) [Qanary DBpedia Spotlight component](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-DBpediaSpotlight): The component is capable of finding superhero names and linking it to the [DBpedia knowledge base](https://wiki.dbpedia.org/) (such a process is called Named Entity Recognition and Disambiguation).
 b) [Qanary Query Builder for Superhero Names](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-QB-SimpleRealNameOfSuperHero): The component is capable of creating [SPARQL](https://www.w3.org/TR/sparql11-overview/) SELECT queries to be executed on DBpedia (such a component is typically called Query Builder) if the given question is following the template `What is the real name of `.
-Hence, given a question following the described pattern the result will be SPARQL query that might be executed, s.t., the real name of a superhero is retrieved from DBpedia.
+Hence, given a question following the described pattern the result will be a SPARQL query that might be executed, s.t., the real name of a superhero is retrieved from DBpedia.
 ### Run a minimalistic Question Answering system
@@ -80,15 +80,15 @@ WHERE {
 ## Big Picture
- * Qanary is providing the methodology for a knowledge-driven, vocabular-based approach. Our long-term agenda is to create a knowledge-driven ecosystem for the field of Question Answering. It is part of the [WDAqua project](http://wdaqua.eu) where question answering systems are researched and developed.
- * [Qanary Framework](https://github.com/WDAqua/Qanary/) is providing the core framework for creating Question Answering systems following the Qanary methodology. You might consider the Qanary Framework as reference implementation of the Qanary framework as microserivce-based component architecture.
+ * Qanary provides the methodology for a knowledge-driven, vocabulary-based approach. Our long-term agenda is to create a knowledge-driven ecosystem for the field of Question Answering. It is part of the [WDAqua project](http://wdaqua.eu) where Question Answering systems are researched and developed.
+ * [Qanary Framework](https://github.com/WDAqua/Qanary/) is providing the core framework for creating Question Answering systems following the Qanary methodology. You might consider the Qanary Framework as a reference implementation of the Qanary framework as a microservice-based component architecture.
  * [Qanary components](https://github.com/WDAqua/Qanary-question-answering-components) is covering the QA components compatible with the Qanary framework.
  * [Frankenstein](https://github.com/WDAqua/Frankenstein) is a supporting framework to establish a toolset for rapid orchestration and benchmarking of Qanary components. For example, it provides the tools to create from 29 components 380 QA systems.
-Regarding questions, ideas or any feedback related to Qanary please do not hesitate to [contact the core developers](https://github.com/WDAqua/Qanary/wiki/Who-do-I-talk-to%3F). However, if you like to see a QA system originally built using the Qanary framework, one of our core developers has build a complete end-to-end QA system which allows to query several RDF data stores: [http://wdaqua.eu/qa](http://wdaqua.eu/qa).
+Regarding questions, ideas, or any feedback related to Qanary please do not hesitate to [contact the core developers](https://github.com/WDAqua/Qanary/wiki/Who-do-I-talk-to%3F). However, if you would like to see a QA system originally built using the Qanary framework, one of our core developers has built a complete end-to-end QA system that allows you to query several RDF data stores: [http://wdaqua.eu/qa](http://wdaqua.eu/qa).
-Please go to the [GitHub Wiki page](https://github.com/WDAqua/Qanary/wiki) of Qanary repository to get more insights on how to use this framework, how to add new component etc.
+Please go to the [GitHub Wiki page](https://github.com/WDAqua/Qanary/wiki) of the Qanary repository to get more insights on how to use this framework, how to add new components etc.
 ## How to Cite
@@ -122,12 +122,12 @@ The following components are contained in the
 ### Question Answering Name Entity Recognition (NER) and Disambiguation Components (NED) Components
 #### Entity Classifier 2 (NER)
-It uses rule base grammar to extract entities in a text.
+It uses rule-based grammar to extract entities in a text.
  * [Qanary Entity Classifier 2 for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-EntityClassifier2)
 #### Stanford NLP Tool (NER)
-Stanford named entity recogniser is an open source tool that uses Gibbs sampling for information extraction to spot entities in a text.
+Stanford named entity recognizer is an open-source tool that uses Gibbs sampling for information extraction to spot entities in a text.
  * [Qanary Stanford NLP Tool for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-Stanford)
@@ -139,19 +139,19 @@ is a multilingual, graph-based approach that uses random walks and the densest s
 #### AGDISTIS (NED)
-It is a graph based disambiguation tool that couples the HITS algorithm with label expansion strategies and string similarity measures to disambiguate entities in a given text.
+It is a graph-based disambiguation tool that couples the HITS algorithm with label expansion strategies and string similarity measures to disambiguate entities in a given text.
  * [Qanary AGDISTIS for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-AGDISTIS)
 #### DBpedia Spotlight
-It is a web service that uses vector-space representation of entities and using the cosine similarity, recognise and disambiguate the entities.
+It is a web service that uses a vector-space representation of entities and using the cosine similarity, recognize and disambiguate the entities.
  * [Qanary DBpedia Spotlight for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-DBpediaSpotlight)
  * [Qanary DBpedia Spotlight for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-DBpediaSpotlight)
 #### Tag Me
-It matches terms in a given text with Wikipedia, \ie links text to recognise named entities.
- Furthermore, it uses the in-link graph and the page dataset to disambiguate recognised entities to its Wikipedia URls.
+It matches terms in a given text with Wikipedia, \ie links text to recognize named entities.
+ Furthermore, it uses the in-link graph and the page dataset to disambiguate recognized entities to its Wikipedia URIs.
  * [Qanary Tag Me for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-Tagme)
  * [Qanary Tag Me for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-Tagme)
@@ -162,13 +162,13 @@ It matches terms in a given text with Wikipedia, \ie links text to recognise nam
  * Dandelion ([homepage](https://dandelion.eu/)) is a startup specialized in Semantics & Big Data.
    * [Qanary Dandelion for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-Dandelion)
    * [Qanary Dandelion for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-Dandelion)
- * Ontotext ([homepage](https://ontotext.com/)) provides a complete set of Semantic Technology enabling better content management, knowledge discovery and semantic search.
+ * Ontotext ([homepage](https://ontotext.com/)) provides a complete set of Semantic Technologies enabling better content management, knowledge discovery, and semantic search.
    * [Qanary Ontotext for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-Ontotext)
    * [Qanary Ontotext for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-Ontotext)
- * Ambiverse ([homepage](https://www.ambiverse.com/)) is a spin-off from the Max Planck Institute for Informatics, develops technologies to automatically understand, analyze, and manage Big Text collections.
+ * Ambiverse ([homepage](https://www.ambiverse.com/)) is a spin-off from the Max Planck Institute for Informatics, which develops technologies to automatically understand, analyze, and manage Big Text collections.
    * [Qanary Ambiverse for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-Ambiverse)
    * [Qanary Ambiverse for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-Ambiverse)
- * Meaningcloud ([homepage](https://www.meaningcloud.com/)) is a company based in New York City, specialized in software for semantic analysis.
+ * Meaningcloud ([homepage](https://www.meaningcloud.com/)) is a company based in New York City, that specializes in software for semantic analysis.
    * [Qanary Meaningcloud for *NED*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NED-MeaningCloud)
    * [Qanary Meaningcloud for *NER*](https://github.com/WDAqua/Qanary-question-answering-components/tree/master/qanary-component-NER-MeaningCloud)