Qanary tutorial: How to build a trivial Question Answering pipeline
NOTE: Some resources used in this tutorial are outdated and no longer produce the expected results!
This tutorial is the short version of the tutorial "Build a Question Answering System Overnight" that we gave at ESWC 2018 (original resources).
The following steps set up a triplestore and a simple Question Answering pipeline built from existing components of the Qanary framework. After this tutorial, you should be able to analyze the created data and to extend the Question Answering system for your needs.
- Java 8+
- Maven 3+
- Git client
The Qanary methodology aims at storing all knowledge produced while computing an answer to a given user's question in a Knowledge Base (triplestore). Here we use Stardog. However, any triplestore should work.
Preparation of the Triplestore Stardog: Download Stardog via stardog.com - click "Download" (top of the page). For non-commercial use it is free, but you will still need a license file which you will receive via email.
Starting the Stardog Triplestore: We use Stardog as a local triplestore to store all the inputs and outputs of the QA process. After unpacking the code and storing the license file, you can start Stardog by switching within a terminal window into the Stardog subfolder bin and running:
stardog-admin server start
If everything went well, create a Stardog database called qanary using the following command:
stardog-admin db create -n qanary
Testing the Stardog Triplestore: To check that it is working, log in to the triplestore.
By default Stardog is available at http://localhost:5820; open this URL in your browser.
You will be asked for a username and password; by default they are set to admin / admin.
After the login, you should see a database called qanary.
Clone the project from the GitHub repository:
git clone https://github.com/WDAqua/Qanary
Then execute the Maven build process (within the Qanary folder):
mvn install -DskipDockerBuild
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] qanary-component-archetype ......................... SUCCESS [ 1.161 s]
[INFO] qa.pipeline ........................................ SUCCESS [ 2.992 s]
[INFO] qa.component ....................................... SUCCESS [ 0.430 s]
[INFO] qald.evaluator ..................................... SUCCESS [ 0.761 s]
[INFO] mvn.reactor ........................................ SUCCESS [ 0.012 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
The install goal will compile, test, and package the Qanary framework code and then copy it into the local dependency repository. Thereafter the Qanary component foundations are available as well as the Qanary pipeline template.
Note: The above command builds the project without generating the corresponding Docker containers. This is done here due to time restrictions.
To start the QA process, the first step is to run the main QA pipeline. The dedicated predefined implementation is named qanary_pipeline-template.
Run the following command in a terminal window (within the Qanary folder) to start a default Qanary pipeline:
java -jar qanary_pipeline-template/target/qa.pipeline-X.Y.Z.jar
Where X, Y, and Z refer to the current version of the Qanary pipeline template.
Note: With Java 9 or 10 you need to run: java -jar --add-modules=java.se.ee qanary_pipeline-template/target/qa.pipeline-X.Y.Z.jar (the java.se.ee module was removed in Java 11, so the flag does not work there).
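Because X.Y.Z depends on the version you built, it can be convenient to locate the JAR by pattern instead of typing the version. The following is a minimal Python sketch and not part of Qanary: the helper name find_jar is hypothetical, and it assumes that the target folder contains exactly one JAR matching the given prefix.

```python
from pathlib import Path


def find_jar(target_dir: str, prefix: str) -> Path:
    """Return the single JAR in target_dir named '<prefix>-<version>.jar'.

    Hypothetical helper for this tutorial; raises if zero or several
    JAR files match, so a stale build does not get picked silently.
    """
    candidates = sorted(Path(target_dir).glob(f"{prefix}-*.jar"))
    if len(candidates) != 1:
        raise FileNotFoundError(
            f"expected exactly one {prefix}-*.jar in {target_dir}, "
            f"found {len(candidates)}"
        )
    return candidates[0]
```

For example, find_jar("qanary_pipeline-template/target", "qa.pipeline") would return the path to pass to java -jar, provided the build from the previous step succeeded.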
To check that the server is running, open the web-based admin interface of the integrated Spring Boot Admin Server; by default it is available at http://127.0.0.1:8080/
The default Qanary system was started as a server, so keep this terminal open.
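If you prefer a scripted check over opening the browser, a small reachability test is enough. This is a minimal sketch using only the Python standard library; the function name is_reachable is hypothetical. It treats any HTTP answer, including an error status such as a login prompt, as "the server is up".

```python
import urllib.error
import urllib.request


def is_reachable(url: str, timeout: float = 3.0) -> bool:
    """Return True if an HTTP server answers at url, whatever the status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with an error status (e.g. 401): it is running.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False
```

With the default configuration described in this tutorial, you could call is_reachable("http://127.0.0.1:8080/") for the Qanary pipeline and is_reachable("http://localhost:5820") for Stardog.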
Clone the repository of Qanary question answering components:
git clone https://github.com/WDAqua/Qanary-question-answering-components
30+ question answering components based on the Qanary framework are now available locally.
Due to time restrictions, we now build only the three components required for finishing the tutorial as JAR files (using our Maven tutorial profile), without creating the corresponding Docker containers. Run within the Qanary-question-answering-components folder:
mvn install -DskipDockerBuild -P tutorial
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] qanary_component-NED-Ambiverse ..................... SUCCESS [ 4.588 s]
[INFO] qa.qanary_component-DiambiguationProperty-OKBQA .... SUCCESS [ 0.734 s]
[INFO] qa.qanary_component-QueryBuilder ................... SUCCESS [ 1.889 s]
[INFO] mvn.reactor ........................................ SUCCESS [ 0.018 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
Note: You can build all components including the Docker containers using the command mvn install.
We suggest starting the created JAR files in separate terminals.
Terminal 1: Start the JAR of the Ambiverse NED component:
java -jar qanary_component-NED-Ambiverse/target/qanary_component-NED-Ambiverse-X.Y.Z.jar
Terminal 2: Start the Relation Linking component named DisambiguationProperty-OKBQA:
java -jar qa.qanary_component-DiambiguationProperty-OKBQA/target/qa.qanary_component-DiambiguationProperty-OKBQA-X.Y.Z.jar
Terminal 3: Start the created component QueryBuilder using the following command:
java -jar qa.qanary_component-QueryBuilder/target/qa.qanary_component-QueryBuilder-X.Y.Z.jar
Note: Using Java 9 or 10 you might need to add --add-modules=java.se.ee, e.g., java -jar --add-modules=java.se.ee qa.qanary_component-QueryBuilder/target/qa.qanary_component-QueryBuilder-X.Y.Z.jar.
After a few seconds the components will have registered themselves with the Qanary pipeline service we started previously. To check this, open http://127.0.0.1:8080/ (default configuration) in a browser. You will see the Spring Boot Admin Server web interface showing all registered components. There should be 3 by now.
Note: If you changed the port or server of your Qanary system (see Step 2), then you need to set this configuration for the Qanary components as well, e.g., by editing the corresponding application.properties of each component to be executed.
Now go to our trivial Web UI for testing the functionality: http://localhost:8080/startquestionansweringwithtextquestion
You will see the three components you started in Step 3 appearing in a list. Select and re-order the components such that QueryBuilder is the last component. Insert the question "Name the municipality of Roberto Clemente Bridge." and hit the button start QA process provided by Qanary.
If everything worked, you get a response in JSON representation. It contains a link to the endpoint and also a graphID, stored in the property ingraph.
Note: As the Ambiverse NED and DisambiguationProperty-OKBQA components call external services, you will need an active Internet connection.
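To reuse the two values from the JSON response in later steps, you can read them out programmatically. The sketch below is illustrative only: the sample response and its values are made up, the key name endpoint is an assumption (only ingraph is named above), and keys are matched case-insensitively as a precaution.

```python
import json

# Hypothetical response body; the real one comes from the Qanary pipeline.
SAMPLE_RESPONSE = """
{
  "endpoint": "http://localhost:5820/qanary/query",
  "ingraph": "urn:graph:example-0001"
}
"""


def extract_endpoint_and_graph(response_text: str) -> tuple[str, str]:
    """Return (endpoint, graphID) from the JSON response text."""
    data = {key.lower(): value for key, value in json.loads(response_text).items()}
    return data["endpoint"], data["ingraph"]


endpoint, graph_id = extract_endpoint_and_graph(SAMPLE_RESPONSE)
print(endpoint)
print(graph_id)
```

A missing key raises a KeyError, which is a reasonable signal that the QA process did not finish as expected.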
Switch to the endpoint (in your browser). Click on "query" and run the following SPARQL query while reusing the graphID from the previous step:
SELECT * FROM <graphID> WHERE { ?s ?p ?o . }
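The placeholder graphID in the query above has to be replaced by the actual graph IRI from the previous step. A minimal sketch of that substitution follows; the function name select_all_query and the example graph IRI are hypothetical. It rejects values containing characters that are not allowed inside a SPARQL IRI, so a mistyped value does not produce a broken query.

```python
import re

# Characters that may not appear inside <...> in a SPARQL IRI reference.
_FORBIDDEN_IN_IRI = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def select_all_query(graph_id: str) -> str:
    """Build the 'select everything' query for one named graph."""
    if not graph_id or _FORBIDDEN_IN_IRI.search(graph_id):
        raise ValueError(f"not a usable graph IRI: {graph_id!r}")
    return f"SELECT * FROM <{graph_id}> WHERE {{ ?s ?p ?o . }}"


# Example with a made-up graph IRI:
print(select_all_query("urn:graph:example-0001"))
```

The printed query can be pasted into the endpoint's query form as is.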
Now you see all the information created by the calls to the 3 Qanary components while analyzing the question "Name the municipality of Roberto Clemente Bridge.". On the last page of the query result set you can see the specific information, while the earlier pages contain basic vocabulary definitions. Alternatively, you can use more targeted SPARQL queries to retrieve specific information, like the following ones.
The following SPARQL query shows that 4 annotations (including the time of annotation) were created by the 3 components:
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT * FROM <graphID> {
?s oa:annotatedBy ?activeComponentsDuringTheProcessingOfTheQuestion .
?s oa:annotatedAt ?time .
}
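Queries like the one above can also be sent from a script instead of the browser. The sketch below assumes that the endpoint accepts queries according to the SPARQL 1.1 Protocol (form-encoded POST) with HTTP basic authentication and the default admin / admin credentials, and that it returns the standard SPARQL JSON results format. The function names and all sample values are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request


def run_select(endpoint, query, user="admin", password="admin"):
    """POST a SELECT query to a SPARQL endpoint and return the parsed JSON."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Basic {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def bindings_to_rows(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]


# Made-up result in the SPARQL JSON format, to show the flattening offline.
SAMPLE_RESULTS = {
    "head": {"vars": ["s", "activeComponentsDuringTheProcessingOfTheQuestion", "time"]},
    "results": {"bindings": [{
        "s": {"type": "uri", "value": "urn:example:annotation-1"},
        "activeComponentsDuringTheProcessingOfTheQuestion":
            {"type": "uri", "value": "urn:example:component-A"},
        "time": {"type": "literal", "value": "2018-06-03T10:00:00Z"},
    }]},
}

for row in bindings_to_rows(SAMPLE_RESULTS):
    print(row["activeComponentsDuringTheProcessingOfTheQuestion"], row["time"])
```

In a real run you would pass the endpoint URL from the JSON response and the query text with your graphID filled in to run_select, then feed its return value to bindings_to_rows.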
Using the following SPARQL query you can see that the annotation holding dbpedia:Roberto_Clemente_Bridge was created by the Ambiverse component and targets an oa:textselector (which points to the character positions 28-50).
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT * FROM <graphID> {
?s oa:annotatedBy <urn:qanary.NED#https://api.ambiverse.com/v2/entitylinking/analyze> .
?s ?p ?o
}
Using the following SPARQL query you retrieve a subgraph holding the created annotations. Now you can explore more ...
PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT DISTINCT * FROM <graphID> {
?s oa:annotatedBy <urn:qanary.NED#https://api.ambiverse.com/v2/entitylinking/analyze> .
?s ?p ?o .
?s oa:hasTarget ?s2 .
?s2 ?p2 ?o2 .
}
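When exploring such a subgraph, it can help to group the returned triples by subject so that each annotation and each target is shown with all of its properties. The following is a minimal sketch; the function name group_by_subject and the sample triples are hypothetical and only stand in for rows returned by the query above.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Group (subject, predicate, object) triples as {s: {p: [o, ...]}}."""
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, predicate, obj in triples:
        grouped[subject][predicate].append(obj)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {subject: dict(props) for subject, props in grouped.items()}


# Made-up triples in the shape (?s ?p ?o) and (?s2 ?p2 ?o2).
SAMPLE_TRIPLES = [
    ("urn:example:annotation-1", "oa:hasTarget", "urn:example:target-1"),
    ("urn:example:annotation-1", "oa:annotatedBy", "urn:example:component-A"),
    ("urn:example:target-1", "oa:hasSource", "urn:example:question-1"),
]

for subject, props in group_by_subject(SAMPLE_TRIPLES).items():
    print(subject, props)
```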
You have created a simple Question Answering system using the existing Qanary Standard Pipeline and the Qanary Question Answering components. Simple questions can be answered now. As next steps, you should test further questions, check other components of the Qanary ecosystem, and implement your own component.
We are happy to answer any of your questions. Please do not hesitate to contact us.
Note: As the Qanary framework does not contain a complete UI at the moment, you might implement one yourself to cover your specific needs, or check out Trill.
- How to establish a Docker-based Qanary Question Answering system
- How to implement a new Qanary component
  - ... using Java?
  - ... using Python (Qanary Helpers)?
  - ... using Python (plain Flask service)?