Qanary tutorial: How to build a trivial Question Answering pipeline

Paul Heinze edited this page Sep 6, 2022 · 4 revisions

NOTE: Some resources used in this tutorial are outdated and no longer produce the expected results!

This tutorial is the short version of the tutorial we gave at the ESWC 2018 called "Build a Question Answering System Overnight" (original resources).

The following steps set up a triplestore and, based on existing components, a simple Question Answering pipeline using the Qanary framework. After this tutorial, you should be able to analyze the created data and to extend the Question Answering system for your needs.

Requirements

  • Java 8+
  • Maven 3+
  • Git client

Step 1: Prepare Triplestore

The Qanary methodology aims at storing all knowledge produced while computing an answer for a given user's question in a knowledge base (triplestore). Here we use Stardog; however, any triplestore should work.

Preparation of the Triplestore Stardog: Download Stardog from stardog.com (click "Download" at the top of the page). It is free for non-commercial use, but you still need a license file, which you will receive via email.

Starting the Stardog Triplestore: We use Stardog as a local triplestore to store all inputs and outputs of the QA process. After unpacking the archive and storing the license file, open a terminal window, switch into the bin subfolder of the Stardog folder, and run:

stardog-admin server start

If everything went well, create a Stardog database called qanary using the following command:

stardog-admin db create -n qanary

Testing the Stardog Triplestore: To check that it is working, log in to the triplestore. By default, Stardog is available at http://localhost:5820; open this URL in your browser. You will be asked for a username and password, which default to admin / admin. After logging in, you should see a database called qanary.

Step 2: Preparing the Qanary Framework and Running the Default QA System

Step 2.1: Build of Qanary Framework

Clone the project from the GitHub repository:

git clone https://github.com/WDAqua/Qanary

Then execute the Maven build process (within the Qanary folder):

mvn install  -DskipDockerBuild

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] qanary-component-archetype ......................... SUCCESS [  1.161 s]
[INFO] qa.pipeline ........................................ SUCCESS [  2.992 s]
[INFO] qa.component ....................................... SUCCESS [  0.430 s]
[INFO] qald.evaluator ..................................... SUCCESS [  0.761 s]
[INFO] mvn.reactor ........................................ SUCCESS [  0.012 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

The install goal will compile, test, and package the Qanary framework code and then copy it into the local dependency repository. Thereafter the Qanary component foundations are available as well as the Qanary pipeline template.

Note: The above command builds the project without generating the corresponding Docker containers. This is done here due to time restrictions.

Step 2.2: Executing the Qanary Template System

To start the QA process, first run the main QA pipeline. The predefined implementation is named qanary_pipeline-template.

Run the following command to start a default Qanary pipeline within a terminal window (within the Qanary folder):

java -jar qanary_pipeline-template/target/qa.pipeline-X.Y.Z.jar

Here, X, Y, and Z refer to the current version of the Qanary pipeline template.

Note: On Java 9 or 10 you need to run: java -jar --add-modules=java.se.ee qanary_pipeline-template/target/qa.pipeline-X.Y.Z.jar (the java.se.ee module was removed in Java 11, so newer JVMs reject this flag).
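Because the version in the JAR name changes between releases, it can be convenient to look the file up instead of typing X.Y.Z by hand. The following is a minimal Python sketch, assuming the build left exactly one matching JAR in the target folder; the helper names find_jar and java_command are hypothetical and not part of Qanary.

```python
from pathlib import Path


def find_jar(target_dir, prefix):
    """Return the single JAR in target_dir whose file name starts with prefix."""
    # Precaution: ignore source/javadoc JARs in case the build produced any.
    candidates = sorted(
        p for p in Path(target_dir).glob(prefix + "*.jar")
        if not p.name.endswith(("-sources.jar", "-javadoc.jar"))
    )
    if len(candidates) != 1:
        raise FileNotFoundError(
            "expected exactly one %s*.jar in %s, found %d"
            % (prefix, target_dir, len(candidates))
        )
    return candidates[0]


def java_command(jar_path, extra_jvm_args=()):
    """Build the argument list for: java [extra JVM args] -jar <jar_path>."""
    return ["java", *extra_jvm_args, "-jar", str(jar_path)]
```

For example, java_command(find_jar("qanary_pipeline-template/target", "qa.pipeline-")) yields a list that can be passed to subprocess.run from within the Qanary folder.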

Once the server is running, you can verify this by opening the web-based admin interface of the integrated Spring Boot Admin Server, which is available by default at http://127.0.0.1:8080/.

The default Qanary system now runs as a server, so keep this terminal open.

Step 3: Preparing the Qanary Question Answering Components

Step 3.1: Build of Qanary components

Clone the repository of Qanary question answering components:

git clone https://github.com/WDAqua/Qanary-question-answering-components

30+ question answering components based on the Qanary framework are now available locally.

Due to time restrictions, we now build only the three components required for this tutorial as JAR files (using our Maven tutorial profile), again without creating the corresponding Docker containers. Run the build within the Qanary-question-answering-components folder:

mvn install -DskipDockerBuild -P tutorial

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] 
[INFO] qanary_component-NED-Ambiverse ..................... SUCCESS [  4.588 s]
[INFO] qa.qanary_component-DiambiguationProperty-OKBQA .... SUCCESS [  0.734 s]
[INFO] qa.qanary_component-QueryBuilder ................... SUCCESS [  1.889 s]
[INFO] mvn.reactor ........................................ SUCCESS [  0.018 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------

Note: You can build all components including the Docker containers using the command mvn install (i.e., without -DskipDockerBuild and without the tutorial profile).

Step 3.2: Run the Tutorial Components

We suggest starting the created JAR files in separate terminals.

Terminal 1: Start the JAR of the Ambiverse NED component:

java -jar qanary_component-NED-Ambiverse/target/qanary_component-NED-Ambiverse-X.Y.Z.jar

Terminal 2: Start the Relation Linking component named DisambiguationProperty-OKBQA:

java -jar qa.qanary_component-DiambiguationProperty-OKBQA/target/qa.qanary_component-DiambiguationProperty-OKBQA-X.Y.Z.jar

Terminal 3: Start the created component QueryBuilder using the following command:

java -jar qa.qanary_component-QueryBuilder/target/qa.qanary_component-QueryBuilder-X.Y.Z.jar

Note: On Java 9 or 10 you might need to add --add-modules=java.se.ee, e.g., java -jar --add-modules=java.se.ee qa.qanary_component-QueryBuilder/target/qa.qanary_component-QueryBuilder-X.Y.Z.jar. The java.se.ee module was removed in Java 11, so newer JVMs reject this flag.

After a few seconds, the components register themselves with the Qanary pipeline service started previously. To check this, open http://127.0.0.1:8080/ (default configuration) in a browser. You will see the Spring Boot Admin Server web interface showing all registered components. There should be 3 by now.

Note: If you changed the port or server of your Qanary system (see Step 2), then you also need to set this configuration for the Qanary components, e.g., by editing the corresponding application.properties of each component to be executed.

Step 4: Run a QA pipeline and Analyze the Created Information

Step 4.1: Execute a trivial pipeline using a trivial question

Now go to our trivial Web UI for testing the functionality: http://localhost:8080/startquestionansweringwithtextquestion

You will see the three components you started in Step 3 in a list. Select and, if necessary, re-order the components such that QueryBuilder is the last component. Enter the question "Name the municipality of Roberto Clemente Bridge." and hit the button start QA process provided by Qanary.
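The ordering rule above (QueryBuilder last) can also be expressed in a script that prepares the same request as the Web UI. This is only a sketch: the form field names question and componentlist[] are an assumption not stated in this tutorial, and the helper names are hypothetical, so check both against your Qanary version before relying on them.

```python
from urllib.parse import urlencode


def move_last(components, last):
    """Return the components in their given order, with `last` moved to the end."""
    if last not in components:
        raise ValueError("%r is not among the selected components" % last)
    return [c for c in components if c != last] + [last]


def build_form_body(question, components):
    """Encode the question and the ordered component list as a form body.

    ASSUMPTION: the field names 'question' and 'componentlist[]' are not
    taken from this tutorial; verify them against your Qanary version.
    """
    fields = [("question", question)]
    fields += [("componentlist[]", name) for name in components]
    return urlencode(fields)
```

The component names passed in must be the ones shown in the list of registered components; the resulting body would be sent as a POST request to http://localhost:8080/startquestionansweringwithtextquestion.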

If everything worked, you get a response in a JSON representation. It contains a link to the endpoint as well as a graphID, which is stored in the property ingraph.

Note: As the Ambiverse NED and DisambiguationProperty-OKBQA components call external services, you need an active Internet connection.
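If you want to pick the two values out of the response programmatically, a small helper is enough. The sketch below assumes the response is a flat JSON object with the properties endpoint and ingraph; it compares keys case-insensitively because the exact capitalisation may differ between Qanary versions (an assumption). The function name is hypothetical.

```python
import json


def extract_endpoint_and_graph(response_text):
    """Return (endpoint, graph_id) from the JSON text returned by the pipeline."""
    data = json.loads(response_text)
    # Lower-case the keys so that 'ingraph' and 'inGraph' are both accepted.
    lowered = {key.lower(): value for key, value in data.items()}
    missing = [k for k in ("endpoint", "ingraph") if k not in lowered]
    if missing:
        raise KeyError("response lacks expected properties: " + ", ".join(missing))
    return lowered["endpoint"], lowered["ingraph"]
```

The returned graph ID is the value to substitute for graphID in the SPARQL queries of Step 4.2.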

Step 4.2: Analyze the created information

Switch to the endpoint (in your browser). Click on "query" and run the following SPARQL query while reusing the graphID from the previous step:

SELECT * FROM <graphID> WHERE { ?s ?p ?o . }
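In the query above, graphID is a placeholder for the graph IRI from Step 4.1. A minimal sketch of the substitution, with a check that the value can legally appear between the angle brackets of a SPARQL IRI reference; the function name is hypothetical:

```python
import re

# Characters that SPARQL does not allow inside an IRI reference <...>.
_FORBIDDEN = re.compile(r'[<>"{}|^`\\\x00-\x20]')


def all_triples_query(graph_id):
    """Return the 'list all triples' query for the given graph ID."""
    if not graph_id or _FORBIDDEN.search(graph_id):
        raise ValueError("not a usable graph IRI: %r" % graph_id)
    return "SELECT * FROM <%s> WHERE { ?s ?p ?o . }" % graph_id
```

Rejecting such characters up front avoids pasting a value that already carries its own angle brackets or stray whitespace into the query.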

Now you see all the information created by the calls to the 3 Qanary components while analyzing the question "Name the municipality of Roberto Clemente Bridge.". The earlier pages of the query result set contain basic vocabulary definitions, whereas the last page shows the specific information. Alternatively, you can use dedicated SPARQL queries to retrieve specific information, like the following ones.

The following SPARQL query shows that 4 annotations (including the time of annotation) were created by the 3 components:

PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT * FROM <graphID> { 
    ?s oa:annotatedBy ?activeComponentsDuringTheProcessingOfTheQuestion . 
    ?s oa:annotatedAt ?time . 
}
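If the endpoint returns the result in the standard SPARQL 1.1 Query Results JSON Format (an assumption about how you request it), the annotations can be grouped per component with a few lines of Python. The variable names match the query above; the function name is hypothetical.

```python
import json
from collections import defaultdict


def annotations_by_component(results_json):
    """Map each annotating component to the list of its annotation times."""
    data = json.loads(results_json)
    grouped = defaultdict(list)
    # Each binding is one result row; every bound variable has a 'value' entry.
    for row in data["results"]["bindings"]:
        component = row["activeComponentsDuringTheProcessingOfTheQuestion"]["value"]
        grouped[component].append(row["time"]["value"])
    return dict(grouped)
```

With the result described above, the returned dictionary would have 3 keys whose lists contain 4 timestamps in total.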

The following SPARQL query shows that the annotation holding dbpedia:Roberto_Clemente_Bridge was created by the Ambiverse component and targets an oa:textselector (which points to the character positions 28-50).

PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT * FROM <graphID> { 
    ?s oa:annotatedBy <urn:qanary.NED#https://api.ambiverse.com/v2/entitylinking/analyze> . 
    ?s ?p ?o
}

The following SPARQL query retrieves a subgraph holding the created annotations together with their targets. From here you can explore further.

PREFIX oa: <http://www.w3.org/ns/openannotation/core/>
SELECT DISTINCT * FROM <graphID> { 
    ?s oa:annotatedBy <urn:qanary.NED#https://api.ambiverse.com/v2/entitylinking/analyze> . 
    ?s ?p ?o .
    ?s oa:hasTarget ?s2 .
    ?s2 ?p2 ?o2 .
}
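Instead of pasting queries into the browser, they can be sent to the endpoint from a script. The sketch below only prepares a request following the SPARQL 1.1 Protocol (form-encoded POST with a query parameter); it assumes that your endpoint accepts this form and, if credentials are passed, HTTP Basic authentication. The function name is hypothetical.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def sparql_request(endpoint, query, user=None, password=None):
    """Prepare (but do not send) a SPARQL 1.1 Protocol query request."""
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        # Ask for the standard JSON serialisation of SELECT results.
        "Accept": "application/sparql-results+json",
    }
    if user is not None:
        # ASSUMPTION: the endpoint uses HTTP Basic authentication.
        token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
        headers["Authorization"] = "Basic " + token.decode("ascii")
    body = urlencode({"query": query}).encode("ascii")
    # Supplying a body makes urllib issue a POST request.
    return Request(endpoint, data=body, headers=headers)
```

Once the triplestore is running, the prepared request can be sent with urllib.request.urlopen, using the endpoint link and graph ID obtained in Step 4.1.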

Conclusions and Next Steps

You have created a simple Question Answering system using the existing Qanary standard pipeline and the Qanary Question Answering components. It can now answer simple questions. As next steps, you should test further questions, check other components of the Qanary ecosystem, and implement your own component.

We are happy to answer any of your questions. Please do not hesitate to contact us.

Note: As the Qanary framework does not contain a complete UI at the moment, you might implement one yourself to cover your specific needs, or check out Trill.