The rdf2neo Converter
The rdf2neo converter expects its Spring configuration file to specify the Neo4j database that receives the data produced by the RDF mapping.
This is done by configuring a bean named neoDriver as an instance of the Neo4j driver class, as shown in the test/example files.
Note that these files use a specific subclass of ConfigItem, which adds support for Neo4j indexing (see below).
You can refer to system properties when defining the Neo4j connection parameters, a technique we use in one of our rdf2neo-based tools:
<bean id = "neoDriver"
      class = "org.neo4j.driver.GraphDatabase" factory-method = "driver"
      scope = "pgmakerSession">
  <constructor-arg value = "#{systemProperties[ 'neo4j.boltUrl'] ?: 'bolt://127.0.0.1:7687'}" />
  <constructor-arg>
    <bean class = "org.neo4j.driver.AuthTokens" factory-method = "basic">
      <constructor-arg value = "#{systemProperties[ 'neo4j.user'] ?: 'neo4j' }" />
      <constructor-arg value = "#{systemProperties[ 'neo4j.password'] ?: 'test' }" />
    </bean>
  </constructor-arg>
</bean>
This way, you can inject Java properties via JVM options:
# This variable is always understood by the JVM
export JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -Dneo4j.boltUrl='bolt://someserver.in.the.net:6787'"
export JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -Dneo4j.user='myuser'"
export JAVA_TOOL_OPTIONS="$JAVA_TOOL_OPTIONS -Dneo4j.password='mysecretpass'"
This can be easier when you run rdf2neo (or other rdf2pg tools) from other scripts and against multiple servers. Environment variables are an alternative approach.
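As a rough sketch of the multi-server case, a wrapper script could rebuild JAVA_TOOL_OPTIONS for each target before launching the converter. The function name, host names and credentials below are hypothetical; only the three property names come from the bean definition above. The rdf2neo command line is deliberately left as a placeholder.

```shell
#!/usr/bin/env bash

# Builds the -D options read by the neoDriver bean shown above.
# The property names come from that bean; the function name and the
# server list below are hypothetical.
neo4j_jvm_opts() {
  local url="$1" user="$2" password="$3"
  printf -- "-Dneo4j.boltUrl='%s' -Dneo4j.user='%s' -Dneo4j.password='%s'" \
    "$url" "$user" "$password"
}

# Remember the options that were set before the loop, so that each
# iteration starts from the same base instead of accumulating -D flags.
base_opts="${JAVA_TOOL_OPTIONS:-}"

for url in bolt://dev.example.org:7687 bolt://test.example.org:7687; do
  export JAVA_TOOL_OPTIONS="$base_opts $(neo4j_jvm_opts "$url" myuser mysecretpass)"
  # Placeholder: replace this echo with your actual rdf2neo command line.
  echo "Target: $url"
done
```

Saving the initial value in base_opts keeps any options that were already set, while preventing settings from one server from leaking into the next run.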
By default, the only Cypher node/relation property that rdf2neo indexes is iri. You should decide which other properties to index in your application in order to optimise performance. Cypher indexes can be configured inside the ConfigItem, as shown in the example configs. An example SPARQL query is available that defines the property names to index. It uses a special syntax to specify whether a property belongs to nodes, nodes with a specific label, relations, or relations of a given type.
rdf2neo expects an empty database when it runs, or at least one that doesn't already contain the nodes and relations you are going to import from RDF (e.g., no nodes with the same URIs or the same labels).
You can run cleanup or pre/post-processing operations against Neo4j by invoking the Neo4j Cypher Shell from your own scripts, which can then invoke rdf2neo as well.
A faster way to reset a database is to delete its data files, but beware of known issues with this approach in recent Neo4j versions.
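For the Cypher Shell route, a cleanup helper might look like the minimal sketch below. The variable and function names are hypothetical, and the connection defaults simply mirror those in the Spring bean above. It assumes cypher-shell is on the PATH and relies on its -a (address), -u (user) and -p (password) options, with the statement passed as the final argument.

```shell
#!/usr/bin/env bash

# Connection settings; the defaults mirror those in the Spring bean above.
# The variable and function names are hypothetical.
NEO4J_URL="${NEO4J_URL:-bolt://127.0.0.1:7687}"
NEO4J_USER="${NEO4J_USER:-neo4j}"
NEO4J_PASSWORD="${NEO4J_PASSWORD:-test}"
# Overridable, so the script can be dry-run with CYPHER_SHELL=echo.
CYPHER_SHELL="${CYPHER_SHELL:-cypher-shell}"

# Runs one Cypher statement through the Cypher Shell.
run_cypher() {
  "$CYPHER_SHELL" -a "$NEO4J_URL" -u "$NEO4J_USER" -p "$NEO4J_PASSWORD" "$1"
}

# Deletes every node together with its relationships. This runs as a
# single transaction, so it is only suitable for small databases.
neo4j_cleanup() {
  run_cypher 'MATCH (n) DETACH DELETE n'
}
```

A wrapper script could call neo4j_cleanup first, then invoke rdf2neo, and finally use run_cypher for any post-processing statement.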