This demonstrator showcases the capabilities of the RML+Solid pipeline presented in the paper 'Extending RML to Support Permissioned Data Sharing'.
The RML+Solid pipeline takes heterogeneous data sources and an extended RML mapping as input. The following RML extensions are included in the extended RML mapping:
The extended RML mapping is executed with RMLMapper v7.2.0 as RML processor.
The extended RML mapping defines logical targets for resources on a Solid pod hosted by a Community Solid Server v7.1.3.
The demonstrator simulates how three manufacturers share their data with ten users, each with distinct access rights to the manufacturers' data.
Per manufacturer, we created
(i) source data about products and their properties:
./manufacturer1/data/,
./manufacturer2/data/, and
./manufacturer3/data/;
(ii) a CSV file to manage the access control:
./manufacturer1/data/read_access.csv,
./manufacturer2/data/read_access.csv, and
./manufacturer3/data/read_access.csv;
(iii) an extended RML mapping:
./manufacturer1/mapping.rml.ttl,
./manufacturer2/mapping.rml.ttl and
./manufacturer3/mapping.rml.ttl;
and (iv) a Solid pod hosted on a Community Solid Server:
./CommunitySolidServer/pods/manufacturer1/,
./CommunitySolidServer/pods/manufacturer2/, and
./CommunitySolidServer/pods/manufacturer3/.
For the users, who get read access to selected parts of the manufacturers' data, we created ten additional Solid pods: ./CommunitySolidServer/pods/userX/ .
The authentication details for the manufacturers and users adhere to the following pattern (replace X with the manufacturer's or user's number):
email | password | webId | oidcIssuer |
---|---|---|---|
[email protected] | abc123 | http://localhost:3000/manufacturerX/profile/card#me | http://localhost:3000/ |
[email protected] | abc123 | http://localhost:3000/userX/profile/card#me | http://localhost:3000/ |
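The webId pattern in the table can be expanded with a small shell helper. This is a minimal sketch: the function name `web_id` and the variable `OIDC_ISSUER` are hypothetical and not part of the repository, and the email column is deliberately left out.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository) that expands the
# webId pattern from the table for a given account.

# The OIDC issuer is the same for all accounts in this demonstrator.
OIDC_ISSUER="http://localhost:3000/"

# web_id <role> <number>, where role is "manufacturer" or "user".
web_id() {
  local role="$1" number="$2"
  echo "${OIDC_ISSUER}${role}${number}/profile/card#me"
}

web_id manufacturer 2   # http://localhost:3000/manufacturer2/profile/card#me
web_id user 7           # http://localhost:3000/user7/profile/card#me
```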
With the RML+Solid pipeline, we execute the extended RML mappings to convert the source data and access control data to RDF data and to publish the RDF data on the Solid pods of the manufacturers.
- a bash shell
- Java version 17 (We tested with OpenJDK v17.0.2)
- Docker Engine
- Node (We tested with Node v20.00.0)
To avoid any library conflicts, especially with Comunica, we start the Community Solid Server in a Docker container. To start the Community Solid Server, run the following commands:
cd ./CommunitySolidServer
docker run --name CSS --rm -v $(pwd)/config:/config -v $(pwd)/pods:/pods -p 3000:3000 solidproject/community-server -c /config/file.json --seedConfig /config/seeded-pod-config.json -f /pods
Note for Windows users: $(pwd) does not work out of the box to get the current working directory. Here are a few alternatives:
- MinGW / Git Bash: use /$(pwd)
- Windows command line (cmd): %cd%
- PowerShell: ${PWD}
Example for PowerShell:
docker run --name CSS --rm -v ${PWD}/config:/config -v ${PWD}/pods:/pods -p 3000:3000 solidproject/community-server -c /config/file.json --seedConfig /config/seeded-pod-config.json -f /pods
The configuration of the Solid pods can be adapted in this file: ./CommunitySolidServer/config/seeded-pod-config.json.
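The server needs a moment to start before it accepts requests. The sketch below polls the server until it answers; the function name `wait_for_server` and its default values are assumptions for illustration and are not part of the repository.

```bash
#!/usr/bin/env bash
# Hypothetical helper: poll the server until it answers or the attempts run out.
# wait_for_server [url] [max_attempts] [delay_seconds]
wait_for_server() {
  local url="${1:-http://localhost:3000/}"
  local max_attempts="${2:-30}"
  local delay="${3:-2}"
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # curl exits non-zero while the connection is refused;
    # any HTTP response, whatever its status, counts as "up".
    if curl --silent --output /dev/null "$url"; then
      echo "Server is up after ${attempt} attempt(s)."
      return 0
    fi
    sleep "$delay"
  done
  echo "Server did not respond at ${url}." >&2
  return 1
}
```

Calling `wait_for_server` in a second terminal before executing the mappings avoids running RMLMapper against a server that is still booting.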
This repository contains the state of the Solid pods after executing the RML mappings.
This allows us to refer to specific resources on the Solid pods to explain the features of the RML+Solid pipeline.
To restart from scratch, stop the Docker container, delete the content of the folder ./CommunitySolidServer/pods, restart the Community Solid Server, and execute the RML mappings with RMLMapper.
In a new terminal:
docker stop CSS
cd ./CommunitySolidServer
rm -r ./pods/
docker run --name CSS --rm -v $(pwd)/config:/config -v $(pwd)/pods:/pods -p 3000:3000 solidproject/community-server -c /config/file.json --seedConfig /config/seeded-pod-config.json -f /pods
- Download RMLMapper v7.2.0 as rmlmapper.jar in this folder.
- In a new terminal, execute the extended RML mappings of the three manufacturers (this may take some minutes).
cd ./manufacturer1
echo 'Executing mapping manufacturer1...'
java -jar ../rmlmapper.jar -m mapping.rml.ttl -d
cd ../manufacturer2
echo 'Executing mapping manufacturer2...'
java -jar ../rmlmapper.jar -m mapping.rml.ttl -d
cd ../manufacturer3
echo 'Executing mapping manufacturer3...'
java -jar ../rmlmapper.jar -m mapping.rml.ttl -d
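The three invocations above differ only in the folder name, so they can be folded into a loop. A minimal sketch, assuming rmlmapper.jar sits in the repository root as in the commands above; the function name `run_mappings` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run the extended RML mapping of each manufacturer in turn.
# run_mappings [repository-root]   (defaults to the current directory)
run_mappings() {
  local root="${1:-.}"
  local m
  for m in manufacturer1 manufacturer2 manufacturer3; do
    echo "Executing mapping ${m}..."
    # The subshell keeps the caller's working directory unchanged.
    # Stop at the first mapping that fails.
    (cd "${root}/${m}" && java -jar ../rmlmapper.jar -m mapping.rml.ttl -d) || return 1
  done
}
```

Running `run_mappings` from the repository root is then equivalent to the nine lines above, except that it stops as soon as one mapping fails.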
The content of the Solid pods after executing the RML+Solid pipeline can be inspected easily in the backend of the Community Solid Server: ./CommunitySolidServer/pods/manufacturer1, ./CommunitySolidServer/pods/manufacturer2, and ./CommunitySolidServer/pods/manufacturer3.
The content of Solid resources can be accessed with standardized HTTP GET requests.
However, to access permissioned resources (i.e. resources without public access), authentication is needed before executing the HTTP request.
For this demonstrator, we provide a JavaScript application, ./CSS-Getter, that facilitates the execution of authenticated HTTP GET requests to a Community Solid Server.
To install the dependencies for this application run:
cd ./CSS-Getter
npm i
RML ensures semantic interoperability: mapping the source data to a shared RDF ontology gives all parties a common understanding of the data.
Manufacturer1 maps source files ./manufacturer1/products.csv and ./manufacturer1/products2.json to resource http://localhost:3000/manufacturer1/products.
Manufacturer2 maps source file ./manufacturer2/articles.xml to resource http://localhost:3000/manufacturer2/articles.
Manufacturer3 maps source files ./manufacturer3/products.csv and ./manufacturer3/products2.json to resource http://localhost:3000/manufacturer3/products.
All three manufacturers map their source data to the same ontology, i.e. they use the same RDF terms to express the concepts and properties.
Manufacturer1 and Manufacturer3 additionally map their source data to another ontology: products-other-ontology and products-other-ontology.
A Solid resource can be accessed with an HTTP request. The example below shows how a resource with public read access can be accessed.
curl -H "Accept: application/n-triples" "http://localhost:3000/manufacturer2/articles"
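To check at a glance whether a resource is publicly readable, the status code of an unauthenticated request can be inspected instead of the response body. The helpers `http_status` and `is_public` below are hypothetical and not part of the repository. The sketch assumes that a publicly readable resource answers with 200, and that a restricted resource answers with an error status such as 401 or 403.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print only the HTTP status code of an unauthenticated GET.
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' \
       --header "Accept: application/n-triples" "$1"
}

# is_public <url>: succeeds if the resource is readable without authentication.
is_public() {
  [ "$(http_status "$1")" = "200" ]
}
```

For example, `is_public http://localhost:3000/manufacturer2/articles && echo "public"` prints "public" only when the unauthenticated request succeeds.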
When access to a resource is restricted to selected users, an authenticated HTTP GET request is needed.
Each manufacturer has read access to all of its resources.
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/manufacturer1/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/products
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/manufacturer1/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/products-a
Other users have access to selected resources, e.g. user1 has read access to http://localhost:3000/manufacturer1/product-10001
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/user1/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/product-10001
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/user4/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/product-10001-1
The manufacturers manage the read access to the resources on their Solid pods in a locally stored CSV file, e.g. ./manufacturer1/data/read_access.csv. RML maps the data in these CSV files to Access Control List (ACL) rules. Linked HTTP requests allow RML to publish these ACL rules as an ACL resource on the manufacturer's Solid pod, linked to the resource to which they apply, e.g. http://localhost:3000/manufacturer1/product-10002.acl
Only users with read access rights can retrieve the content of a resource.
User1 has read access to http://localhost:3000/manufacturer1/product-10002
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/user1/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/product-10002
User2 has no access to http://localhost:3000/manufacturer1/product-10002
node ./CSS-Getter/getResource.js [email protected] password=abc123 webId=http://localhost:3000/user2/profile/card#me oidcIssuer=http://localhost:3000/ absoluteURI=http://localhost:3000/manufacturer1/product-10002
The data can be exposed at any granularity. We included the following five examples in our setup:
- one resource including all properties of all products, e.g. http://localhost:3000/manufacturer1/products
- two resources, dividing the information of all products based on two categories of product properties, e.g. http://localhost:3000/manufacturer1/products-a
- one resource per product, including all properties of that product, e.g. http://localhost:3000/manufacturer1/product-10001
- two resources per product, dividing the product information based on two categories of product properties, e.g. http://localhost:3000/manufacturer1/product-10001-a
- one resource per property per product, e.g. http://localhost:3000/manufacturer1/product-10001-1
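The five granularities follow a regular naming scheme, which the sketch below rebuilds from the example URIs in the list. The function names `products_uri` and `product_uri` and the variable `BASE` are hypothetical and not part of the repository; the scheme itself is inferred from the listed examples only.

```bash
#!/usr/bin/env bash
# Hypothetical helpers that rebuild the example URIs listed above.
BASE="http://localhost:3000"

# products_uri <manufacturer> [category]
#   all products, optionally limited to one category of properties (e.g. "a").
products_uri() {
  local manufacturer="$1" category="${2:-}"
  echo "${BASE}/${manufacturer}/products${category:+-${category}}"
}

# product_uri <manufacturer> <product-id> [part]
#   one product, optionally limited to a category ("a") or one property ("1").
product_uri() {
  local manufacturer="$1" id="$2" part="${3:-}"
  echo "${BASE}/${manufacturer}/product-${id}${part:+-${part}}"
}

products_uri manufacturer1            # http://localhost:3000/manufacturer1/products
products_uri manufacturer1 a          # http://localhost:3000/manufacturer1/products-a
product_uri  manufacturer1 10001      # http://localhost:3000/manufacturer1/product-10001
product_uri  manufacturer1 10001 a    # http://localhost:3000/manufacturer1/product-10001-a
product_uri  manufacturer1 10001 1    # http://localhost:3000/manufacturer1/product-10001-1
```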
Our demo includes examples of overlapping views:
- http://localhost:3000/manufacturer1/products;
- http://localhost:3000/manufacturer1/product-10001;
- http://localhost:3000/manufacturer1/product-10001-1.
Our demo includes examples of disjoint views:
- http://localhost:3000/manufacturer1/products-a versus http://localhost:3000/manufacturer1/products-b;
- http://localhost:3000/manufacturer1/product-10001 versus http://localhost:3000/manufacturer1/product-10002;
- http://localhost:3000/manufacturer1/product-10001-1 versus http://localhost:3000/manufacturer1/product-10001-2.
RML handles heterogeneous source data, with logical sources to describe the access to the sources, and expressions to extract the values from the sources.
The source data of manufacturer1 and of manufacturer3 is spread over two files with heterogeneous file formats (./manufacturer1/products.csv and ./manufacturer1/products2.json, ./manufacturer3/products.csv and ./manufacturer3/products2.json) and heterogeneous labels (e.g. ProductID and product_id).
Manufacturer2 has one source file (./manufacturer2/articles.xml) with its own labels (e.g. articlenumber).
Once the RML mapping is created, the RML+Solid pipeline generates the RDF data and the corresponding resources on the manufacturer's Solid pod with one command: java -jar [path-to-rmlmapper.jar] -m [path-to-rmlmapping] -d
Updates in the source data are handled with a new pipeline run.
In our demonstrator the third manufacturer has organized his source data in a similar way as the first manufacturer. He reuses the RML mapping of the first manufacturer, updating it only with the base URI of his Solid pod, i.e. and with his authentication info.
Requirement | Morph-LDP | LDP-DL | RML+SOLID |
---|---|---|---|
R1 Semantic interoperability | ✓ | ✓ | ✓ |
R2 Technical interoperability: read access | ✓ | ✓ | ✓ |
R3 Access control | ✓ | ||
R4 Flexible design | ✓ | ✓ | |
R5 Heterogeneous data sources | ✓ | ✓ | |
R6 Automated generation | ✓ | ✓ | ✓ |
R7 Reusable design | ✓ | ✓ | ✓ |
R8 On-the-fly-generation | ✓ | ✓ | |
R9 Technical interoperability: write access | ✓ |