@comunica/query-sparql-link-traversal-solid does not work when querying someone else's public Pod #57

Closed
Maximvdw opened this issue Apr 21, 2022 · 5 comments

Comments

@Maximvdw
Contributor

Maximvdw commented Apr 21, 2022

Queries to public Solid Pods using @comunica/query-sparql-link-traversal will work perfectly fine.

The example can be easily tested using:
https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/default/

with the default "common friends of ..." query.

Issue
Using the exact same query on the exact same Pod with @comunica/query-sparql-link-traversal-solid returns no results.

The example can be easily tested using:
https://comunica.github.io/comunica-feature-link-traversal-web-clients/builds/solid-default/

The default query "Common friends..." will not return any results, regardless of whether you are logged in (as long as you are not logged in as the owner of the dataset). Since the dataset is public, as can be verified with the /default/ example, it is odd that /solid-default/ returns no results.

Expected behaviour
Intuitively, I would expect @comunica/query-sparql-link-traversal-solid to behave like @comunica/query-sparql-link-traversal, with the option to use the session of a logged-in user to access private containers.
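
To make the expected behaviour concrete, here is a minimal sketch of how a query context could be assembled so that the same query works both anonymously and with a logged-in session. The helper `buildQueryContext` and the `SessionLike` interface are hypothetical names introduced for illustration. The session context key is the one I understand Comunica's Solid packages to use, so check it against the package README before relying on it.

```typescript
// Hypothetical minimal shape of a Solid session; only the fields used below.
interface SessionLike {
  info: { isLoggedIn: boolean; webId?: string };
}

// Assumption: context key under which Comunica's Solid packages expect the session.
const SESSION_KEY = '@comunica/actor-http-inrupt-solid-client-authn:session';

// Hypothetical helper: builds a query context for a traversal query.
function buildQueryContext(seedUrl: string, session?: SessionLike): Record<string, unknown> {
  const context: Record<string, unknown> = {
    sources: [seedUrl], // seed document from which links are followed
    lenient: true,      // do not abort the whole query when one document fails
  };
  // Only attach the session when someone is actually logged in;
  // otherwise the query should run like the non-Solid engine.
  if (session && session.info.isLoggedIn) {
    context[SESSION_KEY] = session;
  }
  return context;
}

const anonymous = buildQueryContext('https://pod.example/alice/profile/card');
const authenticated = buildQueryContext('https://pod.example/alice/profile/card', {
  info: { isLoggedIn: true, webId: 'https://pod.example/bob/profile/card#me' },
});
```

The resulting object would be passed as the second argument to the engine's query method. The point of the sketch is only that the logged-out case falls back to a plain, unauthenticated context.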

@Maximvdw changed the title from "@comunica/query-sparql-link-traversal-solid does not work when querying someone elses Pod" to "@comunica/query-sparql-link-traversal-solid does not work when querying someone elses public Pod" on Apr 21, 2022
@rubensworks
Member

Yeah, the solid-specific config uses some different algorithms for traversal.
See https://github.com/comunica/comunica-feature-link-traversal/blob/master/engines/config-query-sparql-link-traversal/config/config-default.json vs https://github.com/comunica/comunica-feature-link-traversal/blob/master/engines/config-query-sparql-link-traversal/config/config-solid-default.json

Everything is still very much in flux to figure out what techniques are useful, and what needs to be the "default".

Happy to hear suggestions though.

@Maximvdw
Contributor Author

Just to sync my thoughts - what is the reasoning behind the different algorithms instead of using the fetch function from the session to do the traversal with lenient mode?

@rubensworks
Member

> Just to sync my thoughts - what is the reasoning behind the different algorithms instead of using the fetch function from the session to do the traversal with lenient mode?

The different algorithms (can) also use the authenticated fetch (with lenient mode).

The problem with link traversal is that there's a large number of links that could be followed, and the difficulty lies in the question of what links to follow, and in what order, because this has a huge impact on query performance.
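
As a rough illustration of why the choice of links matters, here is a toy model of a traversal loop. It is not Comunica's implementation: the in-memory `Web` map, the `shouldFollow` policy, and the `maxDocuments` cap are all hypothetical stand-ins for real HTTP fetching, link extraction actors, and queue limits.

```typescript
// Toy model: the "web" maps a document URL to the links found inside it.
type Web = Record<string, string[]>;

interface TraversalOptions {
  // Hypothetical policy: decides whether a discovered link is queued at all.
  shouldFollow: (link: string) => boolean;
  // Hypothetical cap on the number of documents that are dereferenced.
  maxDocuments: number;
}

// Returns the documents in the order they were visited.
function traverse(seed: string, web: Web, options: TraversalOptions): string[] {
  const queue: string[] = [seed];
  const seen = new Set<string>([seed]);
  const visited: string[] = [];

  while (queue.length > 0 && visited.length < options.maxDocuments) {
    const url = queue.shift() as string; // FIFO queue, so breadth-first order
    visited.push(url);
    // A missing entry models a document that could not be fetched;
    // it simply contributes no links instead of failing the traversal.
    for (const link of web[url] ?? []) {
      if (!seen.has(link) && options.shouldFollow(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return visited;
}

const web: Web = {
  'https://alice.example/profile': [
    'https://bob.example/profile',
    'https://alice.example/posts',
    'https://ads.example/tracker',
  ],
  'https://bob.example/profile': ['https://carol.example/profile'],
  'https://alice.example/posts': [],
  'https://carol.example/profile': [],
};

// Following every link visits all five URLs, including the irrelevant tracker.
const followAll = traverse('https://alice.example/profile', web, {
  shouldFollow: () => true,
  maxDocuments: 10,
});

// A stricter policy visits only the three profile documents.
const profilesOnly = traverse('https://alice.example/profile', web, {
  shouldFollow: link => link.endsWith('/profile'),
  maxDocuments: 10,
});
```

Even in this tiny graph the policy changes the number of requests from five to three. On the real Web the gap between a permissive and a selective policy is far larger, which is why the default is hard to pick.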

(I recently wrote a blog post about this, in case you're interested in this area: https://www.rubensworks.net/blog/2022/01/21/querying-a-decentralized-web/#the-problems-of-link-traversal)

@Maximvdw
Contributor Author

So I assume the idea behind the Solid-specific implementation is to query information 'relevant' to the Pod without traversing too far outside the Pod itself (which becomes an issue when you query someone else's Pod).

I will close the issue, as #52 most likely covers this topic. For those interested: for my personal use case, the default link traversal config combined with the Solid base config (for the authenticated fetch from Solid) seems to do the trick in both logged-in and logged-out scenarios without too many performance issues.

{
  "@context": [
    "https://linkedsoftwaredependencies.org/bundles/npm/@comunica/config-query-sparql/^2.0.0/components/context.jsonld",
    "https://linkedsoftwaredependencies.org/bundles/npm/@comunica/config-query-sparql-link-traversal/^0.0.0/components/context.jsonld"
  ],
  "import": [
    "ccqslt:config/config-solid-base.json",
    "ccqslt:config/extract-links/actors/content-policies-conditional.json",
    "ccqslt:config/extract-links/actors/quad-pattern-query.json",
    "ccqslt:config/rdf-resolve-hypermedia-links/actors/traverse-replace-conditional.json",
    "ccqslt:config/rdf-resolve-hypermedia-links-queue/actors/wrapper-limit-count.json"
  ]
}

@rubensworks
Member

> So I assume the idea behind the Solid-specific implementation is to query information 'relevant' to the Pod without traversing too far outside the Pod itself (which becomes an issue when you query someone else's Pod).

Yes, that is the idea (for the moment at least, might change in the future).
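
For readers wondering what "staying inside the Pod" might look like in its simplest form, here is a sketch of a link filter scoped to a Pod root. The helper `isInsidePod` is hypothetical, and the string-prefix check is a deliberately naive assumption. The actual Solid config relies on its own set of link extraction actors, which this does not reproduce.

```typescript
// Hypothetical helper: true when `link` is the Pod root or lies underneath it.
function isInsidePod(link: string, podRoot: string): boolean {
  // Normalize the root to end with a slash so that
  // "https://pod.example/alice-other/" is not mistaken for part of ".../alice/".
  const root = podRoot.endsWith('/') ? podRoot : podRoot + '/';
  return link + '/' === root || link.startsWith(root);
}

const podRoot = 'https://pod.example/alice';
const discoveredLinks = [
  'https://pod.example/alice/profile/card',
  'https://pod.example/alice-other/profile/card',
  'https://elsewhere.example/bob/profile/card',
];

// Only links inside Alice's Pod survive the filter.
const linksToFollow = discoveredLinks.filter(link => isInsidePod(link, podRoot));
```

A filter like this keeps traversal cheap, but it also shows why a query such as "common friends" can come back empty: the friends' profiles live in other Pods, and those links are never followed.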
