General query engine interface #5
Comments
Optional second argument, defaulting to |
I would want to implement this interface with m-ld's Javascript engine, so that it can operate in environments that use SPARQL queries directly. I have implemented my own interface, which looks like this: Note the dependency on the sparqlalgebrajs types – I would prefer not to have to pass strings, so there may be a need to have an interface package extracted from sparqlalgebrajs. I would also prefer that the interface allowed a store to be quite explicit about which queries it supports (e.g. Construct but not Describe). |
Does it have to be complicated much? I've been working with @bergos' sparql-http-client and I think it's just about right. It comes in two forms, both of which have methods
declare module 'sparql-http-client/StreamClient' {
  class StreamClient {
    query: {
      select(query: string): Promise<EventEmitter>
      construct(query: string): Promise<import('rdf-js').Stream>
      ask(query: string): Promise<boolean>
      update(query: string): Promise<void>
    }
  }
}
declare module 'sparql-http-client/ParsingClient' {
  class ParsingClient {
    query: {
      select(query: string): Promise<Array<Record<string, import('rdf-js').Term>>>
      construct(query: string): Promise<import('rdf-js').Dataset>
      ask(query: string): Promise<boolean>
      update(query: string): Promise<void>
    }
  }
}
The only change that I would make above is for the
class StreamClient {
  query: {
-   select(query: string): Promise<EventEmitter>
-   construct(query: string): Promise<import('rdf-js').Stream>
+   select(query: string): EventEmitter
+   construct(query: string): import('rdf-js').Stream
  }
} |
I think I see the point you're making but doing this would imply leaking aspects of the query language into the query interface. It would be equivalent to having separate methods for |
Database drivers (SDKs?) very much do have a similar distinction, and there is nothing wrong with it. Think
I'm curious about this statement. In what scenarios is the desired kind of result (tabular, graph or boolean) not known?
What is the nature of this coupling? Arguably, the whole RDF stack is built on uniformity and standards. The RDF graph is the same graph in every software component. SPARQL, being a similarly important core standard, can act the same. Otherwise I read your comment as an invitation to build (IMO unnecessary) abstractions |
I've created a new issue to follow up on the discussion of query methods: #6 |
I might also mention my lib @tpluscode/sparql-builder. Right now it does rely on |
Ah yes indeed, libs that depend on engines are also relevant to include here. In that respect, the following libs may also be relevant:
|
I'd suggest starting with the SPARQL Algebra (but be willing to depart from it as use cases indicate). You can do a lot with it (like all of SPARQL) but it is simpler than SPARQL. For instance, the idiosyncrasies of a SolutionModifier's GROUP BY and HAVING reuse aggregation and filter. You may also opt to lop off large parts of it, but having a set of composable operations should be familiar to programmers. |
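To illustrate what "a set of composable operations" could look like, here is a minimal sketch in TypeScript. The `Operation` type, `matchTerm`, and `evaluate` are hypothetical names made up for this example; they are not the sparqlalgebrajs types, and only a single-pattern BGP, filter, and project are covered.

```typescript
// Hypothetical, heavily simplified algebra; not the sparqlalgebrajs API.
type Triple = { s: string; p: string; o: string };
type Solution = Map<string, string>;
type PatternTerm = { variable: string } | { value: string };

type Operation =
  | { type: 'bgp'; pattern: { s: PatternTerm; p: PatternTerm; o: PatternTerm } }
  | { type: 'filter'; input: Operation; test: (solution: Solution) => boolean }
  | { type: 'project'; input: Operation; variables: string[] };

// Checks a constant, or binds/checks a variable, for one triple position.
function matchTerm(term: PatternTerm, value: string, solution: Solution): boolean {
  if ('value' in term) return term.value === value;
  const bound = solution.get(term.variable);
  if (bound !== undefined) return bound === value;
  solution.set(term.variable, value);
  return true;
}

// Evaluates an operation tree bottom-up over an in-memory list of triples.
function evaluate(op: Operation, data: Triple[]): Solution[] {
  switch (op.type) {
    case 'bgp': {
      const { s, p, o } = op.pattern;
      const solutions: Solution[] = [];
      for (const triple of data) {
        const solution: Solution = new Map();
        if (
          matchTerm(s, triple.s, solution) &&
          matchTerm(p, triple.p, solution) &&
          matchTerm(o, triple.o, solution)
        ) {
          solutions.push(solution);
        }
      }
      return solutions;
    }
    case 'filter':
      return evaluate(op.input, data).filter(op.test);
    case 'project':
      return evaluate(op.input, data).map((solution) => {
        const projected: Solution = new Map();
        for (const name of op.variables) {
          const value = solution.get(name);
          if (value !== undefined) projected.set(name, value);
        }
        return projected;
      });
  }
}

const data: Triple[] = [
  { s: 'ex:alice', p: 'ex:age', o: '30' },
  { s: 'ex:bob', p: 'ex:age', o: '17' },
];

// project(filter(bgp)): roughly SELECT ?person WHERE { ?person ex:age ?age FILTER(?age >= 18) }
const adults: Operation = {
  type: 'project',
  variables: ['person'],
  input: {
    type: 'filter',
    test: (solution) => Number(solution.get('age')) >= 18,
    input: {
      type: 'bgp',
      pattern: { s: { variable: 'person' }, p: { value: 'ex:age' }, o: { variable: 'age' } },
    },
  },
};

const adultSolutions = evaluate(adults, data);
```

Because every operation takes another operation as input, an engine could choose to support only a subset of the `type` values and reject the rest.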
👍 |
That would actually depend on the outcome of #6. Because if we only expose a single method there, then the return type would vary based on the query. Since query type and return type may be determined async, we may require promises. |
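As a rough sketch of the single-method case described above, the return type could be a discriminated union behind a promise. Everything here is an assumption for illustration: the names `QueryResult`, `resultType`, `detectQueryForm`, and `describeResult` are invented, the engine is a stub, and the keyword-based form detection is deliberately naive (a real engine would use a SPARQL parser).

```typescript
type QueryForm = 'select' | 'construct' | 'describe' | 'ask';

// Discriminated union: callers inspect `resultType` to learn what they received.
type QueryResult =
  | { resultType: 'bindings'; rows: Array<Map<string, string>> }
  | { resultType: 'quads'; quads: string[] }
  | { resultType: 'boolean'; value: boolean };

// Naive detection, for illustration only: IRIs are dropped so that keywords
// inside <...> are ignored, then the first query-form keyword wins.
function detectQueryForm(query: string): QueryForm {
  const withoutIris = query.replace(/<[^>]*>/g, ' ');
  const match = /\b(select|construct|describe|ask)\b/i.exec(withoutIris);
  if (!match) throw new Error('Unsupported or unrecognized query form');
  return (match[1] ?? '').toLowerCase() as QueryForm;
}

// Stub engine: resolves to an empty placeholder result of the matching shape.
async function query(queryString: string): Promise<QueryResult> {
  switch (detectQueryForm(queryString)) {
    case 'select':
      return { resultType: 'bindings', rows: [] };
    case 'construct':
    case 'describe':
      return { resultType: 'quads', quads: [] };
    case 'ask':
      return { resultType: 'boolean', value: false };
  }
}

// The compiler narrows `result` in each branch based on `resultType`.
function describeResult(result: QueryResult): string {
  switch (result.resultType) {
    case 'bindings':
      return `bindings: ${result.rows.length} solution(s)`;
    case 'quads':
      return `quads: ${result.quads.length} quad(s)`;
    case 'boolean':
      return `boolean: ${result.value}`;
  }
}

query('ASK { ?s ?p ?o }').then((result) => console.log(describeResult(result)));
```

The promise is what lets the engine decide the result kind only after it has inspected (or parsed) the query.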
I've had a look at all the suggested libraries, and I've tried to create an overview of aspects that I feel may require some standardization. Once we agree upon a list of aspects, we can branch off into separate issues to see how we want to tackle the specifics of each one.
1. Query method interface
How to pass a query to a library, and obtain results. Discussion in #6.
Single method
All query forms are handled via a single method, possibly via method overloading or union types. Example:
Implemented by:
Form-based methods
Each query form has its own dedicated method. Example:
Implemented by:
Other
The following libraries follow another query interface, which seems to be use-case-specific, and may not benefit that much from standardization:
2. Representing bindings
How to represent the results of tabular queries such as
JSON-based
Example of a single bindings object:
Implemented by:
Object-based
A custom data structure that exposes methods and allows bindings to be stored internally in a different manner. Example of a single bindings object:
Implemented by:
3. Exposing metadata
Both at query level and at source level, it may be beneficial to expose metadata such as cardinality (estimates).
Dedicated method for obtaining metadata
Implemented by:
Generic object that provides metadata
Implemented by:
4. Serializing results
A method to serialize query results to a standard format, such as SPARQL JSON results.
Implemented by:
5. Defining sources
Some engines allow query sources to vary per query execution,
Implemented by:
6. Passing query as algebra
Instead of passing a query string to an engine, a (pre-optimized?) algebra object may be passed. Example:
Implemented by:
7. Defining query syntax format
If engines support different query syntaxes, they typically allow this to be customized via an optional argument. Example:
Implemented by:
|
@rubensworks thank you for this list, very useful. I think as long as we keep our comments short, we might be able to discuss all of these points in this thread without branching into separate issues, which makes it a lot harder to keep track of the general picture IMHO. Of course, we will need to branch out for any point that sparks significant discussion. My preferences:
1. Query method interface
Discussion in #6. My preference goes to single method + return type metadata.
2. Representing bindings
My preference goes to a JSON-based representation (simple objects).
3. Exposing metadata
My preference goes to a generic object that provides metadata. I find that this approach leads to easier and better optimization in terms of sharing computation between metadata and query results. Worth mentioning that this is starting to have significant overlap with the current
4. Serializing results
I would prefer not to standardize serialization in this spec.
5. Defining sources
Definitely in favor of this.
6. Passing query as algebra
Definitely in favor of this.
7. Defining query syntax format
Also discussed in #6. I have no need for anything other than SPARQL, but I defer to people working with multiple query languages on this one. |
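One possible reading of "sharing computation between metadata and query results" is a result object whose metadata and results draw on the same cached work. The sketch below is only that, a guess at the idea: `createResult`, `metadata()`, and `execute()` are hypothetical names, everything is synchronous for brevity, and the cardinality is an exact count obtained by materializing rather than an estimate.

```typescript
interface ResultMetadata {
  cardinality: number;
}

// Hypothetical shape: one object exposes both metadata and the results.
interface QueryResult<T> {
  metadata(): ResultMetadata;
  execute(): T[];
}

// Wraps a computation so that metadata() and execute() share a single run.
function createResult<T>(compute: () => T[]): QueryResult<T> {
  let cached: T[] | undefined;
  const materialize = (): T[] => {
    if (cached === undefined) cached = compute();
    return cached;
  };
  return {
    metadata: () => ({ cardinality: materialize().length }),
    execute: () => materialize(),
  };
}

let computeCalls = 0;
const result = createResult(() => {
  computeCalls += 1; // counts how often the underlying work actually runs
  return ['ex:alice', 'ex:bob'];
});

const cardinality = result.metadata().cardinality;
const rows = result.execute();
```

A real engine would more likely return an estimate from `metadata()` without materializing anything, and both calls would be asynchronous.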
Some of my notes from today's call with @gsvarovsky and @rubensworks...
1. Query method interface
@gsvarovsky pointed out that the
2. Representing bindings
@rubensworks explained the inherent risk of conflicts with native object properties when using bindings representations based on simple JavaScript objects. We discussed using an object-based representation with instance-level methods strictly related to reading bindings ( Open question: should we keep the
3. Exposing metadata
Related to point 1), we discussed the
4. Serializing results
We all agree that serialization should not fall within the scope of this spec.
5. Defining sources
Defining sources at query time allows query engines to be re-used across sources. This would be best modeled by passing sources as a parameter/option of the main query method:
6. Passing query as algebra
We considered basing the spec around two different query methods, one taking a SPARQL string and the other taking a SPARQL Algebra object. Implementors would be free to implement either/or.
7. Defining query syntax format
We all agree on keeping this spec SPARQL-based. |
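As a sketch of point 5 from these notes, sources could travel in a per-call context object so that one engine instance serves many sources. The names `ToyEngine`, `QueryContext`, and `sources` are assumptions for illustration, and a bare triple pattern stands in for a SPARQL query to keep the example runnable.

```typescript
type Triple = { s: string; p: string; o: string };

interface Source {
  name: string;
  triples: Triple[];
}

// Hypothetical per-call options object carrying the sources to query.
interface QueryContext {
  sources: Source[];
}

// Undefined positions act as wildcards.
type TriplePattern = Partial<Triple>;

// The engine holds no data of its own; everything it reads comes from the context.
class ToyEngine {
  match(pattern: TriplePattern, context: QueryContext): Triple[] {
    const results: Triple[] = [];
    for (const source of context.sources) {
      for (const triple of source.triples) {
        const matches =
          (pattern.s === undefined || pattern.s === triple.s) &&
          (pattern.p === undefined || pattern.p === triple.p) &&
          (pattern.o === undefined || pattern.o === triple.o);
        if (matches) results.push(triple);
      }
    }
    return results;
  }
}

const people: Source = {
  name: 'people',
  triples: [{ s: 'ex:alice', p: 'ex:knows', o: 'ex:bob' }],
};
const places: Source = {
  name: 'places',
  triples: [{ s: 'ex:ghent', p: 'ex:locatedIn', o: 'ex:belgium' }],
};

const engine = new ToyEngine();
const fromPeople = engine.match({ p: 'ex:knows' }, { sources: [people] });
const fromPlaces = engine.match({ p: 'ex:knows' }, { sources: [places] });
const fromBoth = engine.match({}, { sources: [people, places] });
```

The same `engine` answers all three calls; only the context changes.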
Was there a posting about scheduling a call? Best to keep such things open to the community instead of behind closed doors. |
@blake-regalia There were no formal RDF/JS calls, no. Just some informal talk between @gsvarovsky, @jacoscaz, and myself about the overlaps between our work, and potential alignments. Definitely open to have a call about the query spec, but not sure there is a real need for one at this stage? |
I have done a lot of work with query impls tangential to graphy so I do feel I want to be part of the conversation but haven't had the bandwidth to type up lengthy responses. I would appreciate being part of the discussion over the phone however, just saying.
|
@blake-regalia Of course! |
Just reacting to this tiny nit: I would suggest to drop the question mark. SPARQL already has two syntaxes for variables, one with
|
Please keep the variable interface of the Data Model spec in mind. It should be possible to use variable term objects as identifiers in bindings:
const bindings = ...
const a = factory.variable('a')
const term = bindings.get(a)
Then there is no need to open the leading
|
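A minimal sketch of bindings keyed by variable term objects could look like the following. The `Bindings` class and the `variable` and `namedNode` helpers are hypothetical stand-ins defined locally, not a real data factory; the one detail taken from the RDF/JS Data Model is that a variable's `value` holds its name without a leading `?`.

```typescript
interface Term {
  termType: string;
  value: string;
}

interface Variable extends Term {
  termType: 'Variable';
}

// Local stand-ins for a data factory, for illustration only.
const variable = (name: string): Variable => ({ termType: 'Variable', value: name });
const namedNode = (iri: string): Term => ({ termType: 'NamedNode', value: iri });

// Object-based bindings: entries live in a Map, so variable names can never
// collide with native object properties such as `toString`.
class Bindings {
  private readonly entries = new Map<string, Term>();

  constructor(entries: Array<[Variable, Term]>) {
    for (const [key, term] of entries) this.entries.set(key.value, term);
  }

  has(key: Variable): boolean {
    return this.entries.has(key.value);
  }

  get(key: Variable): Term | undefined {
    return this.entries.get(key.value);
  }

  variables(): Variable[] {
    return Array.from(this.entries.keys(), (name) => variable(name));
  }
}

const a = variable('a');
const bindings = new Bindings([[a, namedNode('http://example.org/s')]]);
const term = bindings.get(a);

// Contrast with a plain object, where inherited properties leak through.
const plain: Record<string, Term> = {};
const plainHasToString = 'toString' in plain;
const bindingsHasToString = bindings.has(variable('toString'));
```

Keying the internal map by `value` keeps lookups cheap; more convenient string-based accessors could be layered on top if wanted.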
This is a very sensible consideration IMHO, although I do wonder about the effects on performance (and complexity?) in long chains of transformations. That said, I would be 100% in favor of using object-based representation of variables if at all possible. |
I actually think this should be pretty ok performance-wise. The only downside of this would be that it would be a bit less convenient for interface users to access values of a certain variable. But this is similar to the discussion around #6, as more dev-friendly abstractions can easily be built on top of this. |
FYI, possibility for a call about this on the mailinglist: https://lists.w3.org/Archives/Public/public-rdfjs/2021Oct/0000.html |
PR to extend the discussion to everyone interested at #7 |
We've recently merged #7, which includes and elaborates upon what was discussed in this issue. I think we can close this in favor of more focused issues - @rubensworks final word up to you! |
Sounds good! Let's create new issues where needed based on the experimental interfaces in https://github.com/rdfjs/query-spec/blob/master/queryable-spec.ts Once we're happy, we can create a proper spec. For reference, I've started implementing these interfaces in |
After some internal discussions with @gsvarovsky and @jacoscaz, we identified the need to come up with a base query engine interface for RDF/JS (for declarative queries).
In essence, it should expose an interface that allows you to do operations such as
const resultStream = await engine.query('some query');
The goal of this issue is to collect input on what already exists, so we can identify what the requirements are for such an interface.
Projects I contribute to that would benefit from this interface:
A big open question for me is how close the relationship to SPARQL should be. (We could for example start off with defining it in terms of SPARQL, but leave room for other query languages.)