Get rid of Triple? A case for the misleading class #124
Comments
The I don't think it's right that https://github.com/rdf-ext/rdf-ext/blob/master/lib/DataFactory.js#L62 |
I'm losing the conversation guys, just here to point out usability: let's keep complexity on the side of the few implementers so the side of the many users gets an easy API to deal with. Peace |
It depends on whether you assume a triple always resides in an RDF graph or not. I understand your arguments, but ultimately the decision will always be a compromise. If we give up "Triple", newcomers might be surprised not to see a triple in an RDF library. If we alias Triple and Quad, they might be confused about equality. If we don't alias Triple and Quad, what is the difference between a triple and a quad in the default graph? I personally think the current compromise, Triple = Quad, is the worst of all evils. |
Hm, if I remember that decision was just that we said that there is no difference between What is the expectation of comparing two triples? I would say that the This might be good enough of a reason to create its own class. The next question will be, how much will this cost in performance? @RubenVerborgh @bergos Finally, in any case, if we keep the behaviour it might be good to add a simple "note" in the spec to warn about this behaviour as I guess @blake-regalia lost some time because of this .. |
If the only distinction between Triple and Quad is that their comparison does not take graph into account, I'm afraid that this will create many more complexities than just assuming Triple = Quad. The latter is not perfect either, but at least consistent.
👍 |
Just so we all see the big picture here: Triple ≠ Quad. Quite plainly, from the RDF spec:
To assume that a triple has an implicit graph property seems misguided. Turtle, for example, describes a set of triples that belong to some unknown RDF graph. Conceptually, this graph is purely abstract, and the set of triples is merely bound by the scope of the Turtle file. It's only once this file is read (say by importing it into some triplestore) that its triples become part of some tangible graph (e.g., the default graph). TriG, on the other hand, describes quads, where any 'Triple Statement' explicitly belongs to the default graph.
|
Meh. That same spec says:
So triples in a dataset are always in a graph, either the default one or a named one. And "quad" is just a different view of "triple in a graph". So either you say that
Do you have support for that statement? The spec says:
|
Yes, the main point being "in a dataset", i.e., within a given context. A 'pure' triple (as analogous to pure functions) is without context:
Absolutely, and the distinction remains the same: triple ≠ quad.
In other words, the consumer assigns those triples to the default graph by virtue of content negotiation. If the proposition were true: Turtle describes a set of triples that belongs to the default graph, this would mean that importing a Turtle file into a named graph would be considered bad practice, but we know this is not the case. Turtle is merely a serialization of an RDF graph's contents, and carries no additional context (i.e., the identity of a graph). Therefore, Turtle describes a set of triples that belong to some unknown RDF graph. |
Agreed. However, this creates the inconvenient situation that something is a Triple until we start considering it as part of a dataset, and then it becomes a Quad. When does this transition happen and how? Honestly, it seems far easier to just assume that a Triple always belongs to a dataset. Just ignore the dataset part when not needed.
That conclusion doesn't necessarily hold, i.e., it's not because Turtle would be default graph, that moving things to a named graph would be bad practice. That said, honestly, I always assumed that Turtle files just put things in the default graph, i.e., always interpreted Turtle as a TriG subset. (The Turtle spec doesn't mention graphs at all, so I can't find support for either mine or your way.)
Hmm, my argument was that triples and quads are just two equivalent views of the exact same thing.
Okay, but then it just seems easier to pass an optional Having two possible graph states (default / named) seems easier than having to deal with three possible graph states (unknown / default / named). Plus we get the convenient Triple / Quad equivalence. How do non-JavaScript RDF libs solve this BTW? |
i.e., quad = triple + graph.
Agreed. This is open to interpretation. Still, it has made for an interesting discussion 👍
I think we are mostly on the same page -- I just want to clarify that I am not suggesting the parsers should emit 'pure' triples that have an |
Maybe @gkellogg could share with us his experience based on the Ruby gems he maintains? Also @sandhawke and @ericprud might have some helpful input to this issue.
I think this note comes as a consequence of JSON-LD not having a way to content negotiate for a single graph or the dataset. Here @gkellogg and @sandhawke might have the best memory of that resolution json-ld/json-ld.org#182 (comment) |
In RDF.rb, Triples and Quads may be directly compared. A Repository stores quads, with a special designator for the default graph. Any graph within a repository may be projected into a Graph object. Triples emit from a graph, with no associated graph name. I don’t understand the bit about content negotiation with JSON-LD. JSON-LD is effectively a quad format. Note that framing does allow you to filter on a particular graph, and we will likely provide a means of using an HTTP header to identify such a frame. |
One handy behavior I can see would be that the triple prototype for
x1 = Quad(<s1>, <p1>, <o1>, DefaultGraph)
x2 = Triple(<s1>, <p1>, <o1>)
x3 = Quad(<s1>, <p1>, <o1>, <g1>)
x1.equals(x1) => true
x1.equals(x2) => false
x2.equals(x1) => true
x1.equals(x3) => false
Triple(x1).equals(Triple(x3)) => true
Triple(x1).equals(x3) => true
I think this would give naive users an intuitive triples interface as well as have a principled distinction between triples and quads which more expert users would find sensible and memorable. |
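The asymmetric comparison proposed above can be reproduced in a few lines. This is only a rough sketch under stated assumptions: plain objects stand in for RDF/JS terms, and `toTriple` is a hypothetical helper playing the role of `Triple(x1)` in the example. It is not how rdf-ext or the spec implements `equals`.

```javascript
// Hypothetical stand-ins for RDF/JS terms (not a real library API).
const namedNode = (value) => ({ termType: 'NamedNode', value });
const defaultGraph = () => ({ termType: 'DefaultGraph', value: '' });
const termEquals = (a, b) =>
  !!a && !!b && a.termType === b.termType && a.value === b.value;

class Triple {
  constructor(subject, predicate, object) {
    Object.assign(this, { subject, predicate, object });
  }
  // A triple carries no graph, so it compares subject, predicate and object only.
  equals(other) {
    return !!other &&
      termEquals(this.subject, other.subject) &&
      termEquals(this.predicate, other.predicate) &&
      termEquals(this.object, other.object);
  }
}

class Quad extends Triple {
  constructor(subject, predicate, object, graph = defaultGraph()) {
    super(subject, predicate, object);
    this.graph = graph;
  }
  // A quad additionally requires the other side to carry an equal graph.
  equals(other) {
    return super.equals(other) && termEquals(this.graph, other.graph);
  }
}

// Hypothetical helper: project a quad down to its triple part.
const toTriple = (q) => new Triple(q.subject, q.predicate, q.object);

const s = namedNode('http://example.org/s1');
const p = namedNode('http://example.org/p1');
const o = namedNode('http://example.org/o1');

const x1 = new Quad(s, p, o); // default graph
const x2 = new Triple(s, p, o);
const x3 = new Quad(s, p, o, namedNode('http://example.org/g1'));

console.log(x1.equals(x2)); // false: x2 has no graph to match
console.log(x2.equals(x1)); // true: the triple ignores the graph
console.log(toTriple(x1).equals(x3)); // true
```

The result depends on which operand the method is called on, which is the directionality the following replies object to.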
You call it handy, I call it terribly confusing 😄 Especially reckoning with the fact that JavaScript is untyped, so you'd get
And nobody would understand what the heck is going on… 😉 |
This sort of asymmetry is reasonably common in languages with automatic type promotion. Comparing things as Triples is more lax than comparing them as Quads. |
Sure, but it is confusing nonetheless. |
I would be interested in real world use cases, because I think:
That's my experience of working with the current behavior for 9 months. |
Triples aren't really used in any of the implementations I've worked on as they are primarily related to dataset canonicalization or to JSON-LD (which, as @gkellogg said, is effectively a quads format). After reflecting on that and the comments in this thread, it seems to me that if In other words, the design for |
I am catching up on a lot of RDF/JS conversation I have missed. I understand that a compromise must be made. I personally find it conceptually easier and much cleaner to think of triples as quads living within the default graph, and to generally think of a triple as an incomplete view of a quad. I am wary of modeling for pure triples. Assume we have a dedicated However, I understand the need to compare the triple part of quads and I think the best way to address that would be a dedicated comparison method in addition to I would personally find anything resulting in the example made by Ruben to be an attempt at scaring people away from RDF/JS by means of introducing directionality in equality comparisons. By the way, it feels to me like this discussion should be informed by #117. |
I think I would agree with @bergos that higher-level APIs could add some convenience interfaces for triples. In the spec we could remove all that aliasing and just add a NOTE which suggests handling any triple as a quad with
@gkellogg in your experience with RDF.rb have you seen a use case which requires distinguishing a Triple from a Quad with exactly the same @jacoscaz thank you for referencing #117, I think that we can not all think of the default graph in the same way unless we address that issue! |
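One way to read the "dedicated comparison method" suggestion is sketched below. The names `tripleEquals` and `quadEquals` are hypothetical and are written as free functions for brevity; the thread does not settle on a name or on whether this would be a method. Plain objects again stand in for RDF/JS terms.

```javascript
// Hypothetical stand-ins for RDF/JS terms (not a real library API).
const namedNode = (value) => ({ termType: 'NamedNode', value });
const defaultGraph = () => ({ termType: 'DefaultGraph', value: '' });
const termEquals = (a, b) => a.termType === b.termType && a.value === b.value;

// Everything is a quad; the graph defaults to the default graph.
const quad = (subject, predicate, object, graph = defaultGraph()) =>
  ({ subject, predicate, object, graph });

// Hypothetical helper: compares only subject, predicate and object.
const tripleEquals = (a, b) =>
  termEquals(a.subject, b.subject) &&
  termEquals(a.predicate, b.predicate) &&
  termEquals(a.object, b.object);

// Strict equality over all four components.
const quadEquals = (a, b) => tripleEquals(a, b) && termEquals(a.graph, b.graph);

const s = namedNode('http://example.org/s1');
const p = namedNode('http://example.org/p1');
const o = namedNode('http://example.org/o1');

const inDefault = quad(s, p, o);
const inNamed = quad(s, p, o, namedNode('http://example.org/g1'));

console.log(quadEquals(inDefault, inNamed));   // false
console.log(quadEquals(inNamed, inDefault));   // false
console.log(tripleEquals(inDefault, inNamed)); // true
console.log(tripleEquals(inNamed, inDefault)); // true
```

Both comparisons stay symmetric here; the caller picks the stricter or laxer one explicitly instead of relying on the type of the left operand.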
I guess there's a difference between a) the graph that is queried by default; and Both can be seen as a form of
Agreed! Though I would leave the option to simply leave the graph unspecified and have |
I agree. In Comunica, we consider a falsy, blank node or variable graph parameter in |
Did you reach a compromise here? If so, it would be great if someone could summarize it. I am asking because I am interested in the implications for an implementation of the specification (ref #130). Thanks in advance |
I think it's also important to mention that |
Spec mentions it in the section you linked to:
Do you see a need to add an explicit statement that it MUST NOT leave |
@elf-pavlik wrote:
In my opinion, it's not necessary to mention that explicitly, because the rdfjs is clear enough about what to expect. All fine here. 👍 But ...I have a different point of view on the whole My proposal
I'd like to be able to use triples without any graph info. What do you think? |
I think again this relates to #117 and if implementations should consider an instance of What do you think about this comment above #124 (comment) |
@elf-pavlik wrote:
Let's have some pseudocode:
Triple I'd like to leave that to the developer. For instance, if you want all triples to be part of the @RubenVerborgh wrote:
This is about the Therefore, I would like to propose a small change in the specification. Instead of
I would rephrase it to:
(a) also includes the case that |
This was recently brought up again on the call and here are some of the main points that were mentioned about this topic:
|
That's not an argument we should weigh; I could similarly say that "JavaScript specs do not define an equal method for triples". The question is: does it make sense for us to do so? |
Sorry but I think my concise bullet point here does not do justice to the full argument that was made on the call and that @vhf can elaborate. |
I lean towards removing This section in the RDF 1.1 spec suggests to me that we can consider all triples as 'in default graph' of the dataset: https://www.w3.org/TR/rdf11-concepts/#section-dataset-conneg
Having an instance of a triple which has |
In my view the Quad serves an entirely different purpose than Triple: to say a specific Triple exists in a given graph. There are reasons to have both, and to use one instead of the other. Sometimes I want to be able to talk about a particular Triple without implying membership in a graph. This is important, because when we use an IRI to name things, it implies universal uniqueness. So if we're going to adopt Quad as the only way to talk about RDF statements, that means all of our applications have to agree on what URIs to give graphs. This isn't always desirable. Sometimes I just want to talk about two or more graphs without giving them any unique name. Say, I want to read two Turtle documents and diff them. Sure, we can create multiple Datasets and put everything into the "default graph", but this muddies the semantics. What does it mean to test if two datasets are isomorphic, for example? I have to add an additional test case in my application to ensure there are no named Quads in my data, a feature I could have gotten for free if I was just using Triple. (And what if I want to name my Datasets, is it turtles all the way down?) Only using Quad changes who manages the URI/IRI namespaces for graphs. In a normal Turtle parser, I get a stream of Triples and get to decide which graph to put them into, if at all. But if I get back a Quad, that decision must be made by the parser, and if it's the wrong one, I have to change it. This invites bugs, at best. As far as I'm aware, the Quad and the named graph were concepts invented by SPARQL to serve as a substitute for files on a filesystem, or multiple Graph structures in application memory. But these limitations don't exist in most languages, so I don't think it makes sense to confine ourselves to the same. |
N-Quads and TriG exist too.
|
I think currently a Turtle parser would always assign an instance of
How do you imagine having two distinct graphs in one dataset without using named graphs? |
Sure, but note N-Quads and TriG have additional considerations compared to Triple-based formats like Turtle: In N-Triples and Turtle, the graph is whatever I say it is (usually, it's going to be the URI I downloaded the file at). In N-Quads and TriG, the graph is whatever the document says it is. If I'm importing a TriG document, I might need to do additional checking to verify the entity uploading the document is an authority for each of the named graphs it mentions. These are valid, but very different, use cases that (I believe) demonstrate the need for a Triple that is not also a Quad. While there's a need for Quad and Dataset, I think that, offered a choice of both, most data exchange would happen in a Triple format. Most popular data formats are even simpler than that, and happen in a single acyclic tree. (Think JSON vs. YAML vs. XML.) I can testify, in my work for JSON Schema, the biggest problem we have is explaining references (which are named with URIs, and allow cyclical references). |
AFAIK this section I quoted above came out of JSON-LD work, where a representation can have just one (default) graph or a dataset with many named graphs and no more than one default graph.
I think the application would do that while receiving quads from
If the representation of a dataset includes a default graph, one can decide to name it, just as one can decide to name a graph when the representation only includes a graph. It makes a lot of sense to me to just consider representations in Turtle, N-Triples, RDFa and RDF/XML as equivalent to representations in TriG, N-Quads and JSON-LD which only include default graphs and no named graphs. Once again, based on https://www.w3.org/TR/rdf11-concepts/#section-dataset-conneg it seems that the default graph of a TriG representation should get treated exactly the same way as a Turtle representation when we get them by dereferencing the same IRI. |
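To make "the graph is whatever I say it is" concrete, here is a small sketch of a consumer attaching its own graph to graph-less triples coming out of a parser. The helper name `inGraph`, the plain-object terms and the example IRIs are all assumptions made for illustration; no real parser API is implied.

```javascript
// Hypothetical stand-in for an RDF/JS NamedNode (not a real library API).
const namedNode = (value) => ({ termType: 'NamedNode', value });

// Hypothetical helper: the consumer, not the parser, decides the graph.
function* inGraph(triples, graph) {
  for (const { subject, predicate, object } of triples) {
    yield { subject, predicate, object, graph };
  }
}

// Pretend these graph-less triples came from a Turtle parser.
const parsed = [
  {
    subject: namedNode('http://example.org/s1'),
    predicate: namedNode('http://example.org/p1'),
    object: namedNode('http://example.org/o1'),
  },
];

// Here the graph is named after the document the data was fetched from.
const source = namedNode('http://example.org/data.ttl');
const quads = [...inGraph(parsed, source)];

console.log(quads[0].graph.value); // http://example.org/data.ttl
```

With a quad-only parser the same step would overwrite a default graph the parser had already assigned, rather than fill in a missing one.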
It's always possible to do further stream processing to set the Having a Let's vote on this comment till Sunday (2019-01-27). 👍 for keeping If somebody has another proposal, please leave a comment and if it gets a sufficient number of votes, we can also discuss it. |
👀 = no strong feelings either way |
Another reason for removal: |
I've made PR #153 |
Right now, Triple is an alias for Quad with its .graph property set to an instance of a DefaultGraph. The factory method .quad() can be called without supplying the optional final graph argument, in which case the .graph property defaults to the default graph. This makes the factory method .triple() redundant, but more importantly, misleading.
At first glance this seems trivial, but if I want to compare two triples from different graphs, then I would expect an instance of Triple to only compare subject, predicate and object. Without the distinction between triple and quad, implementations cannot extend the interface to allow for, say, extracting an actual triple from one graph and comparing it to a triple from another graph.
One solution would be getting rid of Triple entirely from the spec, including the factory method, and treating everything as a Quad. The other option would be to make the distinction that Triple is its own class that extends Quad. The latter of which is more supportive of implementations imo.
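The aliasing described in this issue can be illustrated with a toy factory. This is a sketch under assumptions, with plain objects standing in for RDF/JS terms and a hypothetical `factory` object; it is not the rdf-ext DataFactory code.

```javascript
// Hypothetical stand-ins for RDF/JS terms (not a real library API).
const namedNode = (value) => ({ termType: 'NamedNode', value });
const defaultGraph = () => ({ termType: 'DefaultGraph', value: '' });
const termEquals = (a, b) =>
  !!a && !!b && a.termType === b.termType && a.value === b.value;

const factory = {
  // The graph argument is optional and falls back to the default graph.
  quad(subject, predicate, object, graph = defaultGraph()) {
    return {
      subject, predicate, object, graph,
      // Equality always includes the graph.
      equals(other) {
        return !!other &&
          termEquals(subject, other.subject) &&
          termEquals(predicate, other.predicate) &&
          termEquals(object, other.object) &&
          termEquals(graph, other.graph);
      },
    };
  },
  // Alias: a "triple" is just a quad in the default graph.
  triple(subject, predicate, object) {
    return factory.quad(subject, predicate, object);
  },
};

const s = namedNode('http://example.org/s1');
const p = namedNode('http://example.org/p1');
const o = namedNode('http://example.org/o1');

const t = factory.triple(s, p, o);
const q = factory.quad(s, p, o);
const q1 = factory.quad(s, p, o, namedNode('http://example.org/g1'));
const q2 = factory.quad(s, p, o, namedNode('http://example.org/g2'));

console.log(t.graph.termType); // DefaultGraph
console.log(t.equals(q));      // true: triple() adds nothing over quad()
console.log(q1.equals(q2));    // false: same statement, different graphs
```

The last line is the case the issue is about: with only this `equals`, there is no way to ask whether two statements from different graphs are the same triple.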