Get rid of Triple? A case for the misleading class #124

Closed
blake-regalia opened this issue Mar 15, 2018 · 42 comments · Fixed by #153

Comments

@blake-regalia
Contributor

Right now, Triple is an alias for Quad with its .graph property set to an instance of a DefaultGraph. The factory method .quad() can be called without supplying the optional final graph argument, in which case the .graph property defaults to the default graph. This makes the factory method .triple() redundant, but more importantly, misleading.
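To make the aliasing concrete, here is a minimal sketch of a factory that behaves as described above. Everything in it (the `quad`, `triple`, `namedNode` and `sameTerm` helpers, the example IRIs) is a hypothetical stand-in written for this illustration, not the code of any actual RDF/JS implementation, and it only handles named nodes.

```javascript
// Hypothetical minimal factory for illustration; not the RDF/JS reference implementation.
const DEFAULT_GRAPH = Object.freeze({ termType: 'DefaultGraph', value: '' });
const namedNode = (value) => ({ termType: 'NamedNode', value });
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value;

// The graph argument is optional and falls back to the default graph.
function quad(subject, predicate, object, graph = DEFAULT_GRAPH) {
  return {
    subject,
    predicate,
    object,
    graph,
    // Equality compares all four components, graph included.
    equals(other) {
      return Boolean(other)
        && sameTerm(subject, other.subject)
        && sameTerm(predicate, other.predicate)
        && sameTerm(object, other.object)
        && sameTerm(graph, other.graph);
    },
  };
}

// The alias under discussion: a "triple" is just a quad in the default graph.
const triple = (subject, predicate, object) => quad(subject, predicate, object);

const [s, p, o, g] = ['s', 'p', 'o', 'g'].map((name) => namedNode(`http://example.org/${name}`));

console.log(triple(s, p, o).equals(quad(s, p, o)));    // true: both sit in the default graph
console.log(triple(s, p, o).equals(quad(s, p, o, g))); // false: the graph component differs
```

The second comparison is the surprising one: something created through `triple()` still carries a graph, so it never equals the same statement taken from a named graph.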

At first glance this seems trivial, but if I want to compare two triples from different graphs, then I would expect an instance of Triple to only compare subject, predicate and object. Without the distinction between triple and quad, implementations cannot extend the interface to allow for, say, extracting an actual triple from one graph and comparing it to a triple from another graph.

One solution would be getting rid of Triple entirely from the spec, including the factory method, and treating everything as a Quad. The other option would be to make the distinction that Triple is its own class that extends Quad. The latter is more supportive of implementations, imo.

@bergos
Member

bergos commented Mar 15, 2018

The .triple method alias for .quad is mainly there for people who don't know the concept of quads. I would expect these people to be working only with quads in the default graph, so for them it should not be a problem. We could discuss whether we really need that alias for the mentioned reason, because it's a low-level API and other libraries on top would be used to handle the triples.

I don't think it's right that .equals returns true for triple.equals(quad) when the quad has a named graph. The right way to do it would be: triple.equals(rdf.quad(quad.subject, quad.predicate, quad.object)). That's a little bit long and ugly, so libraries may wrap it somehow. In rdf-ext there is already a very similar method. The .dataset factory method accepts any collection of quads with a .forEach method and optionally also a graph parameter, which can be used to replace the graph of all given quads.

https://github.com/rdf-ext/rdf-ext/blob/master/lib/DataFactory.js#L62

@pieroit
Contributor

pieroit commented Mar 15, 2018

I'm losing track of the conversation, guys; I'm just here to point out usability: let's keep complexity on the side of the few implementers so that the many users get an easy API to deal with.

Peace

@RubenVerborgh
Member

RubenVerborgh commented Mar 15, 2018

At first glance this seems trivial, but if I want to compare two triples from different graphs, then I would expect an instance of Triple to only compare subject, predicate and object.

It depends on whether you assume a triple always resides in an RDF graph or not.

I understand your arguments, but ultimately the decision will always be a compromise. If we give up "Triple", newcomers might be surprised not to see a triple in an RDF library. If we alias Triple and Quad, they might be confused about equality. If we don't alias Triple and Quad, what is the difference between a triple and a quad in the default graph?

I personally think the current compromise, Triple = Quad, is the worst of all evils.

@l00mi
Member

l00mi commented Mar 15, 2018

Hm, if I remember correctly, the decision was just that we said there is no difference between triple and quad. Comparing them might be the case that differentiates them?

What is the expectation of comparing two triples? I would say that the graph is not taken into account for the comparison (regardless of whether it's the defaultGraph or some named graph).

This might be good enough of a reason to create its own class. The next question will be, how much will this cost in performance? @RubenVerborgh @bergos

Finally, in any case, if we keep the behaviour it might be good to add a simple "note" in the spec to warn about this behaviour, as I guess @blake-regalia lost some time because of this.

@RubenVerborgh
Member

What is the expectation of comparing two triples? I would say that the graph is not taken into account for the comparison (regardless of whether it's the defaultGraph or some named graph).

If the only distinction between Triple and Quad is that their comparison does not take graph into account, I'm afraid that this will create many more complexities than just assuming Triple = Quad. The latter is not perfect either, but at least consistent.

Finally, in any case, if we keep the behaviour it might be good to add a simple "note" in the spec to warn about this behaviour

👍

@blake-regalia
Contributor Author

Just so we all see the big picture here:

Triple ≠ Quad

Quite plainly, from the RDF spec:

An RDF triple consists of three components:

  • the subject [...]
  • the predicate [...]
  • the object [...]

To assume that a triple has an implicit graph property seems misguided.

Turtle, for example, describes a set of triples that belong to some unknown RDF graph. Conceptually, this graph is purely abstract, and the set of triples is merely bound by the scope of the Turtle file. It's only once this file is read (say by importing it into some triplestore) that its triples become part of some tangible graph (e.g., the default graph).

TriG, on the other hand, describes quads, where any 'Triple Statement' explicitly belongs to the default graph.

.equals()

What would be the expected behavior for comparisons amongst triples and quads?

quad = factory.quad(a, b, c);
triple = factory.triple(a, b, c);

// OPTION 1: always return false
quad.equals(triple);  // false
triple.equals(quad);  // false

// OPTION 2: throw an Error
quad.equals(triple);  // TypeError: expected arg to be instanceof Quad
triple.equals(quad);  // TypeError: expected arg to be instanceof Triple


// THE PROPER WAY: cast to the other type for comparison
quad.toTriple().equals(triple);  // true
triple.toQuad(quad.graph).equals(quad);  // true

Searching & Querying

What would be the expected behavior when a user obtains an instance of a Triple?

/* contents of `quadstore`:
<a> <b> <c> .
<g> {
  <x> <y> <z> .
}
*/
triple_abc = factory.triple(a, b, c);
triple_xyz = factory.triple(x, y, z);
quad_abc = factory.quad(a, b, c);
quad_xyz = factory.quad(x, y, z);

// OPTION 1: search all graphs given a triple
quadstore.has(triple_abc);  // true
quadstore.has(triple_xyz);  // true
quadstore.has(quad_abc);  // true
quadstore.has(quad_xyz);  // false (default graph != <g>)

// OPTION 2: throw an Error given a triple
quadstore.has(triple_abc);  // TypeError: expected arg to be instanceof Quad
quadstore.has(triple_xyz);  // TypeError: expected arg to be instanceof Quad
quadstore.has(quad_abc);  // true
quadstore.has(quad_xyz);  // false (default graph != <g>)


// ANOTHER WAY: fetch a subset
quadstore.graph(g).has(triple_abc);  // false
quadstore.graph(g).has(triple_xyz);  // true

Conclusion

Making Triple distinct from Quad would need further discussion as to what is the expected behavior starting with the scenarios above.

Getting rid of Triple would leave its implementation up for interpretation, which is not good for cross-compatibility.

Finally, I hope we can agree that it is in our best interest to not always sacrifice accuracy for simplicity. Keeping Triple an alias of Quad just to make the spec appear less scary seems shortsighted. Consider how an implementation would have to differentiate its own version of a triple from the spec's just to meet the expected behavior of comparison, etc.

@RubenVerborgh
Member

Triple ≠ Quad

To assume that a triple has an implicit graph property seems misguided.

Meh. That same spec says:

An RDF dataset is a collection of RDF graphs, and comprises:

Exactly one default graph, being an RDF graph. The default graph does not have a name and MAY be empty.
Zero or more named graphs. Each named graph is a pair consisting of an IRI or a blank node (the graph name), and an RDF graph. Graph names are unique within an RDF dataset.

So triples in a dataset are always in a graph, either the default one or a named one.

And "quad" is just a different view of "triple in a graph". So either you say that <s> <p> <o> is a triple in <g> (which is how TriG does it), or that we have a quad <s> <p> <o> <g> (which is N-Quads). Same thing, different way of viewing it.

Turtle, for example, describes a set of triples that belong to some unknown RDF graph.

Do you have support for that statement? The spec says:

If an RDF dataset is returned and the consumer is expecting an RDF graph, the consumer is expected to use the RDF dataset's default graph.

@blake-regalia
Contributor Author

So triples in a dataset are always in a graph, either the default one or a named one.

Yes, the main point being "in a dataset", i.e., within a given context. A 'pure' triple (as analogous to pure functions) is without context: <s> <p> <o>.

And "quad" is just a different view of "triple in a graph". So either you say that <s> <p> <o> is a triple in <g> (which is how TriG does it), or that we have a quad <s> <p> <o> <g> (which is N-Quads). Same thing, different way of viewing it.

Absolutely, and the distinction remains the same: triple ≠ quad.

If an RDF dataset is returned and the consumer is expecting an RDF graph, the consumer is expected to use the RDF dataset's default graph.

In other words, the consumer assigns those triples to the default graph by virtue of content negotiation.

If the proposition were true: Turtle describes a set of triples that belongs to the default graph, this would mean that importing a Turtle file into a named graph would be considered bad practice, but we know this is not the case. Turtle is merely a serialization of an RDF graph's contents, and carries no additional context (i.e., the identity of a graph). Therefore, Turtle describes a set of triples that belong to some unknown RDF graph.

@RubenVerborgh
Member

RubenVerborgh commented Mar 15, 2018

Yes, the main point being "in a dataset", i.e., within a given context.

Agreed.

However, this creates the inconvenient situation that something is a Triple, until we start considering it as part of a dataset, then it becomes a Quad. When does this transition happen and how?

Honestly, it seems far easier to just assume that a Triple always belongs to a dataset. Just ignore the dataset part when not needed.

If the proposition were true: Turtle describes a set of triples that belongs to the default graph, this would mean that importing a Turtle file into a named graph would be considered bad practice, but we know this is not the case.

That conclusion doesn't necessarily hold, i.e., it's not because Turtle would be default graph, that moving things to a named graph would be bad practice.

That said, honestly, I always assumed that Turtle files just put things in the default graph, i.e., always interpreted Turtle as a TriG subset. (The Turtle spec doesn't mention graphs at all, so I can't find support for either mine or your way.)

Absolutely, and the distinction remains the same: triple ≠ quad.

Hmm, my argument was that triples and quads are just two equivalent views of the exact same thing.

Turtle is merely a serialization of an RDF graph's contents, and carries no additional context (i.e., the identity of a graph). Therefore, Turtle describes a set of triples that belong to some unknown RDF graph.

Okay, but then it just seems easier to pass an optional graph argument to the parser, in which it puts the triples after having parsed a document. Can default to either default graph (which is what I would do) or to the graph of the parsed document (which is what rdflib does).

Having two possible graph states (default / named) seems easier than having to deal with three possible graph states (unknown / default / named). Plus we get the convenient Triple / Quad equivalence.

How do non-JavaScript RDF libs solve this BTW?

@blake-regalia
Contributor Author

And "quad" is just a different view of "triple in a graph".

i.e., quad = triple + graph.

The Turtle spec doesn't mention graphs at all

Agreed. This is open to interpretation. Still, it has made for an interesting discussion 👍

Okay, but then it just seems easier to pass an optional graph argument to the parser, in which it puts the triples after having parsed a document.

I think we are mostly on the same page -- I just want to clarify that I am not suggesting the parsers should emit 'pure' triples that have an unknownGraph property (although I would be open to the idea). My qualm is with the factory method. The fact that .triple() creates a quad with the defaultGraph is misleading. This is what I was hoping to demonstrate in the examples.

@elf-pavlik
Member

How do non-JavaScript RDF libs solve this BTW?

maybe @gkellogg could share with us his experience based on ruby gems he maintains?

also @sandhawke and @ericprud might have some helpful input to this issue

If an RDF dataset is returned and the consumer is expecting an RDF graph, the consumer is expected to use the RDF dataset's default graph.

I think this note comes as consequence of JSON-LD not having a way to content negotiate for single graph or the dataset. Here @gkellogg and @sandhawke might have best memory of that resolution json-ld/json-ld.org#182 (comment)

@gkellogg
Member

In RDF.rb, Triples and Quads may be directly compared. A Repository stores quads, with a special designator for the default graph. Any graph within a repository may be projected into a Graph object. Triples emit from a graph, with no associated graph name.

I don’t understand the bit about content negotiation with JSON-LD. JSON-LD is effectively a quad format. Note that framing does allow you to filter on a particular graph, and we will likely provide a means of using an HTTP header to identify such a frame.

@ericprud

ericprud commented Mar 19, 2018

One handy behavior I can see would be that the triple prototype for equals ignores the graph. Here's an example (using <uri> to stand for { termType: 'NamedNode', value: uri }):

x1 = Quad(<s1>, <p1>, <o1>, DefaultGraph)
x2 = Triple(<s1>, <p1>, <o1>)
x3 = Quad(<s1>, <p1>, <o1>, <g1>)
x1.equals(x1) => true
x1.equals(x2) => false
x2.equals(x1) => true
x1.equals(x3) => false
Triple(x1).equals(Triple(x3)) => true
Triple(x1).equals(x3) => true
  • Triple constructor takes three args or a Quad, from which it strips the graph.
  • Triple.equals takes a Triple or a Quad which it promotes (demotes?) to a Triple.
  • Triple.graph = undefined

I think this would give naive users an intuitive triples interface as well as have a principled distinction between triples and quads which more expert users would find sensible and memorable.
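For readers who want to try the behavior listed above, the following is a rough sketch of it in plain JavaScript. The `Triple` and `Quad` classes, the `sameTerm` helper and the example IRIs are assumptions made for this illustration; nothing here comes from the spec or from an existing library.

```javascript
// Hypothetical classes sketching the proposal; only termType and value are compared.
const sameTerm = (a, b) =>
  a !== undefined && b !== undefined && a.termType === b.termType && a.value === b.value;

class Triple {
  // Accepts (subject, predicate, object) or one quad-like object, whose graph is dropped.
  constructor(subject, predicate, object) {
    if (predicate === undefined && subject && subject.subject) {
      const source = subject;
      subject = source.subject;
      predicate = source.predicate;
      object = source.object;
    }
    this.subject = subject;
    this.predicate = predicate;
    this.object = object;
    this.graph = undefined;
  }

  // Compares as a triple: the other side's graph, if any, is ignored.
  equals(other) {
    return sameTerm(this.subject, other.subject)
      && sameTerm(this.predicate, other.predicate)
      && sameTerm(this.object, other.object);
  }
}

class Quad {
  constructor(subject, predicate, object, graph = { termType: 'DefaultGraph', value: '' }) {
    Object.assign(this, { subject, predicate, object, graph });
  }

  // Compares as a quad: a Triple has no graph, so it can never match.
  equals(other) {
    return sameTerm(this.subject, other.subject)
      && sameTerm(this.predicate, other.predicate)
      && sameTerm(this.object, other.object)
      && sameTerm(this.graph, other.graph);
  }
}

const node = (name) => ({ termType: 'NamedNode', value: `http://example.org/${name}` });
const x1 = new Quad(node('s1'), node('p1'), node('o1'));
const x2 = new Triple(node('s1'), node('p1'), node('o1'));
const x3 = new Quad(node('s1'), node('p1'), node('o1'), node('g1'));

console.log(x1.equals(x2));             // false
console.log(x2.equals(x1));             // true
console.log(new Triple(x1).equals(x3)); // true
```

Note that the result depends on which side `equals` is called on, since each class applies its own notion of equality.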

@RubenVerborgh
Member

You call it handy, I call it terribly confusing 😄 Especially reckoning with the fact that JavaScript is untyped, so you'd get x1, x2, x3 as parameters in a method, and we'd see:

x1.equals(x2) => false
x2.equals(x1) => true

And nobody would understand what the heck is going on… 😉

@ericprud

This sort of asymmetry is reasonably common in languages with automatic type promotion. Comparing things as Triples is more lax than comparing them as Quads.

@RubenVerborgh
Member

Sure, but it is confusing nonetheless.

@bergos
Member

bergos commented Mar 19, 2018

I would be interested in real world use cases, because I think:

  • most of the time you have Quads with a DefaultGraph when you want to handle Triples
  • in the few other use cases it would be acceptable to explicitly set the DefaultGraph

That's my experience of working with the current behavior for 9 months.

@elf-pavlik
Member

I think we should make sure to check with everyone who has already implemented the RDF/JS interface. Possibly @jacoscaz and @dlongley might also want to chime in on how any of the changes suggested in this thread would affect their implementations.

@dlongley

Triples aren't really used in any of the implementations I've worked on as they are primarily related to dataset canonicalization or to JSON-LD (which, as @gkellogg said, is effectively a quads format). After reflecting on that and the comments in this thread, it seems to me that if Triple is to have utility, then it should be considered to be totally lacking a graph component rather than having one (e.g. the default graph) by implication.

In other words, the design for Triple should follow its utility/use cases. If people are using Triple to specifically ignore membership in a particular graph (which is also conceptually congruent) then I'd expect that they really have no such component and @ericprud's comment makes a lot of sense. Users of the interface would need to understand this. Hopefully they would in fact expect this if they were using it to begin with -- because that's what it's for. There's no reason to use it otherwise, right?

@jacoscaz
Contributor

jacoscaz commented Sep 5, 2018

I am catching up on a lot of RDF/JS conversation I have missed.

I understand that a compromise must be made. I personally find it conceptually easier and much cleaner to think of triples as quads living within the default graph, and to generally think of a triple as an incomplete view of a quad. I am wary of modeling for pure triples. Assume we have a dedicated Triple class. How would we store instances of it? Could instances of Triple be handled together with instances of Quad (i.e. arrays and streams containing both)? This could become fairly complicated. That said, I do work on something called quadstore so I guess my perspective might be a little biased.

However, I understand the need to compare the triple part of quads and I think the best way to address that would be a dedicated comparison method in addition to .equals() or a flag for .equals() itself.
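A rough sketch of what such a flag and dedicated method could look like is shown here. The `ignoreGraph` option and the `equalsTriple` method are hypothetical names invented for this example; neither is part of the RDF/JS spec, and the `Quad` class is a simplified stand-in.

```javascript
// Hypothetical sketch: one Quad type, with an opt-in way to compare only s, p, o.
const sameTerm = (a, b) => a.termType === b.termType && a.value === b.value;

class Quad {
  constructor(subject, predicate, object, graph = { termType: 'DefaultGraph', value: '' }) {
    Object.assign(this, { subject, predicate, object, graph });
  }

  // `ignoreGraph` is an assumed option name; by default all four components are compared.
  equals(other, { ignoreGraph = false } = {}) {
    if (!other) return false;
    return sameTerm(this.subject, other.subject)
      && sameTerm(this.predicate, other.predicate)
      && sameTerm(this.object, other.object)
      && (ignoreGraph || sameTerm(this.graph, other.graph));
  }

  // Assumed dedicated method: compares the triple part only.
  equalsTriple(other) {
    return this.equals(other, { ignoreGraph: true });
  }
}

const node = (name) => ({ termType: 'NamedNode', value: `http://example.org/${name}` });
const inDefault = new Quad(node('s'), node('p'), node('o'));
const inNamed = new Quad(node('s'), node('p'), node('o'), node('g'));

console.log(inDefault.equals(inNamed));                        // false
console.log(inDefault.equals(inNamed, { ignoreGraph: true })); // true
console.log(inNamed.equalsTriple(inDefault));                  // true
```

Because both operands are the same type, the result does not depend on which side the method is called on.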

I would personally find anything resulting in the example made by Ruben to be an attempt at scaring people away from RDF/JS by introducing directionality in equality comparisons.

By the way, it feels to me like this discussion should be informed by #117.

@elf-pavlik
Member

elf-pavlik commented Sep 5, 2018

The .triple method alias for .quad is mainly there for people who don't know the concept of quads. I would expect these people to be working only with quads in the default graph, so for them it should not be a problem. We could discuss whether we really need that alias for the mentioned reason, because it's a low-level API and other libraries on top would be used to handle the triples.

I think I would agree with @bergos that higher-level APIs could add some convenience interfaces for triples. In the spec we could remove all that aliasing and just add a NOTE which suggests handling any triple as a quad with graph: factory.defaultGraph()

In RDF.rb, Triples and Quads may be directly compared. A Repository stores quads, with a special designator for the default graph. Any graph within a repository may be projected into a Graph object. Triples emit from a graph, with no associated graph name.

@gkellogg in your experience with RDF.rb, have you seen a use case which requires distinguishing a Triple from a Quad with exactly the same s, p, o and default graph designation?

@jacoscaz thank you for referencing #117, I think that we can not all think of default graph in the same way unless we address that issue!

@jacoscaz
Contributor

jacoscaz commented Sep 6, 2018

I guess there's a difference between

a) the graph that is queried by default; and
b) the graph to which triples (as they stand today, i.e. quads without an explicit graph term) belong.

Both can be seen as a form of default graph but they're clearly not the same thing. The first one should, IMHO, most definitely be the union graph. I echo @timbl 's comment. However, the second one still has to be an actual graph.

I think I would agree with @bergos that higher-level APIs could add some convenience interfaces for triples. In the spec we could remove all that aliasing and just add a NOTE which suggests handling any triple as a quad with graph: factory.defaultGraph()

Agreed! Though I would leave the option to simply leave the graph unspecified and have quad() fill that with defaultGraph(). I would also change the specs to clarify which graph (default graph or union graph) is targeted by .match() if not explicitly specified. My vote goes to the union graph.

@rubensworks
Member

I would also change the specs to clarify which graph (default graph or union graph) is targeted by .match() if not explicitly specified. My vote goes to the union graph.

I agree. In Comunica, we consider a falsy, blank node or variable graph parameter in .match() as the union of all graphs. This seems to work out quite well. Afterwards, when quads are defined, the concept of a union graph is not needed anymore, only the default graph.
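As a simplified illustration of the difference between "no graph given" and "default graph given", the sketch below treats a missing graph term in a match call as the union of all graphs. The `match` function and the sample data are hypothetical, and the sketch only covers the undefined/null case, not the blank node or variable cases mentioned above; it is not Comunica's actual code.

```javascript
// Hypothetical in-memory matcher: an undefined or null pattern term acts as a wildcard.
function match(quads, subject, predicate, object, graph) {
  const fits = (pattern, term) =>
    pattern == null || (pattern.termType === term.termType && pattern.value === term.value);
  return quads.filter((q) =>
    fits(subject, q.subject)
    && fits(predicate, q.predicate)
    && fits(object, q.object)
    && fits(graph, q.graph));
}

const node = (name) => ({ termType: 'NamedNode', value: `http://example.org/${name}` });
const defaultGraph = { termType: 'DefaultGraph', value: '' };

const store = [
  { subject: node('a'), predicate: node('b'), object: node('c'), graph: defaultGraph },
  { subject: node('x'), predicate: node('y'), object: node('z'), graph: node('g') },
];

console.log(match(store).length);                                         // 2: union of all graphs
console.log(match(store, undefined, undefined, undefined, defaultGraph).length); // 1: default graph only
console.log(match(store, node('x')).length);                              // 1: found although it lives in <g>
```

Once a quad comes back from the matcher it carries a concrete graph term again, so the union graph only exists on the query side.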

@k00ni
Contributor

k00ni commented Nov 22, 2018

Did you reach a compromise here? If so, it would be great if someone could summarize it.

I am asking because I am interested in the implications for an implementation of the specification (ref #130).

Thanks in advance

@k00ni
Contributor

k00ni commented Nov 27, 2018

I think it's also important to mention that DataFactory::triple forces a Triple to have DefaultGraph as the value of the graph parameter. That means you cannot use DataFactory to create instances of Triple that have an unset graph.

http://rdf.js.org/#datafactory-interface

@elf-pavlik
Member

Spec mentions it in the section you linked to:

triple() returns a new instance of Quad with graph set to DefaultGraph.

Do you see a need to add an explicit statement that it MUST NOT leave graph undefined?

@k00ni
Contributor

k00ni commented Nov 27, 2018

@elf-pavlik wrote:

Do you see a need to add an explicit statement that it MUST NOT leave graph undefined?

In my opinion, it's not necessary to mention that explicitly, because the RDF/JS spec is clear enough about what to expect. All fine here. 👍

But ...

I have a different point of view on the whole Triple issue, based on the experience with my PHP implementation of the RDF/JS specification. (My implementation of the Data Interfaces is mostly class-based. There are no interfaces. The reason is that you can use them more easily.) Nevertheless, if you allow Triple instances to exist in general, you should also allow their creation whenever it's suitable. Having a function called triple which in fact returns an instance of Quad doesn't seem that intuitive (from a developer's point of view). I would expect not only that the return value is an instance of Triple, but also to be able to set the quad parameter myself. There should be the same freedom as if I instantiated the class myself.

My proposal

  1. triple function: DefaultGraph should not be forced to be the default for the graph parameter. Default for graph should be null or a valid instance of Term (reusing the same rules as for the quad parameter in Quad).
  2. DataFactory should be able to create instances of ALL classes/instances you have in the portfolio. Therefore triple must return an instance of type Triple

I'd like to be able to use triples without any graph info.

What do you think?

@elf-pavlik
Member

I think again this relates to #117 and if implementations should consider an instance of Quad with undefined or null graph as equal to quad with graph set to an instance of DefaultGraph

What do you think about this comment above #124 (comment)

@k00ni
Contributor

k00ni commented Nov 28, 2018

@elf-pavlik wrote:

I think again this relates to #117 and if implementations should consider an instance of Quad with undefined or null graph as equal to quad with graph set to an instance of DefaultGraph

Let's have some pseudo code:

var t = new Triple (s, p, o);

Triple t has no graph information, therefore t.graph == null should be true. If the default for graph is an instance of DefaultGraph, you would assume something that the developer may not be aware of or may not want.

I'd like to leave that to the developer. For instance, if you want all triples to be part of the DefaultGraph, you can do that. If you don't care about quads in general, you can use only Triple without graph information. This approach should be consistent with the RDF specification.

@RubenVerborgh wrote:

You call it handy, I call it terribly confusing. Especially reckoning with the fact that JavaScript is untyped, so you'd get x1, x2, x3 as parameters in a method, and we'd see:

x1.equals(x2) => false
x2.equals(x1) => true

And nobody would understand what the heck is going on…

This is about the equals function used to compare a Triple with a Quad. In JavaScript you may need a different approach to implement that than, for instance, in PHP (which allows you to set a parameter type).

Therefore, I would like to propose a small change in the specification. Instead of

equals() returns true if and only if the argument is a) of the same type b) has all components equal.

I would rephrase it to:

equals() returns true if and only if the argument other is a) an instance of Quad b) has all components equal.

(a) includes also the case that other is an instance of Triple.

@blake-regalia
Contributor Author

blake-regalia commented Jan 21, 2019

This was recently brought up again on the call and here are some of the main points that were mentioned about this topic:

  • The semantics of triple imply that there is no graph component; however the factory returns a quad with a graph. This is not a triple.
  • Aliases do not belong in specs.
  • The lack of a .triple method in the spec is not that confusing; a simple note about quads and triples would suffice.
  • If factory implementations want to have their own .triple() function, should they be allowed to do so and what about interoperability?

@RubenVerborgh
Member

* Aliases do not belong in specs.

That's not an argument we should weigh; I could similarly say that "JavaScript specs do not define an equal method for triples". The question is: does it make sense for us to do so?

@blake-regalia
Contributor Author

That's not an argument we should weigh;

Sorry but I think my concise bullet point here does not do justice to the full argument that was made on the call and that @vhf can elaborate.

@elf-pavlik
Member

elf-pavlik commented Jan 21, 2019

I lean towards removing Triple and .triple(), clarifying that if one omits the optional graph in .quad() it will assign an instance of DefaultGraph as the default value, and just adding a note that one can create a 'triple' this way.

This section in RDF1.1 spec suggests to me that we can consider all triples as 'in default graph' of the dataset: https://www.w3.org/TR/rdf11-concepts/#section-dataset-conneg

Web resources may have multiple representations that are made available via content negotiation [WEBARCH]. A representation may be returned in an RDF serialization format that supports the expression of both RDF datasets and RDF graphs. If an RDF dataset is returned and the consumer is expecting an RDF graph, the consumer is expected to use the RDF dataset's default graph.

Having equals() return false between a triple with an undefined graph and a quad with the same s, p, o and an instance of DefaultGraph as g makes no sense to me, and it can very likely lead to unexpected behavior in applications.

@awwright
Member

In my view the Quad serves an entirely different purpose than Triple: to say a specific Triple exists in a given graph. There are reasons to have both, and to use one instead of the other. Sometimes I want to be able to talk about a particular Triple without implying membership in a graph.

This is important, because when we use an IRI to name things, it implies universal uniqueness. So if we're going to adopt Quad as the only way to talk about RDF statements, that means all of our applications have to agree on what URIs to give graphs. This isn't always desirable.

Sometimes I just want to talk about two or more graphs without giving them any unique name. Say, I want to read two Turtle documents and diff them. Sure we can create multiple Datasets and put everything into the "default graph", but this muddies the semantics. What does it mean to test if two datasets are isomorphic, for example? I have to add an additional test case in my application to ensure there are no named Quads in my data, a feature I could have gotten for free if I was just using Triple. (And what if I want to name my Datasets, is it turtles all the way down?)
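As a small illustration of the diff use case, the sketch below compares two already-parsed documents purely as sets of subject/predicate/object, with no graph term involved. The `tripleKey` and `diffTriples` helpers and the sample data are hypothetical, and the comparison is a plain set difference: it assumes named nodes only and does not attempt blank node isomorphism.

```javascript
// Hypothetical helpers: triples are plain { subject, predicate, object } objects with no graph.
const node = (name) => ({ termType: 'NamedNode', value: `http://example.org/${name}` });

// Build a string key from the three terms so triples can be stored in a Set.
const tripleKey = ({ subject, predicate, object }) =>
  JSON.stringify([subject, predicate, object].map((t) => [t.termType, t.value]));

// Plain set difference in both directions; blank node isomorphism is out of scope here.
function diffTriples(left, right) {
  const leftKeys = new Set(left.map(tripleKey));
  const rightKeys = new Set(right.map(tripleKey));
  return {
    removed: left.filter((t) => !rightKeys.has(tripleKey(t))),
    added: right.filter((t) => !leftKeys.has(tripleKey(t))),
  };
}

const docA = [
  { subject: node('a'), predicate: node('b'), object: node('c') },
  { subject: node('a'), predicate: node('b'), object: node('d') },
];
const docB = [
  { subject: node('a'), predicate: node('b'), object: node('c') },
  { subject: node('x'), predicate: node('y'), object: node('z') },
];

const { removed, added } = diffTriples(docA, docB);
console.log(removed.map((t) => t.object.value)); // [ 'http://example.org/d' ]
console.log(added.map((t) => t.object.value));   // [ 'http://example.org/z' ]
```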

Only using Quad changes who manages the URI/IRI namespaces for graphs. In a normal Turtle parser, I get a stream of Triples and get to decide which graph to put it into, if at all. But if I get back a Quad, that decision must be made by the parser, and if it's the wrong one, I have to change it. This invites bugs, at best.

As far as I'm aware, the Quad and named graph was a concept invented by SPARQL to serve as a substitute for files on a filesystem, or multiple Graph structures in application memory. But these limitations don't exist in most languages, so I don't think it makes sense to confine ourselves to the same.

@RubenVerborgh
Member

RubenVerborgh commented Jan 21, 2019 via email

@elf-pavlik
Member

elf-pavlik commented Jan 22, 2019

that means all of our applications have to agree on what URIs to give graphs

.quad() should default to instance of DefaultGraph for graph

In a normal Turtle parser, I get a stream of Triples and get to decide which graph to put it into, if at all. But if I get back a Quad, that decision must be made by the parser, and if it's the wrong one, I have to change it.

I think a Turtle parser would currently always assign an instance of DefaultGraph to graph; the Source interface doesn't provide a way to set it, unless we go with the 'options' parameter from #44.
@RubenVerborgh and @blake-regalia, in your Turtle parsers, could you see an options parameter that includes an IRI which the parser would use as the graph value in emitted quads?

Say, I want to read two Turtle documents and diff them. Sure we can create multiple Datasets and put everything into the "default graph", but this muddies the semantics.

How do you imagine having two distinct graphs in one dataset without using named graphs?

@awwright
Member

Sure, but note N-Quads and TriG have additional considerations compared to Triple-based formats like Turtle:

In N-Triples and Turtle, the graph is whatever I say it is (usually, it's going to be the URI I downloaded the file at).

In N-Quads and TriG, the graph is whatever the document says it is. If I'm importing a TriG document, I might need to do additional checking to verify the entity uploading the document is an authority for each of the named graphs it mentions.

These are valid, but very different, use cases that (I believe) demonstrate the need for a Triple that is not also a Quad.

While there's a need for Quad and Dataset, I think that, offered a choice of both, most data exchange would happen in a Triple format. Most popular data formats are even simpler than that, and happen in a single acyclic tree. (Think JSON vs. YAML vs. XML.) I can testify, in my work for JSON Schema, the biggest problem we have is explaining references (which are named with URIs, and allow cyclical references).

@elf-pavlik
Member

AFAIK this section I quoted above came out of JSON-LD work, where representation can have just one (default) graph or dataset with many named graphs and no more than one default graph.
https://www.w3.org/TR/rdf11-concepts/#section-dataset-conneg


In N-Quads and TriG, the graph is whatever the document says it is. If I'm importing a TriG document, I might need to do additional checking to verify the entity uploading the document is an authority for each of the named graphs it mentions.

I think an application would do that while receiving quads from a Source interface (e.g. a parser), but I wouldn't expect the source itself to handle it in any way.

In N-Triples and Turtle, the graph is whatever I say it is (usually, it's going to be the URI I downloaded the file at).

If the representation of a dataset includes a default graph, one can decide to name it, just as one can decide to name a graph when the representation only includes a graph.

It makes a lot of sense to me to just consider representations in Turtle, N-Triples, RDFa and RDF/XML as equivalent to representations in TriG, N-Quads and JSON-LD which only include default graphs and no named graphs.

Once again, based on https://www.w3.org/TR/rdf11-concepts/#section-dataset-conneg it seems that the default graph of a TriG representation should be treated exactly the same way as a Turtle representation when we get them by dereferencing the same IRI.

@bergos
Member

bergos commented Jan 22, 2019

It's always possible to do further stream processing to set the graph of a Quad coming from a parser where there is no graph defined and DefaultGraph is used.
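A minimal sketch of such a post-processing step, using Node's built-in object-mode streams, could look like this. The `retarget` and `intoGraph` helpers, the sample quad and the target IRI are all hypothetical; a real implementation would probably rebuild the quad through its data factory rather than copy a plain object.

```javascript
const { Readable, Transform } = require('node:stream');

const isDefaultGraph = (term) => term.termType === 'DefaultGraph';

// Move a quad from the default graph into `graph`; quads that already name a graph pass through.
// Copying with spread is a simplification that only suits plain-object quads.
const retarget = (quad, graph) => (isDefaultGraph(quad.graph) ? { ...quad, graph } : quad);

// Object-mode transform that applies `retarget` to every quad flowing through it.
const intoGraph = (graph) => new Transform({
  objectMode: true,
  transform(quad, _encoding, callback) {
    callback(null, retarget(quad, graph));
  },
});

const node = (value) => ({ termType: 'NamedNode', value });

// Stand-in for what a Turtle parser would emit: quads in the default graph.
const parsed = [{
  subject: node('http://example.org/s'),
  predicate: node('http://example.org/p'),
  object: node('http://example.org/o'),
  graph: { termType: 'DefaultGraph', value: '' },
}];

Readable.from(parsed)
  .pipe(intoGraph(node('http://example.org/doc')))
  .on('data', (quad) => console.log(quad.graph.value)); // http://example.org/doc
```

The parser itself stays unaware of the target graph; the consumer decides where the statements end up.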

Having a Triple interface which is different to a Quad interface makes things just very complicated. In the call nobody disagreed about that. So the main question was, if we should keep the Triple alias and the .triple method in the spec or remove it. I have the feeling that the absence of .triple will confuse people, so I'm for keeping. But if we decide that we remove it, I would keep the .triple method till the next major version and write a warning to the console. I don't think it's a good idea that some libraries/factories have the .triple and others don't. That would be a bad experience for interoperability.
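The transition path described above (keep .triple until the next major version, but warn) could be sketched roughly as follows. The `createFactory` function, the warning text and the injectable `warn` callback are assumptions made for this example and not part of any existing factory.

```javascript
// Hypothetical factory sketch: .triple() stays available but logs a deprecation warning once.
function createFactory(warn = console.warn) {
  let warned = false;

  const api = {
    defaultGraph: () => ({ termType: 'DefaultGraph', value: '' }),

    // The graph argument is optional and falls back to the default graph.
    quad(subject, predicate, object, graph = api.defaultGraph()) {
      return { subject, predicate, object, graph };
    },

    // Deprecated alias kept for compatibility until the next major version.
    triple(subject, predicate, object) {
      if (!warned) {
        warned = true;
        warn('triple() is deprecated; use quad() without a graph argument instead.');
      }
      return api.quad(subject, predicate, object);
    },
  };

  return api;
}

const node = (name) => ({ termType: 'NamedNode', value: `http://example.org/${name}` });
const factory = createFactory();

const statement = factory.triple(node('s'), node('p'), node('o')); // warns on first use only
console.log(statement.graph.termType); // DefaultGraph
```

Warning only once keeps the console usable for code that creates many statements in a loop.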

Let's vote on this comment till Sunday (2019-01-27). 👍 for keeping .triple 👎 for removing it.

If somebody has another proposal, please leave a comment and if it gets a sufficient number of votes, we can also discuss it.

@RubenVerborgh
Member

👀 = no strong feelings either way

@rubensworks
Member

Another reason for removal: .triple violates the DRY principle at the interface level, as it offers you a second way (next to .quad) of doing the same thing.

@elf-pavlik
Member

I've made PR #153

@elf-pavlik elf-pavlik added this to the data-model-spec milestone Feb 7, 2019