New section about selective referential transparency #209

Merged
merged 18 commits into from
Oct 1, 2021

Conversation

hartig
Collaborator

@hartig hartig commented Sep 22, 2021

This PR is meant to address issue #202 by adding a new section (Sec. 6.4.6) about selective referential transparency. The section includes both i) a concrete proposal for how users can indicate that a property is a so-called transparency-enforcing property (i.e., quoted triples are meant to be referentially transparent when used in nested triples with such a property; see example in the new text) and ii) a concrete proposal to define the semantics of such transparency-enforcing properties on top of our RDF-star semantics (by which referential opacity is the default).

(this PR is joint work by @pchampin and me)

/cc @rat10



@hartig hartig added the semantics and vocabulary labels Sep 22, 2021
@afs
Collaborator

afs commented Sep 22, 2021

Minor comment: "enforcing" seems to me to imply "only usage". The non-transparency usage is valid, RDF being monotonic.

"permitting"? "allowing"?

Or to keep the E : "enabling"?

@hartig
Collaborator Author

hartig commented Sep 22, 2021

Properties identified as TEPs are indeed meant to enforce referential transparency. Note, however, that this does not affect any other properties. In other words, for a property that, within a given RDF-star graph, is not explicitly identified to be a TEP, referential transparency is not enforced and, instead, the default is used (referential opacity).
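To make the behaviour described here concrete, the following is a minimal Python sketch, not the model-theoretic definition from the PR. It assumes a toy representation (triples as tuples, quoted triples as nested tuples, co-denotation given as explicit pairs); the names `:saidBy`, `:superman`, `:clark` and all helper functions are hypothetical.

```python
# Toy model (assumption): a triple is a (subject, predicate, object) tuple and
# a quoted triple is a nested tuple in subject or object position.
RDF_TYPE = "rdf:type"
TEP = "rdf-star:TransparencyEnablingProperty"


def teps(graph):
    """Properties explicitly declared as TEPs in this graph."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == TEP}


def substitute(term, old, new):
    """Replace `old` by `new` inside a (possibly nested) quoted triple."""
    if isinstance(term, tuple):
        return tuple(substitute(t, old, new) for t in term)
    return new if term == old else term


def transparent_variants(graph, same_as):
    """Triples obtained by swapping co-denoting terms inside quoted triples,
    but only where the enclosing predicate is a declared TEP."""
    declared = teps(graph)
    derived = set()
    for (s, p, o) in graph:
        if p not in declared:
            continue  # default: quoted triples stay referentially opaque
        for a, b in same_as:
            for old, new in ((a, b), (b, a)):
                s2 = substitute(s, old, new) if isinstance(s, tuple) else s
                o2 = substitute(o, old, new) if isinstance(o, tuple) else o
                candidate = (s2, p, o2)
                if candidate not in graph:
                    derived.add(candidate)
    return derived


quoted = (":superman", ":can", ":fly")
graph = {(quoted, ":saidBy", ":lois")}
same_as = [(":superman", ":clark")]

# Without a TEP declaration nothing is derived (opacity is the default).
opaque = transparent_variants(graph, same_as)

# Declaring :saidBy as a TEP in the same graph enables the substitution.
graph_with_tep = graph | {(":saidBy", RDF_TYPE, TEP)}
enabled = transparent_variants(graph_with_tep, same_as)
```

In this sketch the declaration only affects triples whose predicate is the declared property; every other predicate in the graph keeps the opaque default.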

@hartig
Collaborator Author

hartig commented Sep 22, 2021

Notice also that enforcement of referential transparency based on a TEP is only local to the graph(s) in which the TEP is stated to be a TEP (or where this statement can be inferred as per the entailment regime considered).

@afs
Collaborator

afs commented Sep 22, 2021

Let me ask it another way - if they "enforce" what else do they stop? (in the local graph)

@hartig
Collaborator Author

hartig commented Sep 22, 2021

They don't stop anything else. However, I don't mean to insist on using the word "enforcing".

Considering the alternatives that you have proposed, "permitting" is too weak I think; it sounds more like a possibility rather than a guarantee that referential transparency will indeed be used for TEPs.

"enabling" is okay for me; it can be understood as switching on the usage of referential transparency which is disabled by default.

What do other native speakers think? (@gkellogg @TallTed)

@TallTed
Member

TallTed commented Sep 22, 2021

@hartig -- Having only quickly read the comments (not the actual PR, yet), I think "transparency-enabling" conveys what you meant when you said "transparency-enforcing".

cg-spec/editors_draft.html: 5 review comments (outdated, resolved)
Member

@TallTed TallTed left a comment

Seems to work... with the few tweaks noted above, including the externally discussed change of "enforcing" to "enabling".

@gkellogg
Member

@hartig -- Having only quickly read the comments (not the actual PR, yet), I think "transparency-enabling" conveys what you meant when you said "transparency-enforcing".

I agree, if it were "enforcing", that would imply some mechanism to ensure that such entailments were realized. It would also imply a specific mechanism whereby the graph would be determined to be somehow invalid if it were determined not to be transparent. "Transparency-enabling" seems more appropriate, since I believe that is what entailments generally provide.

hartig and others added 2 commits September 22, 2021 20:58
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
@rat10

rat10 commented Sep 22, 2021

This proposal seems to tackle only referentially transparent types. What about referentially transparent occurrences? Is this vocabulary applicable to occurrences as in EXAMPLE 8?

I can also imagine a more statement-focused approach. Again referring to EXAMPLE 8, a further type:

_:a :occurrenceOf << :s :p :o >> ;
    :in <file1.ttl> ;
    a :transparencyEnabledEmbeddedTriple ;
    dct:creator :alice.

or, shorter, a combined declaration of semantics is thinkable:

_:a :referentiallyTransparentOccurrenceOf << :s :p :o >> ;
    :in <file1.ttl> ;
    dct:creator :alice.

This would not necessarily be instead of, but in addition to, the approach you propose. IMO this would still be prohibitively cumbersome but at least more in line with common modelling practices.

Would it be possible to declare referential transparency on all quoted triples per graph? Did you consider that as an option?

@rat10

rat10 commented Sep 22, 2021

Notice also that enforcement of referential transparency based on a TEP is only local to the graph(s) in which the TEP is stated to be a TEP (or where this statement can be inferred as per the entailment regime considered).

I don't find this explicitly mentioned in the prose. It would be useful to point it out as explicitly to the readers of the report (who are surely not all semanticists who can easily extract this information from the model-theoretic definitions) as to the members of this CG.

@rat10

rat10 commented Sep 22, 2021

In the proposal I read:
Note, however, that the group has not reached consensus on the definition or usefulness of such a vocabulary.
This is a bit lame. More to the point would be something like: "We always promised that it would be easy to derive referentially transparent embedded triples from referentially opaque ones but this is all we were able to come up with when pressed repeatedly." Maybe you can rephrase before I try in earnest?

… TEPs is local to the graph(s) in which they are stated to be TEPs
@hartig
Collaborator Author

hartig commented Sep 23, 2021

Thomas,

This proposal seems to tackle only referentially transparent types. What about referentially transparent occurrences? Is this vocabulary applicable to occurrences as in EXAMPLE 8?

Making statements about occurrences of triples and referential transparency vs opacity are orthogonal issues. The vocabulary introduced in this PR is about the latter. As such, it gives you the means to state that the quoted triple in the following statement from your comment is meant to be referentially transparent within that statement.

_:a :referentiallyTransparentOccurrenceOf << :s :p :o >> .

To achieve this, you only have to add the following triple to your data:

:referentiallyTransparentOccurrenceOf rdf:type rdf-star:TransparencyEnablingProperty .

However, what the vocabulary in this PR does not give you is a means to state anything about triple occurrences. That's not the purpose of the proposal in this PR because it is an orthogonal issue. As I can already foresee that you won't be happy with this answer, let me strongly suggest keeping the discussion in this PR on the topic of the proposal made here (namely, the topic of supporting selective referential transparency) and moving the discussion of triple occurrences elsewhere (e.g., #169).

Would it be possible to declare referential transparency on all quoted triples per graph? Did you consider that as an option?

Given an RDF-star graph G, you can take all predicates of all the triples in G (asserted and quoted) that have a quoted triple as subject or object and, then, for each of these predicates (more precisely, the properties that are these predicates), add a triple that states that this property is of rdf:type rdf-star:TransparencyEnablingProperty. That way, all quoted triples in G become referentially transparent within all the triples in which they occur.
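The procedure just described can be sketched in Python as follows. This is only an illustration under stated assumptions: triples are modelled as tuples, quoted triples as nested tuples, and the function names and the sample properties (`:statedBy`, `:certainty`) are hypothetical.

```python
# Toy model (assumption): triples are (s, p, o) tuples; a quoted triple is a
# nested tuple used as subject or object.
RDF_TYPE = "rdf:type"
TEP = "rdf-star:TransparencyEnablingProperty"


def is_quoted(term):
    return isinstance(term, tuple)


def predicates_over_quoted(triple):
    """Yield the predicate of every triple (asserted or quoted, at any depth)
    that has a quoted triple as its subject or object."""
    s, p, o = triple
    if is_quoted(s) or is_quoted(o):
        yield p
    for term in (s, o):
        if is_quoted(term):
            yield from predicates_over_quoted(term)


def declare_all_teps(graph):
    """Return the graph extended with one TEP declaration per such predicate."""
    preds = {p for t in graph for p in predicates_over_quoted(t)}
    return set(graph) | {(p, RDF_TYPE, TEP) for p in preds}


inner = (":s", ":p", ":o")
annotated = (inner, ":certainty", "0.8")
graph = {(annotated, ":statedBy", ":alice"), (":a", ":b", ":c")}
extended = declare_all_teps(graph)
```

Here `:statedBy` and `:certainty` receive a declaration because each has a quoted triple as subject, while `:p` and `:b` do not.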

I am not sure whether this answer addresses your question because I don't know what exactly you mean here when you say "per graph". If you mean something else than what I am considering in my answer, please elaborate a bit more what exactly you mean with this question.

Notice also that enforcement of referential transparency based on a TEP is only local to the graph(s) in which the TEP is stated to be a TEP (or where this statement can be inferred as per the entailment regime considered).

I don't find this explicitly mentioned in the prose. Would be useful to point it out as explicitly to the readers of the report

Good point. Done now. See commit 875e36d

In the proposal I read:
Note, however, that the group has not reached consensus on the definition or usefulness of such a vocabulary.
This is a bit lame. More to the point would be something like: "We always promised that it would be easy to derive referentially transparent embedded triples from referentially opaque ones but this is all we were able to come up with when pressed repeatedly." Maybe you can rephrase before I try in earnest?

The sentence that you quote is an artifact of a first version of this whole new section in which we did not yet have the actual proposal. I think this sentence can be removed now because the PR introduces such a vocabulary, including a formal definition of its semantics.

@rat10

rat10 commented Sep 23, 2021

Would it be possible to declare referential transparency on all quoted triples per graph? Did you consider that as an option?

Given an RDF-star graph G, you can take all predicates of all the triples in G (asserted and quoted) that have a quoted triple as subject or object and, then, for each of these predicates (more precisely, the properties that are these predicates), add a triple that states that this property is of rdf:type rdf-star:TransparencyEnablingProperty. That way, all quoted triples in G become referentially transparent within all the triples in which they occur.

I am not sure whether this answer addresses your question because I don't know what exactly you mean here when you say "per graph". If you mean something else than what I am considering in my answer, please elaborate a bit more what exactly you mean with this question.

With "per graph" I mean the same as you when you say "local to the graph". Does that answer your question?
Your answer elaborates on the per your proposal obvious but also very cumbersome option of declaring every single property in a graph as TEP. However judging by the use cases collected by this CG, by the seminal example, by RDF-star tutorials all over the web, by looking at typical Labeled Property Graphs etc., only very few use cases work on referentially opaque types. For all those use cases, users might find it convenient if they did not have to go through the trouble of declaring each and every property they use as TEP just because some niche uses like explainable AI require referential opacity. They might find it less burdensome to declare per graph or even per dataset that they don't require such fanciness and get along well with good old referential transparency. Or they might be inclined to just not care at all if they feel that the proposed semantics just creates a lot of trouble and burdens for them but doesn't provide them much in return. Just an idea...

@rat10

rat10 commented Sep 23, 2021

This proposal seems to tackle only referentially transparent types. What about referentially transparent occurrences? Is this vocabulary applicable to occurrences as in EXAMPLE 8?

Making statements about occurrences of triples and referential transparency vs opacity are orthogonal issues. The vocabulary introduced in this PR is about the latter. As such, it gives you the means to state that the quoted triple in the following statement from your comment is meant to be referentially transparent within that statement.

_:a :referentiallyTransparentOccurrenceOf << :s :p :o >> .

To achieve this, you only have to add the following triple to your data:

:referentiallyTransparentOccurrenceOf rdf:type rdf-star:TransparencyEnablingProperty .

However, what the vocabulary in this PR does not give you is a means to state anything about triple occurrences. That's not the purpose of the proposal in this PR because it is an orthogonal issue. As I can already foresee that you won't be happy with this answer, let me strongly suggest keeping the discussion in this PR on the topic of the proposal made here (namely, the topic of supporting selective referential transparency) and moving the discussion of triple occurrences elsewhere (e.g., #169).

As both issues are orthogonal my question would by your definition be as wrong in #169 as it is (by your definition) here. Technically the two issues are certainly unrelated. In practice however they are not. To the contrary as referentially transparent occurrences they represent a very common, even predominant use case as everybody should be well aware of by now. I fear it is quite confusing to have to implement such use cases by two orthogonal techniques: a per-embedding reference to the occurrence and a per-property reference to its referentially transparent representation. But you seem to agree that defining referential transparency on a per-occurrence basis is possible. Extending your example with the following line should give a complete solution:

:referentiallyTransparentOccurrenceOf rdfs:subPropertyOf :occurrenceOf .

May I suggest that you extend your proposal accordingly? IMO this would make common use cases a lot more straightforward than the TEP approach alone.

[EDIT] Rereading I noticed your "strong suggestion", so moving this to #169 now.

@rat10 rat10 mentioned this pull request Sep 24, 2021
@hartig
Collaborator Author

hartig commented Sep 25, 2021

@rat10

Your answer elaborates on the per your proposal obvious but also very cumbersome option of declaring every single property in a graph as TEP. However [...] only very few use cases work on referentially opaque types. For all those use cases, users might find it convenient if they did not have to go through the trouble of declaring each and every property they use as TEP just because some niche uses like explainable AI require referential opacity.

I interpret this response of yours as follows: You admit that my answer has indeed provided a solution to the problem posed in your question, but you don't like this solution simply because you didn't and still don't like the opacity-by-default semantics in the first place. Is this an accurate interpretation of your response? If yes, I can live with that.

Notice, by the way, that the declaration of properties to be TEPs does not have to be repeated explicitly in every graph. By using properties from a vocabulary that defines these properties to be TEPs, you would get these TEPs "for free."

They might find it less burdensome to declare per graph or even per dataset that they don't require such fanciness and get along well with good old referential transparency.

I would be absolutely fine with extending the vocabulary proposed in this PR with another class (e.g., rdf-star:TransparencyEnabledGraph) that allows users to make such statements. The only trouble is that defining the semantics of this class would require a well-defined means to refer to a graph or a dataset from within an RDF triple, and defining such a means goes way beyond the scope of working on a spec about RDF-star.

@rat10

rat10 commented Sep 25, 2021

@rat10

Your answer elaborates on the per your proposal obvious but also very cumbersome option of declaring every single property in a graph as TEP. However [...] only very few use cases work on referentially opaque types. For all those use cases, users might find it convenient if they did not have to go through the trouble of declaring each and every property they use as TEP just because some niche uses like explainable AI require referential opacity.

I interpret this response of yours as follows: You admit that my answer has indeed provided a solution to the problem posed in your question, but you don't like this solution simply because you didn't and still don't like the opacity-by-default semantics in the first place. Is this an accurate interpretation of your response? If yes, I can live with that.

I have not made my peace with the semantics of embedded statements defaulting to referential opacity but my topic here is if it is reasonably easy to describe referentially transparent embedded statements (which are the norm in practice). It was an important argument pro the proposed semantics that this would be easy, so the two issues are indeed connected. The solution you propose doesn't meet my expectations not because I don't like the problem but because I don't like the solution you propose.

Notice, by the way, that the declaration of properties to be TEPs does not have to be repeated explicitly in every graph. By using properties from a vocabulary that defines these properties to be TEPs, you would get these TEPs "for free."

Now that you mention it, I'm pondering if that doesn't come dangerously near to two modes that are very hard to distinguish. But I digress...

They might find it less burdensome to declare per graph or even per dataset that they don't require such fanciness and get along well with good old referential transparency.

I would be absolutely fine with extending the vocabulary proposed in this PR with another class (e.g., rdf-star:TransparencyEnabledGraph) that allows users to make such statements. The only trouble is that defining the semantics of this class would require a well-defined means to refer to a graph or a dataset from within an RDF triple, and defining such a means goes way beyond the scope of working on a spec about RDF-star.

I'm well aware that you're very reluctant to tackle that topic but it should have become clear by now that it can't be avoided completely when defining a syntax and semantics for statement annotation. We have the :in property already (which initially was called :inGraph IIRC) and we have the stub for a term rdf-star:Source. It should be possible to derive a plausible proposal from that basis without solving all the problems about graphs as mathematical abstractions, named graphs etc in RDF. Such a proposal would give the user - especially those that are no fans of the TEP approach - an alternative they can better live with. So it might be worth the effort...

@hartig
Collaborator Author

hartig commented Sep 27, 2021

@rat10

I have not made my peace with the semantics of embedded statements defaulting to referential opacity but my topic here is if it is reasonably easy to describe referentially transparent embedded statements (which are the norm in practice). It was an important argument pro the proposed semantics that this would be easy, so the two issues are indeed connected. The solution you propose doesn't meet my expectations not because I don't like the problem but because I don't like the solution you propose.

I see. So, at least you agree that it is a solution, even if you don't like this solution, right?

Notice, by the way, that the declaration of properties to be TEPs does not have to be repeated explicitly in every graph. By using properties from a vocabulary that defines these properties to be TEPs, you would get these TEPs "for free."

Now that you mention it, I'm pondering if that doesn't come dangerously near to two modes that are very hard to distinguish. But I digress...

I don't think so. The transparency-enabling semantics of the properties would be clearly defined in the vocabulary and nothing needs to be communicated out of band (as would have been the case had we introduced SA mode and PG mode based on the same syntax).

By using a particular vocabulary, users are committing themselves to the semantics of that vocabulary. I think this is not very different from using, e.g., foaf:Person as the rdf:type of things in my data; by the definition of the FOAF vocabulary (and assuming RDFS entailment), these things would also be of the types foaf:Agent and geo:SpatialThing.

I'm well aware that you're very reluctant to tackle that topic but it should have become clear by now that it can't be avoided completely when defining a syntax and semantics for statement annotation. We have the :in property already (which initially was called :inGraph IIRC) and we have the stub for a term rdf-star:Source. It should be possible to derive a plausible proposal from that basis without solving all the problems about graphs as mathematical abstractions, named graphs etc in RDF.

Can you give it a try? (perhaps in a follow-up PR once this one here is merged)

@pchampin
Collaborator

FTR, I am not at all convinced that the "per graph transparency" has any practical use case, but for the sake of the argument: here is an even simpler solution. Include the following triple in your graph:

rdf:Property rdfs:subClassOf rdf-star:TransparencyEnablingProperty.

This effectively turns any property in the graph into a TEP.
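Why this single triple has that effect can be sketched with two standard entailment steps: every predicate of an asserted triple is an rdf:Property (pattern rdfD2 in RDF 1.1 Semantics), and class membership propagates along rdfs:subClassOf (pattern rdfs9). The Python below is a simplified illustration only; it assumes triples as tuples, looks only at predicates of asserted triples, ignores the RDFS axiomatic triples, and uses hypothetical function and property names.

```python
# Toy model (assumption): triples are (s, p, o) tuples; quoted triples are
# nested tuples. Only predicates of asserted triples are considered.
RDF_TYPE = "rdf:type"
RDF_PROPERTY = "rdf:Property"
SUBCLASS_OF = "rdfs:subClassOf"
TEP = "rdf-star:TransparencyEnablingProperty"


def inferred_teps(graph):
    """Properties that are TEPs after applying rdfD2 and rdfs9 to a fixpoint."""
    # rdfD2: the predicate of every asserted triple is an rdf:Property.
    types = {(p, RDF_PROPERTY) for (_, p, _) in graph}
    # Explicit rdf:type statements.
    types |= {(s, o) for (s, p, o) in graph if p == RDF_TYPE}
    subclass = {(s, o) for (s, p, o) in graph if p == SUBCLASS_OF}
    changed = True
    while changed:
        changed = False
        # rdfs9: x type C and C subClassOf D gives x type D.
        for (x, c) in list(types):
            for (sub, sup) in subclass:
                if c == sub and (x, sup) not in types:
                    types.add((x, sup))
                    changed = True
    return {x for (x, c) in types if c == TEP}


graph = {((":s", ":p", ":o"), ":saidBy", ":alice"), (":x", ":knows", ":y")}
before = inferred_teps(graph)

# Adding the single axiom from the comment above.
with_axiom = graph | {(RDF_PROPERTY, SUBCLASS_OF, TEP)}
after = inferred_teps(with_axiom)
```

In this sketch no property is a TEP before the axiom is added, and afterwards every predicate used in an asserted triple of that graph is one, including rdfs:subClassOf itself.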

@rat10

rat10 commented Sep 28, 2021

@rat10

I have not made my peace with the semantics of embedded statements defaulting to referential opacity but my topic here is if it is reasonably easy to describe referentially transparent embedded statements (which are the norm in practice). It was an important argument pro the proposed semantics that this would be easy, so the two issues are indeed connected. The solution you propose doesn't meet my expectations not because I don't like the problem but because I don't like the solution you propose.

I see. So, at least you agree that it is a solution, even if you don't like this solution, right?

Are you trying to twist words in my mouth, Olaf? There are many kinds of solutions: good, bad, elegant, convoluted, easy, cumbersome, defective etc. If one ignores the meaning denoted by the qualifying attribute, even a defective solution would be a solution. The solution you propose is AFAICT not defective but it also is not what I would call a good solution, and also not an easy one.

Notice, by the way, that the declaration of properties to be TEPs does not have to be repeated explicitly in every graph. By using properties from a vocabulary that defines these properties to be TEPs, you would get these TEPs "for free."

Now that you mention it, I'm pondering if that doesn't come dangerously near to two modes that are very hard to distinguish. But I digress...

I don't think so. The transparency-enabling semantics of the properties would be clearly defined in the vocabulary and nothing needs to be communicated out of band (as would have been the case had we introduced SA mode and PG mode based on the same syntax).

Right, it's not that bad as it's defined within the bounds of RDF. It still is not immediately recognizable, like some syntactic measure as for example an extension to the << s p o >> syntax would be, and since it changes a core characteristic of the RDF semantics I still consider this rather dangerous.

By using a particular vocabulary, users are committing themselves to the semantics of that vocabulary. I think this is not very different from using, e.g., foaf:Person as the rdf:type of things in my data; by the definition of the FOAF vocabulary (and assuming RDFS entailment), these things would also be of the types foaf:Agent and geo:SpatialThing.

We have had this discussion already at length in issue #170 (maybe others too, I'm not sure right now) and on the mailing list. I thought that here we just discuss this concrete proposal of yours. So is this really the right place to continue this discussion? Or rather, as you seem to be not impressed at all by the arguments I brought forward already, start from scratch again? Following is a rundown of problems that I see immediately. Most of them I have mentioned already elsewhere. I'm not ready for a thorough analysis yet but I'm already pretty sure that these problems cannot be healed.

There are several problems with your argument. You mix fundamental semantic principles of the semantic web with very task-specific semantics of vocabularies. There are many ways in which the semantics of a "name", a "type", a "date" or any other property can differ and vocabularies fix those variations to an extent deemed practically useful for the domain of application. However the difference between referentially transparent and opaque semantics is a very fundamental one and shouldn't be tackled at the level of domain-specific vocabularies. This is a bad mistake on the architectural level, running counter to the principle of separation of concerns.
As I said before and as you don't seem to realize, your proposal tends to double the vocabulary terms we need on the semantic web when leaving the comfort zone of some quite specific applications like provenance (not to mention the niche use cases the proposed semantics is optimized for). Think about labeled property graphs where absolutely every property is eligible to be used as the predicate of a primary relation or as the secondary attribute to a primary relation. And whenever it is used as a secondary attribute you might need to define a TEP twin of it if you want your annotation to apply not only to the syntactic representation of an assertion but to its meaning, including co-denotations, subclasses etc - all the integration-focused use cases that reasoning enables and that depend on referentially transparent semantics. This procedure is extremely counterintuitive as referring to meanings, not to syntactic representations, is not only the default but really the heart of the semantics of the semantic web, its integration-enabling modus operandi. Yet you expect everybody to think twice when re-using a property and if applicable (which should be almost always if the core assumptions behind the architecture of the semantic web are worth anything at all) use its TEP double. And who will define that twin property if the authors of the vocabulary you want to use don't submit to this approach? Have you talked to editors of popular vocabularies like schema.org what they think of the prospect of minting new TEP properties? With the mechanism you propose everybody can create such TEP variants but how does that approach cope with the obvious problem of balkanization? How many new owl:sameAs statements will we need to create and deal with? Will you propose a standard syntax - like e.g. a "-TEP" suffix for easier findability?
How will queries deal with these new properties - do we from now on have to query for a property and its TEPpy twin if we are not sure if the annotation refers to a quoted or interpreted embedded statement? Etc etc...
Maybe the TEP approach would make sense if RDF-star was only targeted at the use cases the proposed semantics is optimized for: discussing un-endorsed viewpoints, explainable AI, versioning - although I doubt that too. But when considering RDF-star as a means to facilitate statement annotations in general, to provide more elaborate modelling primitives akin to labeled property graphs, to enable easy provenance annotations that respect and value the default semantics of RDF - as RDF* was originally conceived and then advertized at the Berlin Graph workshop - the proposed semantics is just plain poison. It breaks things in unexpected and unjustified ways, badly. And the TEP proposal cannot fix that but only worsens the mess.

These are from the top of my head some problematic aspects of the TEP proposal itself and in the context of the proposed semantics for RDF-star.

I'm well aware that you're very reluctant to tackle that topic but it should have become clear by now that it can't be avoided completely when defining a syntax and semantics for statement annotation. We have the :in property already (which initially was called :inGraph IIRC) and we have the stub for a term rdf-star:Source. It should be possible to derive a plausible proposal from that basis without solving all the problems about graphs as mathematical abstractions, named graphs etc in RDF.

Can you give it a try? (perhaps in a follow-up PR once this one here is merged)

I think that it's yours and Pierre-Antoine's job to give it a try first and develop a coherent and complete proposal, honoring all use cases and exploring plausible usage scenarios. I can ask questions and give feedback but I would really have to bend over backwards to actually work on it. The issues with the TEP approach are one of the problems that come with the proposed semantics and I'm reluctant to put even more time into discussing and solving those. The fundamental problem is that the proposed semantics makes referential opacity the default. The failure of your TEP proposal to heal the ensuing rift is not at all surprising to me.
Also my approach to this issue is more closely related to the occurrence vocabulary which is Pierre-Antoine's work anyway.

@rat10

rat10 commented Sep 28, 2021

FTR, I am not at all convinced that the "per graph transparency" has any practical use case,

Maybe you should have a look at the use cases collected by this CG again. Very few require referential opacity. If they prefer to not disrupt a core feature of the RDF semantics they would benefit from an easy and concise way to switch off the proposed semantics.

but for the sake of the argument: here is an even simpler solution. Include the following triple in your graph:

rdf:Property rdfs:subClassOf rdf-star:TransparencyEnablingProperty.

This effectively turns any property in the graph into a TEP.

The TEP approach has some surprising features. Yet, IIUC (and remembering what you taught me when discussing #170), it is impossible to again turn on referential opacity for individual annotations once the above triple has been introduced. So this solution only gives the option to shut down the proposed semantics entirely. That's better than nothing IMHO but not good enough since if I want to employ referential opacity on individual annotations - and as I'd like to re-assure you I definitely can see good uses for this scenario - I will have to resort to the TEP approach with all its problems for the rest of my annotations.

@ericprud
Member

FTR, I am not at all convinced that the "per graph transparency" has any practical use case, but for the sake of the argument: here is an even simpler solution. Include the following triple in your graph:

rdf:Property rdfs:subClassOf rdf-star:TransparencyEnablingProperty.

This effectively turns any property in the graph into a TEP.

Wouldn't that hit all instances of the property in other graphs?

I find it tempting to use something which allows you to basically express a triple pattern like "anything that was prov:wasAttributedTo by Alice or Bob", e.g.:

# A-box
ex:book1 a ex:Publication .
<< ex:book2 a ex:Article >> prov:wasAttributedTo <Alice> .
<< ex:MITPress ex:publishes ex:book3 >> prov:wasAttributedTo <Bob> .
<< ex:book4 a ex:Publication >> prov:wasAttributedTo <Eve> .

Using OWL for this would put those two quoted triples into a "transparency-enabling" class:

# Transparency strawman
[ owl:onProperty prov:wasAttributedTo ;
  owl:someValuesFrom [
    owl:oneOf (<Alice> <Bob>)
  ]
] rdfs:subClassOf rdf-star:TransparencyEnablingClass .

(and give you a whole lot more power as well).

That, plus some RDFS:

# RDFS
ex:Article rdfs:subClassOf ex:Publication .
ex:publishes rdfs:range ex:Publication .

could allow a SPARQL engine endowed with RDFS-entailment to emit ex:book{1,2,3} for this query:

SELECT ?book { ?book a ex:Publication }
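
A toy Python sketch of that pipeline, under loud assumptions: the "promotion" of quoted triples attributed to Alice or Bob is the strawman's reading and not something this PR defines, triples are plain tuples, and only the RDFS range and subclass rules are hand-rolled.

```python
# Toy sketch of the example above: plain tuples, hand-rolled RDFS rules.
asserted = {("ex:book1", "a", "ex:Publication")}
annotated = [  # (quoted triple, who it is attributed to)
    (("ex:book2", "a", "ex:Article"), "Alice"),
    (("ex:MITPress", "ex:publishes", "ex:book3"), "Bob"),
    (("ex:book4", "a", "ex:Publication"), "Eve"),
]
subclass_of = {"ex:Article": "ex:Publication"}
range_of = {"ex:publishes": "ex:Publication"}
trusted = {"Alice", "Bob"}

def publications():
    # Step 1 (strawman assumption): promote quoted triples from trusted agents.
    kb = set(asserted) | {t for (t, who) in annotated if who in trusted}
    # Step 2: apply the range rule and the subclass rule to a fixpoint.
    while True:
        new = set()
        for (s, p, o) in kb:
            if p in range_of:
                new.add((o, "a", range_of[p]))
            if p == "a" and o in subclass_of:
                new.add((s, "a", subclass_of[o]))
        if new <= kb:
            break
        kb |= new
    # Step 3: SELECT ?book { ?book a ex:Publication }
    return sorted(s for (s, p, o) in kb if p == "a" and o == "ex:Publication")

print(publications())  # ['ex:book1', 'ex:book2', 'ex:book3']
```

ex:book4 is left out because its quoted triple is attributed to Eve, who is not in the trusted set.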

@pchampin
Collaborator

@ericprud I think you are misinterpreting "transparency-enabling" as "auto-asserting", which would be a totally different thing...
As for your question:

Wouldn't that hit all instances of the property in other graphs?

Well, if you merged another graph with this one and reasoned about the result, yes of course, this axiom would propagate to the properties from the other graph. But there is nothing specific to RDF-star or TEPs here. Suppose you asserted foaf:Person owl:equivalentClass :Unicorn in your FOAF profile, and merge it with mine, then you would infer that I am a unicorn.
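
For illustration only, a toy Python sketch of that unicorn example. The node name `:pchampin` and the function name are made up, triples are plain tuples, and only `owl:equivalentClass` is modelled.

```python
def types_after_merge(*graphs):
    """Merge graphs, then propagate rdf:type across owl:equivalentClass."""
    merged = set().union(*graphs)
    equiv = {(s, o) for (s, p, o) in merged if p == "owl:equivalentClass"}
    equiv |= {(b, a) for (a, b) in equiv}  # equivalence is symmetric
    types = {(s, o) for (s, p, o) in merged if p == "rdf:type"}
    changed = True
    while changed:
        changed = False
        for (x, c) in list(types):
            for (a, b) in equiv:
                if c == a and (x, b) not in types:
                    types.add((x, b))
                    changed = True
    return types

yours = {("foaf:Person", "owl:equivalentClass", ":Unicorn")}
mine = {(":pchampin", "rdf:type", "foaf:Person")}

print((":pchampin", ":Unicorn") in types_after_merge(mine))         # False
print((":pchampin", ":Unicorn") in types_after_merge(yours, mine))  # True
```

The axiom only has an effect once the two graphs are merged and reasoned over together, which is the same situation as with a TEP declaration.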

@ericprud
Member

@ericprud I think you are misinterpreting "transparency-enabling" as "auto-asserting", which would be a totally different thing...

Yeah, fair enough. Interestingly, I feel like it's a parametric difference. In my above example, I selectively promoted quoted triples into the KB that the entailment regime worked from. This could be the default graph, or some shadow DB whose inferences were mirrored into the default graph. (The diff being whether an ?s?p?o query on the default graph retrieves those auto-asserted triples.)

For the transparency use case, the assertions go the other way; and Lois says that Clark Kent can fly. The directives to control those could use the same structure, with a little switch to say what goes where.

Some SPARQL DBs treat the default graph as the union of the named graphs. Any such DBs that also support entailment regimes and RDF-Star (speaking of unicorns) would seem to be both universally transparent and universally auto-asserting.

As for your question:

Wouldn't that hit all instances of the property in other graphs?

Well, if you merged another graph with this one and reasoned about the result, yes of course, this axiom would propagate to the properties from the other graph. But there is nothing specific to RDF-star or TEPs here. Suppose you asserted foaf:Person owl:equivalentClass :Unicorn in your FOAF profile, and merge it with mine, then you would infer that I am a unicorn.

You aren't?

I can see practical examples like the one I posted above where you'd want more fine-grained control for auto-assertion. The only use cases I envision for transparency are similar to those for auto-assertion (discovering inferred triples) but for practical reasons, you want that to happen in some named graph rather than the default graph.

For those, I see the use case for having :referentiallyTransparentOccurrenceOf trigger transparency, but it seems reasonable that folks would want to control it with foo:believedBy Alice or Bob (a la my auto-asserting example). Granted, you could use OWL to impute the :referentiallyTransparentOccurrenceOf triples.

OWL is crazy complex and has few users, but profiles of it could be manageable and the on-property/some-values-from pattern comes to mind when saying you want to infer either from or into graphs believed by a group with a known extension.

@pchampin
Collaborator

@ericprud I am not sure I fully grasp this "parametric difference" perspective, I'll have to think about it.

Some SPARQL DBs treat the default graph as the union of the named graphs. Any such DBs that also support entailment regimes and RDF-Star (speaking of unicorns) would seem to be both universally transparent and universally auto-asserting.

As you know, named graphs do not have a standard semantics. I don't know how inference-capable triple-stores handle cross-graph entailment, but I suspect they don't all do it the same way. So discussing RDF-star semantics in that context seems risky.

@pchampin
Collaborator

@rat10

look at the use cases collected by this CG again. Very few require referential opacity.

They do not either require full transparency in my opinion. All of them, I believe, can work with a predefined set of TEPs. Those that are concerned with occurrences would actually only need one TEP: "transparentOccurenceOf", as all other properties would apply on the occurrence node, not the quoted triple itself.

I think that it's your and Pierre-Antoine's job to give it a try first and develop a coherent and complete proposal, honoring all use cases and exploring plausible usage scenarios.

Developing a coherent and complete proposal is the job of the group as a whole -- the job of the editors is to reflect the group's consensus into a consistent document.

My personal opinion is that the current proposal is indeed coherent and complete, or at least reasonably so that we can call it a day and defer to a proper working group. My feeling is that this assessment is generally shared among the group. If you disagree, feel free to make a concrete proposal, on which the group as a whole can decide.

@rat10

rat10 commented Sep 29, 2021

@rat10

look at the use cases collected by this CG again. Very few require referential opacity.

They do not either require full transparency in my opinion. All of them, I believe, can work with a predefined set of TEPs.

That notion of _full_ transparency - vs some set of predefined transparency I assume - is totally new to me and IMO deserves some clarification. I did once check all use cases w.r.t. occurrence vs type and opaque vs referential transparency. Maybe you can do the same thing as a start to put some proof beside your belief. But I'd also caution that use cases tend to repeat well known scenarios and to that end also vocabularies (if they even bother to use something outside the ex: namespace). To truly understand how many properties would need a TEP counterpart and to successfully uphold the claim that it's only a manageable set some more diligence seems necessary.

Those that are concerned with occurrences would actually only need one TEP: "transparentOccurenceOf", as all other properties would apply on the occurrence node, not the quoted triple itself.

Right, and I'm glad to see that as a result declaring an identifier for a referentially transparent occurrence still requires "only" two triples, and I'm glad that I could provide the input that provoked this insight. Yet, IMO this is already too much to be viable in practice (and of course doesn't fit the embedded triples already in use). It's also not the way that your TEP proposal favors to handle things so is not really the core topic here.

I think that it's your and Pierre-Antoine's job to give it a try first and develop a coherent and complete proposal, honoring all use cases and exploring plausible usage scenarios.

Developing a coherent and complete proposal is the job of the group as a whole -- the job of the editors is to reflect the group's consensus into a consistent document.

Maybe you wore the wrong hat when you wrote this. I wasn't referring to you and Olaf as editors (and forgetting Andy and Gregg) but as proponents of the proposed semantics and especially authors of the proposal we are discussing in this issue.

Especially you, Pierre-Antoine, have always claimed and promised that it would be easy to annotate referentially transparent embedded triples, even if the default semantics is one of referentially opaque types. This claim and promise I believe was essential to getting the support of the majority of the community group for the proposed semantics. Now I'm asking you to live up to that promise and claim.

My personal opinion is that the current proposal is indeed coherent and complete, or at least reasonably so that we can call it a day and defer to a proper working group. My feeling is that this assessment is generally shared among the group. If you disagree, feel free to make a concrete proposal, on which the group as a whole can decide.

I have asked for a concrete proposal for a long time now, at least since February IIRC. Now, as of a week ago, we have one, and I have reservations and am asking for clarifications and features. It is not surprising that the proposal isn't finished or complete yet but it is indeed surprising that you already would like to call it a day and let me do the missing work.

Given the few comments this proposal has received so far I also wonder how you can be so sure that the group agrees with your feelings about the coherence and completeness of the proposal.

B.t.w. none of my questions w.r.t. querying have been addressed so far. Queryability is an important factor in usability and user acceptance.

@hartig
Collaborator Author

hartig commented Sep 30, 2021

@rat10 in response to your response from 2 days ago, #209 (comment)

I see. So, at least you agree that it is a solution, even if you don't like this solution, right?

Are you trying to twist words in my mouth, Olaf?

No, I just wanted to get a confirmation that you don't see any technical flaws with the solution presented in this PR.

The transparency-enabling semantics of the properties would be clearly defined in the vocabulary and nothing needs to be communicated out of band (as would have been the case had we introduced SA mode and PG mode based on the same syntax).

Right, it's not that bad as it's defined within the bounds of RDF. It still is not immediately recognizable, [...]

Which is not different from any other usage of an IRI defined in some RDF vocabulary. For instance, coming back to my earlier example with the FOAF vocabulary, given a triple (:Alice, rdf:type, foaf:Person), it is also not immediately recognizable that :Alice is of the types foaf:Agent and geo:Spatial_Thing.

[...] and since it changes a core characteristics of the RDF semantics I still consider this rather dangerous.

The RDF semantics (and, thus, also its "core characteristics", whatever you mean by that) is not touched a single bit by the possibility to declare TEPs in RDF vocabularies or by the notion of TEPs in general (and not even by the way the semantics of RDF-star is defined).

We have had this discussion already at length in issue #170 (maybe others too, I'm not sure right now) and on the mailing list. I thought that here we just discuss this concrete proposal of yours. So is this really the right place to continue this discussion? Or rather, as you seem to be not impressed at all by the arguments I brought forward already, start from scratch again?

I am indeed not impressed by your arguments and, instead, stand by all my responses that I gave you in #170. Most relevant to the proposal in this PR, I still do not agree with your claim that our "proposal tends to double the vocabulary terms we need on the semantic web [because of the] need to define a TEP twin [for a lot of properties]." In #170 (comment), I have explained why this claim does not make sense to me. However, let me also repeat my main points here:

First of all, one of the assumptions that your claim seems to be based on is that almost all properties make sense to be used as a predicate in nested RDF-star triples (i.e., to make a statement about a quoted triple). I don't think this assumption is correct. In contrast, I am not convinced that many of the properties defined in vocabularies make sense at all to be used as a predicate in nested RDF-star triples (remember your unconvincing attempt with a :color and :type property; #170 (comment)).

Second, for properties for which it does make sense to use them as a predicate in nested RDF-star triples, your claim assumes that most of them make sense to be interpreted with both semantics for quoted triples (i.e., sometimes with a referential opacity semantics and sometimes with a referential transparency semantics). I disagree also with this assumption, and you have not provided any arguments, or at least an example, to convince me otherwise. The way I see it is that, for most of the properties, only one of the two semantics is a natural choice. In other words, for each such property, it is a part of the meaning of that property whether quoted triples are considered referentially transparent/opaque in statements that use the property (and the notion of TEPs gives us a means to make exactly this distinction).

Again, in #170 (comment), I have elaborated on these points based on one of your examples.

@hartig
Collaborator Author

hartig commented Sep 30, 2021

@rat10 In your comments above you have requested a way to enable referential transparency for a whole graph, or even dataset, and you have suggested that "it should be possible to derive a plausible proposal from" something you outline in one of these comments (#209 (comment)). Regarding my question whether you can give it a try to define such a proposal, your response now is:

I think that it's your and Pierre-Antoine's job to give it a try first and develop a coherent and complete proposal, honoring all use cases and exploring plausible usage scenarios.

I don't think so. As I have already mentioned above, defining the semantics of what you outline can be done only by relying on a well-defined means to refer to a graph or a dataset from within an RDF triple. To the best of my knowledge, such a means does not exist. Given that such a means would actually not have anything to do with RDF-star per se, I don't see defining such a means as something that can be expected of us as part of the work on RDF-star. In this sense, as Pierre-Antoine, I see our work as coherent and sufficiently complete. RDF-star can readily be used for some use cases; for others, RDF-star is a building block, and some of these other use cases require additional building blocks that are not in place yet.

The issues with the TEP approach are one of the problems that come with the proposed semantics and I'm reluctant to put even more time into discussing and solving those. The fundamental problem is that the proposed semantics makes referential opacity the default.

Regarding "the issues with the TEP approach," everything that you are listing in your comment are non-issues in my opinion (see my other comment that I have just posted before). The question how to define a vocabulary with which users can succinctly state that referential transparency is meant to be enabled for a whole graph or dataset is not an issue of the TEP approach; rather, it is an open problem that relies on other building blocks to be in place. Of course, these questions have surfaced only because our semantics of RDF-star makes referential opacity the default. However, as mentioned several times, I see it as a feature to have referential opacity as default (because, this way, RDF-star can be used both for referential-opacity use cases and for referential-transparency use cases, as clearly demonstrated by the TEP approach in this PR; by using the alternative, i.e., referential transparency as default, it is only possible to cover referential-transparency use cases).

@rat10

rat10 commented Oct 1, 2021

@rat10 in response to your response from 2 days ago, #209 (comment)

[...] and since it changes a core characteristics of the RDF semantics I still consider this rather dangerous.

The RDF semantics (and, thus, also its "core characteristics", whatever you mean by that)

Referential transparency is a feature of the RDF semantics, encoded in the RDF specifications and instrumental (or even: of utmost importance) to RDF's main purpose: data integration. That is what I mean with 'core'. Semantics of vocabulary properties OTOH are concerned with vocabulary/application specific meanings and don't touch the core semantics of RDF. Your TEP proposal changes that profoundly and that is a problem.

is not touched a single bit by the possibility to declare TEPs in RDF vocabularies or by the notion of TEPs in general (and not even by the way the semantics of RDF-star is defined).

We have had this discussion already at length in issue #170 (maybe others too, I'm not sure right now) and on the mailing list. I thought that here we just discuss this concrete proposal of yours. So is this really the right place to continue this discussion? Or rather, as you seem to be not impressed at all by the arguments I brought forward already, start from scratch again?

I am indeed not impressed by your arguments and, instead, stand by all my responses that I gave you in #170. Most relevant to the proposal in this PR, I still do not agree with your claim that our "proposal tends to double the vocabulary terms we need on the semantic web [because of the] need to define a TEP twin [for a lot of properties]." In #170 (comment), I have explained why this claim does not make sense to me. However, let me also repeat my main points here:

First of all, one of the assumptions that your claim seems to be based on is that almost all properties make sense to be used as a predicate in nested RDF-star triples (i.e., to make a statement about a quoted triple). I don't think this assumption is correct. In contrast, I am not convinced that many of the properties defined in vocabularies make sense at all to be used as a predicate in nested RDF-star triples (remember your unconvincing attempt with a :color and :type property; #170 (comment)).

I stand by that claim. Let me give some hopefully more convincing examples:

:Alice :buys :car .
<< :Alice :buys :car >> :paymentMethod :Cash ;
  :purpose :Commuting ;
  :reason :OldCarBreakdown .

:Alice :knows :Bob .
<< :Alice :knows :Bob >> :since 1979 ;
  :from :Work ; 
  :aquaintanceLevel :Professional .

:Plane :crashesIn :Hudson .
<< :Plane :crashesIn :Hudson >> :date 2009 ;
  :pilot "Chesley 'Sully' Sullenberger" ;
  :coPilot "Jeffrey Skiles" ;
  :flightNumber 1549 ;
  :casulaties 0 .

As you can imagine I could go on and on. Essentially all these examples annotate a primary relation with secondary detail which is not your usual provenance use case but very common all the same. It is the modelling primitive that made Labeled Property Graphs so immensely successful, especially compared to RDF. In all these examples the nodes in the annotated statements would profit from referential transparency as it helps data integration when we can easily establish that :Alice refers to the same person as :AliceSprings, :Bob is that Bob, the Hudson could be referred to by its Wikipedia or Wikidata or DBpedia entry etc etc. So for all those middle-of-the-road properties like :paymentMethod, :purpose, :reason, :since, :from, :aquaintanceLevel, :date, :pilot, :coPilot, :flightNumber, :casulaties etc you'd have to define TEP equivalents to capture all those useful co-denotations.
Which means all vocabulary authors in the world will now have to sit down and mint new TEP-properties or at least re-define the semantics of their established properties. I don't think that prospect meets the general expectation that RDF-star will bring easy statement annotation.

Second, for properties for which it does make sense to use them as a predicate in nested RDF-star triples, your claim assumes that most of them make sense to be interpreted with both semantics for quoted triples (i.e., sometimes with a referential opacity semantics and sometimes with a referential transparency semantics). I disagree also with this assumption, and you have not provided any arguments, or at least an example, to convince me otherwise. The way I see it is that, for most of the properties, only one of the two semantics is a natural choice. In other words, for each such property, it is a part of the meaning of that property whether quoted triples are considered referentially transparent/opaque in statements that use the property (and the notion of TEPs gives us a means to make exactly this distinction).

Outside the usual provenance suspects it's harder to find properties that in general require referential transparency but sometimes opacity. OTOH it will be quite hard to find many properties that only and always require referential opacity (the usual provenance suspects are NOT among them) as that is a quite special and niche requirement. That means that the vast majority of properties has at least to be updated and for all provenance related properties both versions will have to be provided. In case you doubt the latter claim look at the examples above: none of them would profit from quoted provenance semantics. Of course there are also just as many examples that would - so you need both.

Vocabularies are not the solution, syntax is.

The underlying problem is that the proposed semantics supports only very specific use cases that require precise quotation. If it did only aim at those specific use cases that would not be a problem at all. But unfortunately it stands in the way of everything else as it also claims to provide a solution to the much more prolific issue of statement annotation with standard, referentially transparent semantics. It gets into trouble because any attempt to solve that problem on the vocabulary level will always have to reach into and modify standard use cases. This problem can only be solved by another syntactic feature. Either extend the chevron syntax by a means to disambiguate quoted types from interpreted occurrences, like e.g.

<< :s :p :o  "" >>  // for the quoted type per the proposed semantics

or add an RDF/XML-style statement identifier to Turtle that refers to the subject in the pair of occurrence defining triples.

:s :p :o id:1 .
id:1 :occurrenceOf << :s :p :o >> ;  :in <> . // this is implied by the identifier syntax
id:1 :a :b ;
  :c :d .

Especially the latter approach, a concise statement identifier syntax, backed by an occurrence-defining vocabulary, would meet my idea of an easy way to refer to referentially transparent occurrences and, b.t.w., all of a sudden RDF-star would have my fullest support.
The TEP approach however IMHO is bound to fail in practice, just like the proposed semantics in general if it doesn't provide an easier path from embedded quoted types to referentially transparent occurrences.
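
As a rough sketch of the second idea (the statement identifier), here is how the shorthand could be expanded. Everything in it is hypothetical: the function name, the tuple representation, and the `:occurrenceOf` / `:in` terms simply mirror the snippet above and are not defined anywhere in the PR or the report.

```python
def expand_statement_id(s, p, o, stmt_id, graph="<>"):
    """Expand the hypothetical ':s :p :o id:1 .' shorthand into plain triples."""
    quoted = (s, p, o)  # stands in for << :s :p :o >>
    return [
        (s, p, o),                           # the asserted triple itself
        (stmt_id, ":occurrenceOf", quoted),  # implied by the identifier
        (stmt_id, ":in", graph),             # the graph the occurrence is in
    ]

for triple in expand_statement_id(":s", ":p", ":o", "id:1"):
    print(triple)
```

Further annotations such as `id:1 :a :b` would then attach to the occurrence node rather than to the quoted triple.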

Again, in #170 (comment), I have elaborated on these points based on one of your examples.

@pchampin
Collaborator

pchampin commented Oct 1, 2021

I stand by that claim. Let me give some hopefully more convincing examples: (...)

I don't think @hartig's point was to deny that many properties could accept triples as their subject, but that any arbitrary property could be used that way. In your examples above, :buys, :knows or :crashesIn would hardly make sense with a triple as their subject.

Now, for those properties that can sensibly be used with quoted triples...

for all those properties [...] you'd have to define TEP equivalents

No, you would simply have to decide, for each of them, whether they are a TEP or not! Just like you have to decide what their domain or range is.
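
A toy Python sketch of what that per-property decision changes, assuming a co-denotation such as :Alice / :AliceSprings is known. It is a simplification, not the formal semantics: triples are tuples, the function names are made up, and only a single substitution step inside quoted triples is modelled.

```python
def swap(term, old, new):
    """Replace old by new inside a term, descending into quoted triples."""
    if isinstance(term, tuple):
        return tuple(swap(t, old, new) for t in term)
    return new if term == old else term

def variants(triple, same_as, teps):
    """Nested triples obtainable by one substitution inside quoted triples."""
    s, p, o = triple
    out = {triple}
    if p not in teps:
        return out  # default: quoted triples stay referentially opaque
    for a, b in same_as:
        for old, new in ((a, b), (b, a)):
            qs = swap(s, old, new) if isinstance(s, tuple) else s
            qo = swap(o, old, new) if isinstance(o, tuple) else o
            out.add((qs, p, qo))
    return out

nested = ((":Alice", ":knows", ":Bob"), ":since", 1979)
same_as = {(":Alice", ":AliceSprings")}

print(len(variants(nested, same_as, teps=set())))       # 1
print(len(variants(nested, same_as, teps={":since"})))  # 2
```

Declaring :since as a TEP is a single statement about the existing property, so no second "twin" property appears anywhere in the sketch.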

@hartig
Collaborator Author

hartig commented Oct 1, 2021

@rat10 for these new examples, are you assuming that all these properties (:paymentMethod, :purpose, ..., :coPilot, :flightNumber, :casulaties) are from a vocabulary that has not been designed to be used with RDF-star? (which is probably the case for almost all of the vocabularies out there at the moment)

@pchampin
Collaborator

pchampin commented Oct 1, 2021

merged, as resolved during today's meeting:
https://w3c.github.io/rdf-star/Minutes/2021-10-01.html#r02

@gkellogg gkellogg deleted the issue-202 branch November 13, 2021 23:03