New section about selective referential transparency #209
…tively enable referential transparency for specific properties
…riples that are in asserted triples
…e unstar algorithm and a clarification comment
Minor comment: "enforcing" seems to me to imply "only usage". The non-transparency usage is valid, RDF being monotonic. "permitting"? "allowing"? Or to keep the E : "enabling"? |
Properties identified as TEPs are indeed meant to enforce referential transparency. Note, however, that this does not affect any other properties. In other words, for a property that, within a given RDF-star graph, is not explicitly identified to be a TEP, referential transparency is not enforced and, instead, the default is used (referential opacity). |
Notice also that enforcement of referential transparency based on a TEP is only local to the graph(s) in which the TEP is stated to be a TEP (or where this statement can be inferred as per the entailment regime considered). |
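To make the opaque-by-default versus TEP distinction concrete, here is a minimal Python sketch. It is not the formal semantics from the PR: the `normalise` function, the `denotes` mapping (standing in for "these IRIs denote the same resource"), and all `ex:` names are assumptions made purely for illustration. A quoted triple is modelled as a nested 3-tuple, and only one level of nesting is handled.

```python
def normalise(triple, teps, denotes):
    """Rewrite `triple` into a canonical form so two triples can be compared.

    teps    -- set of properties declared transparency-enabling in this graph
    denotes -- hypothetical map from an IRI to a representative of what it denotes
    """
    s, p, o = triple

    def term(t):
        if not isinstance(t, tuple):
            # Ordinary (non-quoted) term: always referentially transparent.
            return denotes.get(t, t)
        if p in teps:
            # Quoted triple used with a TEP: its terms are interpreted too.
            return tuple(denotes.get(x, x) for x in t)
        # Default: the quoted triple stays opaque and is compared syntactically.
        return t

    return (term(s), p, term(o))


# Assumption for the example: ex:clark and ex:superman denote the same resource.
denotes = {"ex:clark": "ex:superman"}
a = ("ex:lois", "ex:believes", ("ex:superman", "ex:can", "ex:fly"))
b = ("ex:lois", "ex:believes", ("ex:clark", "ex:can", "ex:fly"))

opaque_equal = normalise(a, set(), denotes) == normalise(b, set(), denotes)
transparent_equal = normalise(a, {"ex:believes"}, denotes) == normalise(b, {"ex:believes"}, denotes)
```

With no TEPs declared, the two statements stay distinct; once `ex:believes` is in the graph's TEP set, they coincide. Because the set of TEPs is passed in per graph, the effect stays local to the graph that declares them.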
Let me ask it another way - if they "enforce" what else do they stop? (in the local graph) |
They don't stop anything else. However, I don't mean to insist on using the word "enforcing". Considering the alternatives that you have proposed, "permitting" is too weak I think; it sounds more like a possibility rather than a guarantee that referential transparency will indeed be used for TEPs. "enabling" is okay for me; it can be understood as switching on the usage of referential transparency which is disabled by default. |
@hartig -- Having only quickly read the comments (not the actual PR, yet), I think "transparency-enabling" conveys what you meant when you said "transparency-enforcing". |
Seems to work... with the few tweaks noted above, including the externally discussed change of "enforcing" to "enabling".
I agree, if it were "enforcing", that would imply some mechanism to ensure that such entailments were realized. It would also imply a specific mechanism whereby the graph would be determined to be somehow invalid if it were determined not to be transparent. "Transparency-enabling" seems more appropriate, since I believe that is what entailments generally provide. |
Co-authored-by: Ted Thibodeau Jr <[email protected]>
This proposal seems to tackle only referentially transparent types. What about referentially transparent occurrences? Is this vocabulary applicable to occurrences as in EXAMPLE 8? I can also imagine a more statement-focused approach. Again referring to EXAMPLE 8 a further type
or, shorter, a combined declaration of semantics is thinkable:
This not necessarily instead of but in addition to the approach you propose. IMO this would still be prohibitively cumbersome but at least more in line with common modelling practices. Would it be possible to declare referential transparency on all quoted triples per graph? Did you consider that as an option? |
I don't find this explicitly mentioned in the prose. It would be useful to point this out as explicitly to the readers of the report (not all of whom are semanticists who can easily extract this information from the model-theoretic definitions) as to the members of this CG. |
In the proposal I read: |
… TEPs is local to the graph(s) in which they are stated to be TEPs
Thomas,
Making statements about occurrences of triples and referential transparency vs opacity are orthogonal issues. The vocabulary introduced in this PR is about the latter. As such, it gives you the means to state that the quoted triple in the following statement from your comment is meant to be referentially transparent within that statement.
To achieve this, you only have to add the following triple to your data:
However, what the vocabulary in this PR does not give you is a means to state anything about triple occurrences. That's not the purpose of the proposal in this PR because it is an orthogonal issue. As I can already foresee that you won't be happy with this answer, let me strongly suggest keeping the discussion in this PR on the topic of the proposal made here (namely, the topic of supporting selective referential transparency) and moving the discussion of triple occurrences elsewhere (e.g., #169).
Given an RDF-star graph G, you can take all predicates of all the triples in G (asserted and quoted) that have a quoted triple as subject or object and, then, for each of these predicates (more precisely, the properties that are these predicates), add a triple that states that this property is of rdf:type rdf-star:TransparencyEnablingProperty. That way, all quoted triples in G become referentially transparent within all the triples in which they occur. I am not sure whether this answer addresses your question because I don't know what exactly you mean here when you say "per graph". If you mean something other than what I am considering in my answer, please elaborate a bit more on what exactly you mean by this question.
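The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triples are modelled as 3-tuples of strings, a quoted triple is a nested 3-tuple, prefixed names are kept as plain strings, and the helper names (`all_triples`, `tep_declarations`) and the `ex:` data are hypothetical.

```python
RDF_TYPE = "rdf:type"
TEP_CLASS = "rdf-star:TransparencyEnablingProperty"


def is_quoted(term):
    """A quoted triple is modelled as a 3-tuple; every other term is a string."""
    return isinstance(term, tuple)


def all_triples(graph):
    """Yield every asserted triple and, recursively, every triple quoted inside it."""
    stack = list(graph)
    while stack:
        triple = stack.pop()
        yield triple
        s, _, o = triple
        stack.extend(t for t in (s, o) if is_quoted(t))


def tep_declarations(graph):
    """Build the triples declaring each annotating predicate of `graph` to be a TEP."""
    predicates = {
        p for (s, p, o) in all_triples(graph) if is_quoted(s) or is_quoted(o)
    }
    return {(p, RDF_TYPE, TEP_CLASS) for p in predicates}


# Hypothetical example graph: two annotating predicates at the top level,
# plus one (ex:source) that only occurs inside a quoted triple.
graph = {
    (("ex:superman", "ex:can", "ex:fly"), "ex:saidBy", "ex:lois"),
    ("ex:lois", "ex:believes",
     (("ex:clark", "ex:can", "ex:fly"), "ex:source", "ex:dailyPlanet")),
    ("ex:clark", "ex:worksFor", "ex:dailyPlanet"),
}
extended = graph | tep_declarations(graph)
```

In this sketch `ex:saidBy`, `ex:believes`, and `ex:source` end up declared as TEPs, while `ex:can` and `ex:worksFor` do not, because they never have a quoted triple as subject or object.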
Good point. Done now. See commit 875e36d
The sentence that you quote is an artifact of a first version of this whole new section in which we did not yet have the actual proposal. I think this sentence can be removed now because the PR introduces such a vocabulary, including a formal definition of its semantics. |
With "per graph" I mean the same as you when you say "local to the graph". Does that answer your question? |
As both issues are orthogonal my question would by your definition be as wrong in #169 as it is (by your definition) here. Technically the two issues are certainly unrelated. In practice however they are not. To the contrary as referentially transparent occurrences they represent a very common, even predominant use case as everybody should be well aware of by now. I fear it is quite confusing to have to implement such use cases by two orthogonal techniques: a per-embedding reference to the occurrence and a per-property reference to its referentially transparent representation. But you seem to agree that defining referential transparency on a per-occurrence basis is possible. Extending your example with the following line should give a complete solution:
May I suggest that you extend your proposal accordingly? IMO this would make common use cases a lot more straightforward than the TEP approach alone. [EDIT] Rereading I noticed your "strong suggestion", so moving this to #169 now. |
I interpret this response of yours as follows: You admit that my answer has indeed provided a solution to the problem posed in your question, but you don't like this solution simply because you didn't and still don't like the opacity-by-default semantics in the first place. Is this an accurate interpretation of your response? If yes, I can live with that. Notice, by the way, that the declaration of properties to be TEPs does not have to be repeated explicitly in every graph. By using properties from a vocabulary that defines these properties to be TEPs, you would get these TEPs "for free."
I would be absolutely fine with extending the vocabulary proposed in this PR with another class (e.g., |
I have not made my peace with the semantics of embedded statements defaulting to referential opacity but my topic here is if it is reasonably easy to describe referentially transparent embedded statements (which are the norm in practice). It was an important argument pro the proposed semantics that this would be easy, so the two issues are indeed connected. The solution you propose doesn't meet my expectations not because I don't like the problem but because I don't like the solution you propose.
Now that you mention it, I'm pondering if that doesn't come dangerously near to two modes that are very hard to distinguish. But I digress...
I'm well aware that you're very reluctant to tackle that topic but it should have become clear by now that it can't be avoided completely when defining a syntax and semantics for statement annotation. We have the :in property already (which initially was called :inGraph IIRC) and we have the stub for a term rdf-star:Source. It should be possible to derive a plausible proposal from that basis without solving all the problems about graphs as mathematical abstractions, named graphs etc in RDF. Such a proposal would give the user - especially those that are no fans of the TEP approach - an alternative they can better live with. So it might be worth the effort... |
I see. So, at least you agree that it is a solution, even if you don't like this solution, right?
I don't think so. The transparency-enabling semantics of the properties would be clearly defined in the vocabulary and nothing needs to be communicated out of band (as would have been the case had we introduced SA mode and PG mode based on the same syntax). By using a particular vocabulary, users are committing themselves to the semantics of that vocabulary. I think this is not very different from using, e.g.,
Can you give it a try? (perhaps in a follow-up PR once this one here is merged) |
FTR, I am not at all convinced that the "per graph transparency" has any practical use case, but for the sake of the argument: here is an even simpler solution. Include the following triple in your graph:
This effectively turns any property in the graph into a TEP. |
Are you trying to twist words in my mouth, Olaf? There are many kinds of solutions: good, bad, elegant, convoluted, easy, cumbersome, defective etc. If one ignores the meaning denoted by the qualifying attribute, even a defective solution would be a solution. The solution you propose is AFAICT not defective but it also is not what I would call a good solution, and also not an easy one.
Right, it's not that bad as it's defined within the bounds of RDF. It still is not immediately recognizable, like some syntactic measure as for example an extension to the << s p o >> syntax would be, and since it changes a core characteristic of the RDF semantics I still consider this rather dangerous.
We have had this discussion already at length in issue #170 (maybe others too, I'm not sure right now) and on the mailing list. I thought that here we just discuss this concrete proposal of yours. So is this really the right place to continue this discussion? Or rather, as you seem to be not impressed at all by the arguments I brought forward already, start from scratch again? Following is a rundown of problems that I see immediately. Most of them I have mentioned already elsewhere. I'm not ready for a thorough analysis yet but I'm already pretty sure that these problems cannot be healed. There are several problems with your argument. You mix fundamental semantic principles of the semantic web with very task-specific semantics of vocabularies. There are many ways in which the semantics of a "name", a "type", a "date" or any other property can differ and vocabularies fix those variations to an extent deemed practically useful for the domain of application. However the difference between referentially transparent and opaque semantics is a very fundamental one and shouldn't be tackled at the level of domain-specific vocabularies. This is a bad mistake on the architectural level, running counter to the principle of separation of concerns. These are, off the top of my head, some problematic aspects of the TEP proposal itself and in the context of the proposed semantics for RDF-star.
I think that it's yours and Pierre-Antoine's job to give it a try first and develop a coherent and complete proposal, honoring all use cases and exploring plausible usage scenarios. I can ask questions and give feedback but I would really have to bend over backwards to actually work on it. The issues with the TEP approach are one of the problems that come with the proposed semantics and I'm reluctant to put even more time into discussing and solving those. The fundamental problem is that the proposed semantics makes referential opacity the default. The failure of your TEP proposal to heal the ensuing rift is not at all surprising to me. |
Maybe you should have a look at the use cases collected by this CG again. Very few require referential opacity. If they prefer to not disrupt a core feature of the RDF semantics they would benefit from an easy and concise way to switch off the proposed semantics.
The TEP approach has some surprising features. Yet, IIUC (and remembering what you taught me when discussing #170), it is impossible to again turn on referential opacity for individual annotations once the above triple has been introduced. So this solution only gives the option to shut down the proposed semantics entirely. That's better than nothing IMHO but not good enough since if I want to employ referential opacity on individual annotations - and as I'd like to reassure you I definitely can see good uses for this scenario - I will have to resort to the TEP approach with all its problems for the rest of my annotations. |
Wouldn't that hit all instances of the property in other graphs? I find it tempting to use something which allows you to basically express a triple pattern like "anything that was prov:wasAttributedTo by Alice or Bob", e.g.:
# A-box
ex:book1 a ex:Publication .
<< ex:book2 a ex:Article >> prov:wasAttributedTo <Alice> .
<< ex:MITPress ex:publishes ex:book3 >> prov:wasAttributedTo <Bob> .
<< ex:book4 a ex:Publication >> prov:wasAttributedTo <Eve> .
Using OWL for this would put those two quoted triples into a "transparency-enabling" class:
# Transparency strawman
[ owl:onProperty prov:wasAttributedTo ;
  owl:someValuesFrom [
    owl:oneOf ( <Alice> <Bob> )
  ]
] rdfs:subClassOf rdf-star:TransparencyEnablingClass .
(and give you a whole lot more power as well). That, plus some RDFS:
# RDFS
ex:Article rdfs:subClassOf ex:Publication .
ex:publishes rdfs:range ex:Publication .
could allow a SPARQL engine endowed with RDFS-entailment to emit SELECT ?book { ?book a ex:Publication } |
@ericprud I think you are misinterpreting "transparency-enabling" as "auto-asserting", which would be a totally different thing...
Well, if you merged another graph with this one and reasoned about the result, yes of course, this axiom would propagate to the properties from the other graph. But there is nothing specific to RDF-star or TEPs here. Suppose you asserted |
Yeah, fair enough. Interestingly, I feel like it's a parametric difference. In my above example, I selectively promoted quoted triples into the KB that the entailment regime worked from. This could be the default graph, or some shadow DB whose inferences were mirrored into the default graph. (The diff being whether an ?s?p?o query on the default graph retrieves those auto-asserted triples.) For the transparency use case, the assertions go the other way; and Lois says that Clark Kent can fly. The directives to control those could use the same structure, with a little switch to say what goes where. Some SPARQL DBs treat the default graph as the union of the named graphs. Any such DBs that also support entailment regimes and RDF-Star (speaking of unicorns) would seem to be both universally transparent and universally auto-asserting.
You aren't? I can see practical examples like the one I posted above where you'd want more fine-grained control for auto-assertion. The only use cases I envision for transparency are similar to those for auto-assertion (discovering inferred triples) but for practical reasons, you want that to happen in some named graph rather than the default graph. For those, I see the use case for having OWL is crazy complex and has few users, but profiles of it could be manageable and the on-property/some-values-from pattern comes to mind when saying you want to infer either from or into graphs believed by a group with a known extension. |
@ericprud I am not sure I fully grasp this "parametric difference" perspective, I'll have to think about it.
As you know, named graphs do not have a standard semantics. I don't know how inference-capable triple-stores handle cross-graph entailment, but I suspect they don't all do it the same way. So discussing RDF-star semantics in that context seems risky. |
They do not require full transparency either, in my opinion. All of them, I believe, can work with a predefined set of TEPs. Those that are concerned with occurrences would actually only need one TEP: "transparentOccurenceOf", as all other properties would apply on the occurrence node, not the quoted triple itself.
Developing a coherent and complete proposal is the job of the group as a whole -- the job of the editors is to reflect the group's consensus into a consistent document. My personal opinion is that the current proposal is indeed coherent and complete, or at least reasonably so that we can call it a day and defer to a proper working group. My feeling is that this assessment is generally shared among the group. If you disagree, feel free to make a concrete proposal, on which the group as a whole can decide. |
That notion of
Right, and I'm glad to see that as a result declaring an identifier for a referentially transparent occurrence still requires "only" two triples, and I'm glad that I could provide the input that provoked this insight. Yet, IMO this is already too much to be viable in practice (and of course doesn't fit the embedded triples already in use). It's also not the way that your TEP proposal favors to handle things so is not really the core topic here.
Maybe you wore the wrong hat when you wrote this. I wasn't referring to you and Olaf as editors (and forgetting Andy and Gregg) but as proponents of the proposed semantics and especially authors of the proposal we are discussing in this issue. Especially you, Pierre-Antoine, have always claimed and promised that it would be easy to
I have asked for a concrete proposal for a long time now, at least since February IIRC. Now, since one week, we have one, and I have reservations and am asking for clarifications and features. It is not surprising that the proposal isn't finished or complete yet but it is indeed surprising that you already would like to call it a day and let me do the missing work. Given the few comments this proposal has received so far I also wonder how you can be so sure that the group agrees with your feelings about the coherence and completeness of the proposal. B.t.w. none of my questions w.r.t. querying have been addressed so far. Queryability is an important factor in usability and user acceptance. |
@rat10 in response to your response from 2 days ago, #209 (comment)
No, I just wanted to get a confirmation that you don't see any technical flaws with the solution presented in this PR.
Which is not different from any other usage of an IRI defined in some RDF vocabulary. For instance, coming back to my earlier example with the FOAF vocabulary, given a triple (
The RDF semantics (and, thus, also its "core characteristics", whatever you mean by that) is not touched a single bit by the possibility to declare TEPs in RDF vocabularies or by the notion of TEPs in general (and not even by the way the semantics of RDF-star is defined).
I am indeed not impressed by your arguments and, instead, stand by all my responses that I gave you in #170. Most relevant to the proposal in this PR, I still do not agree with your claim that our "proposal tends to double the vocabulary terms we need on the semantic web [because of the] need to define a TEP twin [for a lot of properties]." In #170 (comment), I have explained why this claim does not make sense to me. However, let me also repeat my main points here: First of all, one of the assumptions that your claim seems to be based on is that almost all properties make sense to be used as a predicate in nested RDF-star triples (i.e., to make a statement about a quoted triple). I don't think this assumption is correct. In contrast, I am not convinced that many of the properties defined in vocabularies make sense at all to be used as a predicate in nested RDF-star triples (remember your unconvincing attempt with a Second, for properties for which it does make sense to use them as a predicate in nested RDF-star triples, your claim assumes that most of them make sense to be interpreted with both semantics for quoted triples (i.e., sometimes with a referential opacity semantics and sometimes with a referential transparency semantics). I disagree also with this assumption, and you have not provided any arguments, or at least an example, to convince me otherwise. The way I see it is that, for most of the properties, only one of the two semantics is a natural choice. In other words, for each such property, it is a part of the meaning of that property whether quoted triples are considered referentially transparent/opaque in statements that use the property (and the notion of TEPs give us a means to make exactly this distinction). Again, in #170 (comment), I have elaborated on these points based on one of your examples. |
@rat10 In your comments above you have requested a way to enable referential transparency for a whole graph, or even dataset, and you have suggested that "it should be possible to derive a plausible proposal from" something you outline in one of these comments (#209 (comment)). Regarding my question whether you can give it a try to define such a proposal, your response now is:
I don't think so. As I have already mentioned above, defining the semantics of what you outline can be done only by relying on a well-defined means to refer to a graph or a dataset from within an RDF triple. To the best of my knowledge, such a means does not exist. Given that such a means would actually not have anything to do with RDF-star per se, I don't see defining such a means as something that can be expected of us as part of the work on RDF-star. In this sense, as Pierre-Antoine, I see our work as coherent and sufficiently complete. RDF-star can readily be used for some use cases; for others, RDF-star is a building block, and some of these other use cases require additional building blocks that are not in place yet.
Regarding "the issues with the TEP approach," everything that you are listing in your comment are non-issues in my opinion (see my other comment that I have just posted before). The question how to define a vocabulary with which users can succinctly state that referential transparency is meant to be enabled for a whole graph or dataset is not an issue of the TEP approach; rather, it is an open problem that relies on other building blocks to be in place. Of course, these questions have surfaced only because our semantics of RDF-star makes referential opacity the default. However, as mentioned several times, I see it as a feature to have referential opacity as default (because, this way, RDF-star can be used both for referential-opacity use cases and for referential-transparency use cases, as clearly demonstrated by the TEP approach in this PR; by using the alternative, i.e., referential transparency as default, it is only possible to cover referential-transparency use cases). |
Referential transparency is a feature of the RDF semantics, encoded in the RDF specifications and instrumental (or even: of utmost importance) to RDF's main purpose: data integration. That is what I mean with 'core'. Semantics of vocabulary properties OTOH are concerned with vocabulary/application specific meanings and don't touch the core semantics of RDF. Your TEP proposal changes that profoundly and that is a problem.
I stand by that claim. Let me give some hopefully more convincing examples:
As you can imagine I could go on and on. Essentially all these examples annotate a primary relation with secondary detail which is not your usual provenance use case but very common all the same. It is the modelling primitive that made Labeled Property Graphs so immensely successful, especially compared to RDF. In all these examples the nodes in the annotated statements would profit from referential transparency as it helps data integration when we can easily establish that :Alice refers to the same person as :AliceSprings, :Bob is that Bob, the Hudson could be referred to by its Wikipedia or Wikidata or DBpedia entry etc etc. So for all those middle-of-the-road properties like
Outside the usual provenance suspects it's harder to find properties that in general require referential transparency but sometimes opacity. OTOH it will be quite hard to find many properties that only and always require referential opacity (the usual provenance suspects are NOT among them) as that is a quite special and niche requirement. That means that the vast majority of properties has at least to be updated and for all provenance related properties both versions will have to be provided. In case you doubt the latter claim look at the examples above: none of them would profit from quoted provenance semantics. Of course there are also just as many examples that would - so you need both. Vocabularies are not the solution, syntax is. The underlying problem is that the proposed semantics supports only very specific use cases that require precise quotation. If it did only aim at those specific use cases that would not be a problem at all. But unfortunately it stands in the way of everything else as it also claims to provide a solution to the much more prolific issue of statement annotation with standard, referentially transparent semantics. It gets into trouble because any attempt to solve that problem on the vocabulary level will always have to reach into and modify standard use cases. This problem can only be solved by another syntactic feature. Either extend the chevron syntax by a means to disambiguate quoted types from interpreted occurrences, like e.g.
or add an RDF/XML-style statement identifier to Turtle that refers to the subject in the pair of occurrence defining triples.
Especially the latter approach, a concise statement identifier syntax, backed by occurrence-defining vocabulary, would meet my idea of an easy way to refer to referentially transparent occurrences and, b.t.w., all of a sudden RDF-star would have my fullest support.
|
I don't think @hartig 's point was to deny that many properties could accept triples as their subject, but that any arbitrary property could be used that way. In your examples above, Now, for those properties that can sensibly be used with quoted triples...
No, you would simply have to decide, for each of them, whether they are a TEP or not! Just like you have to decide what their domain or range is. |
@rat10 for these new examples, are you assuming that all these properties ( |
merged, as resolved during today's meeting: |
This PR is meant to address issue #202 by adding a new section (Sec. 6.4.6) about selective referential transparency. The section includes both i) a concrete proposal for how users can indicate that a property is a so-called transparency-enforcing property (i.e., quoted triples are meant to be referentially transparent when used in nested triples with such a property; see example in the new text) and ii) a concrete proposal to define the semantics of such transparency-enforcing properties on top of our RDF-star semantics (by which referential opacity is the default).
(this PR is joint work by @pchampin and me)
/cc @rat10