Triple Terms in Subject Position #138

Open
rat10 opened this issue Jan 17, 2025 · 14 comments
Labels
ms:CR (Milestone: Candidate Recommendation), spec:substantive (Change in the spec affecting its normative content (class 3); see also spec:bug, spec:new-feature)

Comments

@rat10

rat10 commented Jan 17, 2025

Should triple terms be allowed in both subject and object position of triples and triple terms, or only in object position?

@rat10 rat10 added the ms:CR Milestone: Candidate Recommendation label Jan 17, 2025
@rat10
Author

rat10 commented Jan 17, 2025

The question is whether triple terms should be allowed only in object position or also in subject position. The Abstract Grammar as defined in the "liberal baseline" allows them in both positions. However, whether allowing them in subject position is harmful in practice is still under discussion. This discussion has so far mainly taken place in the Semantics TF, but it is of general concern, as its outcome determines whether a whole class of use cases can be tackled with triple terms or not.

The discussion is spread over several GitHub issues, but the most recent thread is "triple terms in subject position - issues with RDF/XML?". For a tl;dr, just scroll down to Niklas' (2024-12-12T23:12:19.000Z) and my (2025-01-10T10:42:49.000Z) positions.

Please excuse that so far this is a rather sparse description of the issue. I hope to update it with a more thorough treatment and more links after today's Semantics TF meeting. [Edit: The Semantics TF did not seem to want to discuss this any further, as the issue isn't specifically tied to its work.]

This discussion depends on the WG's decision to adopt the "liberal baseline" semantics as suggested by the Semantics TF.

@rat10 rat10 added the spec:substantive Change in the spec affecting its normative content (class 3) –see also spec:bug, spec:new-feature label Jan 17, 2025
@TallTed
Member

TallTed commented Jan 22, 2025

@rat10

For a tl;dr just scroll down to the last two comments to get Niklas (2024-12-12T23:12:19.000Z)' and my (2025-01-10T10:42:49.000Z) position.

Please confirm that the links and timestamps I've added to your sentence go to the correct comments in that other issue.

(For future reference, note that you can right-click on any comment's "time of post" to copy the link to that comment, and if you inspect that object, you can get the timestamp.)

This will prevent confusion if anyone else adds a comment to that other thread, making the current "last two comments" no longer the "last".

@pchampin
Contributor

In my understanding, we have a consensus to recommend that triple-terms SHOULD only be used as the object of rdf:reifies. This is captured in this note in RDF-Concepts.

Allowing triple terms in the subject position would be sending mixed signals, in my opinion...

@afs
Contributor

afs commented Jan 22, 2025

In SemTF, as I recall, the agreement was that the semantics abstract grammar would have triple terms (and possibly IIRC literals, which arise in meta-modelling in RDF 1.0/1.1) in the subject position. This is symmetric RDF.

The RDF Data Model would have triple terms only in the object position, as would concrete syntaxes. This allows for some future time when conditions from RDF Data Model might be relaxed. Removing a feature is more disruptive to deployment data.
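
The split described here can be pictured with a minimal sketch, assuming a toy term model. The class names and the two checker functions below are hypothetical illustrations, not definitions taken from any RDF 1.2 draft: one check accepts "symmetric" triples (any term as subject or object), the other accepts only triples where triple terms and literals stay in object position.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


@dataclass(frozen=True)
class TripleTerm:
    subject: "Term"
    predicate: IRI
    object: "Term"


Term = Union[IRI, BlankNode, Literal, TripleTerm]


def is_symmetric_triple(s, p, o) -> bool:
    """Symmetric reading (assumption): any term may be subject or object."""
    if not isinstance(p, IRI):
        return False
    for term in (s, o):
        if isinstance(term, TripleTerm):
            # Nested triple terms are checked under the same symmetric rule.
            if not is_symmetric_triple(term.subject, term.predicate, term.object):
                return False
        elif not isinstance(term, (IRI, BlankNode, Literal)):
            return False
    return True


def is_data_model_triple(s, p, o) -> bool:
    """Restricted reading (assumption): literals and triple terms only as objects."""
    if not isinstance(p, IRI) or not isinstance(s, (IRI, BlankNode)):
        return False
    if isinstance(o, TripleTerm):
        # The same restriction applies inside the triple term.
        return is_data_model_triple(o.subject, o.predicate, o.object)
    return isinstance(o, (IRI, BlankNode, Literal))


tt = TripleTerm(IRI("ex:s"), IRI("ex:p"), Literal("o"))
reifier_triple = (BlankNode("r"), IRI("rdf:reifies"), tt)   # triple term as object
subject_triple = (tt, IRI("ex:q"), IRI("ex:z"))             # triple term as subject
```

Under these assumptions, `reifier_triple` passes both checks, while `subject_triple` passes only the symmetric one. Relaxing the restricted check later would only accept more triples, which is the direction of change argued for here.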

@niklasl
Contributor

niklasl commented Jan 22, 2025

I agree that the graph data model should not allow triple terms in the subject position.

AFAICS, the only convincing reason for allowing literals and triple terms as subjects is to be able to encode implications of entailment. But these are not supposed to be put back into the graph; they are implied, logically, in the interpretation (such as in the IEXT(p) where p may be an inverse property of e.g. dc:title or rdf:reifies). The informative symmetric RDF is defined as a way of encoding such implications, for the purpose of documentation or in implementation, e.g. by putting implied triples in a symmetric entailment space (if this is even encoded as triples), which might even be serialized in "symmetric turtle" for local inspection. This symmetry is also supported in SPARQL 1.2 (where literals are allowed as subjects since at least 1.1). But that is quite different from encoding RDF documents for publication on the web.

Literals and triple terms are two structural terms in RDF (tuple forms). That is, they identify resources, solely, through their composite structure. These are pragmatic "fixed point" terms used to guarantee well-formed encodings of these two key concepts. They complement the two nominal terms, IRIs and blank nodes, which are used to reference any resources in the domain, about whom we gain more knowledge through more statements. (As such they may of course also denote literal values and propositions; in the event of actually needing that.)

But attempts to use these structural terms as "composite keys" for any other resource are misleading at best, and may very well lead to deep interoperability problems when integrating datasets. (Apart from making it even harder to understand what literal values and propositions are intended for.) That is, if they are used to denote something other than the exact literal value or atomic, logical proposition, respectively, that is a serious conflation.

IMHO, by now, we have extensive practical experience with the age-old problem of identity (e.g. httpRange-14, various notions of abstractions such as in FRBR, etc.). This is an unavoidable problem, critical to tackle if one is to leverage IRIs for shared identities. But I have serious concerns about adding another category of problems to that, unless there are strong practical motivations to do so.

@rat10
Author

rat10 commented Jan 22, 2025

@TallTed

@rat10

For a tl;dr just scroll down to the last two comments to get Niklas (2024-12-12T23:12:19.000Z)' and my (2025-01-10T10:42:49.000Z) position.

Please confirm that the links and timestamps I've added to your sentence go to the correct comments in that other issue.

Yes, they do. Thank you for pointing this out and sorry for my sloppiness! I'll correct the comment accordingly.

@rat10
Author

rat10 commented Jan 22, 2025

In my understanding, we have a consensus to recommend that triple-terms SHOULD only be used as the object of rdf:reifies. This is captured in this note in RDF-Concepts.

This seemed to be the majority position for a long time (and I agreed for a long time), but there was also long-standing opposition, so I never thought we really had consensus on this question. @william-vw was most vocal and rather adamant about this issue at the end of 2024, and he was not alone when we discussed this in meetings (Semantics TF telcos IIRC, and maybe also some WG telco?). That made me think about this issue again, and my conclusions are reflected in my comments to that issue, as linked above. Given the general sentiment it is not surprising that RDF-Concepts captures the "in object position only" stance, but that can't be taken as an argument in this discussion.

@rat10
Author

rat10 commented Jan 22, 2025

In SemTF, as I recall, the agreement was that the semantics abstract grammar would have triple terms (and possibly IIRC literals, which arise in meta-modelling in RDF 1.0/1.1) in the subject position. This is symmetric RDF.

I understood some comments in recent discussions as saying that concrete syntaxes might go a different route. That would be one thing, although I wouldn't like e.g. Turtle 1.2 to forbid triple terms in subject position. However, it seems to me to be an entirely more drastic step if RDF Concepts disallows triple terms in subject position. Then no RDF-conformant syntax would be allowed to have them in subject position. That IMO goes way too far.

The RDF Data Model would have triple terms only in the object position, as would concrete syntaxes. This allows for some future time when conditions from RDF Data Model might be relaxed. Removing a feature is more disruptive to deployment data.

Removing a feature in RDF is certainly a very difficult if not almost impossible thing to do, at least if it's not outright broken. However, I think practice has shown that loosening restrictions can be just as hard, as the cost of updating code bases and processing chains can be perceived as being too high. Just look at the experience with literals in subject position. With something as basic as dis/allowing a certain type of term in a certain position of a triple the cost can indeed be quite high. So I fear we have to get it right now and can't just defer the decision for some future deliberations.

@afs
Contributor

afs commented Jan 22, 2025

loosening restrictions can be just as hard

Not on the web. "The web is not versioned". Old data does not get updated nor go away.

We have a mechanism to add features after the end of the active charter. We do not have a mechanism to remove features. It isn't about code - data on the web does not follow the planned migration cycles that deployed code does.

RDF/XML is used. Use cases for triple terms-as-subject are not substantial (I agree with the observation that it seems to occur as implications of entailment). Given the time available, a last minute, untested change to RDF/XML is a risk.

Time on RDF/XML is better used by considering rdf:ID where we do have evidence of usage.

@william-vw

In my understanding, we have a consensus to recommend that triple-terms SHOULD only be used as the object of rdf:reifies. This is captured in this note in RDF-Concepts.

This seemed to be the majority position for a long time (and I agreed for a long time), but there was also long-standing opposition, so I never thought we really had consensus on this question. @william-vw was most vocal and rather adamant about this issue at the end of 2024, and he was not alone when we discussed this in meetings (Semantics TF telcos IIRC, and maybe also some WG telco?). That made me think about this issue again, and my conclusions are reflected in my comments to that issue, as linked above. Given the general sentiment it is not surprising that RDF-Concepts captures the "in object position only" stance, but that can't be taken as an argument in this discussion.

Thank you for tagging me in this @rat10, as I have not been able to follow the discussions closely (due to a grueling teaching schedule this term). From my recollection there has indeed not been a consensus (at least, not in the meetings I was able to attend).

For instance, I recall one TF meeting in late 2024 where the consensus leaned towards allowing them in the subject position - more generally, allowing usage patterns beyond reification. I believe it was the same meeting where Thomas announced that he dropped his opposition, as the occurrence vs. type problem is much broader than triple terms. In later meetings, I recall more opposition from Enrico and Niklas.

@rat10
Author

rat10 commented Jan 22, 2025

loosening restrictions can be just as hard

Not on the web. "The web is not versioned". Old data does not get updated nor go away.

We have a mechanism to add features after the end of the active charter. We do not have a mechanism to remove features. It isn't about code - data on the web does not follow the planned migration cycles that deployed code does.

As I argued above, and I can just repeat myself here, if we forbid triple terms in subject position now we will have very little chance to add them later. Practically none, see e.g. the experience with literals in subject position, or attempts to improve the named graphs machinery. I wouldn't be surprised at all if you were the very first to say "no" to any such endeavor, because of the expenditure it would incur.

RDF/XML is used. Use cases for triple terms-as-subject are not substantial (I agree with the observation that it seems to occur as implications of entailment).

Use cases are substantial, as I outlined in my comment, and I'd like to remind you that, against my vivid opposition at the time, you and the other editors were convinced for years that they are even the all-important use case. Maybe I'll answer separately in more detail to Niklas' comment where the "observation" is first stated, or maybe it's just not worth anybody's time. In any case, as long as not the slightest effort is made to comment on the very real examples that I gave, it is hard to qualify this as even an "observation".

Given the time available, a last minute, untested change to RDF/XML is a risk.

I'm not inclined to take potential problems with RDF/XML as a decisive argument. That notwithstanding, we're not talking about a specific syntax here, but about RDF concepts, i.e. the data model. If the data model doesn't support triple terms in subject position, then no syntax can. If the data model allows it, RDF/XML may well not support that (RDF syntaxes differ in their expressiveness).

Time on RDF/XML is better used by considering rdf:ID where we do have evidence of usage.

If we spend time on RDF/XML at all, I might agree w.r.t. RDF/XML, but we are discussing the data model here. And I'm quite averse to making time constraints an argument in questions we just have to decide, one way or the other. This is not an additional feature that we may or may not add, time permitting. This is a decision that will at least be very hard to reverse (if we do not allow them).

@TallTed
Member

TallTed commented Jan 23, 2025

@niklasl — "This symmetry is also supported in SPARQL 1.2 (where literals are allowed as subjects since at least 1.1)."

That's inaccurate. "Generalized RDF", which allows literals as subjects, is discussed briefly in 1.1, as a non-normative NOTE. Literals are not generally allowed as subjects in RDF, in any extant version.

I have some worries that we'll get substantial pushback upon CR if we permit literals as subjects in 1.2 as anything other than the still-mostly-ephemeral Generalized RDF 1.1.

@niklasl
Contributor

niklasl commented Jan 23, 2025

@niklasl — "This symmetry is also supported in SPARQL 1.2 (where literals are allowed as subjects since at least 1.1)."

That's inaccurate. "Generalized RDF", which allows literals as subjects, is discussed briefly in 1.1, as a non-normative NOTE. Literals are not generally allowed as subjects in RDF, in any extant version.

Of course not in RDF 1.1. I only meant SPARQL 1.1, where this is valid:

prefix : <http://example.org/ns#>
select * where { "l" ^:p ?s }

In fact, even this is:

prefix : <http://example.org/ns#>
insert { "l" :invp ?s } where { ?s :p "l" }

(See uses of VarOrTerm and what it permits. Also verified at sparql.org and with RDFLib.)
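
To make the first query concrete: an inverse path `"l" ^:p ?s` asks for the same solutions as `?s :p "l"`, so the literal only appears as a subject in the query, never in the data. The following is a minimal sketch in plain Python, assuming terms are modelled as tagged tuples; the helper `match_inverse` and the sample graph are hypothetical and do not reflect how any SPARQL engine is implemented.

```python
# Terms are modelled as plain tuples: ("iri", value) or ("literal", value).
EX = "http://example.org/ns#"
P = ("iri", EX + "p")

# Sample data (an assumption for illustration): literals occur only as objects.
graph = {
    (("iri", EX + "a"), P, ("literal", "l")),
    (("iri", EX + "b"), P, ("literal", "l")),
    (("iri", EX + "c"), P, ("literal", "other")),
}


def match_inverse(graph, start, predicate):
    """Solutions for ?s in `start ^predicate ?s`.

    The inverse path is evaluated by swapping the roles of subject and
    object, i.e. by looking for triples (?s, predicate, start).
    """
    return sorted(s for (s, p, o) in graph if p == predicate and o == start)


solutions = match_inverse(graph, ("literal", "l"), P)
```

Here `solutions` contains the two IRIs whose `:p` value is `"l"`, even though no triple in `graph` has a literal subject.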

There is an open issue about categorizing test cases for symmetric or generalized RDF to make this clearer.

I have some worries that we'll get substantial pushback upon CR if we permit literals as subjects in 1.2 as anything other than the still-mostly-ephemeral Generalized RDF 1.1.

I wholly agree. My argument is that this should not be allowed, other than in the non-normative symmetric and generalized extensions of RDF. Neither for literals nor for triple terms. (But I also note that the case may be different for SPARQL, to cater for entailed triples.)

@afs
Contributor

afs commented Jan 23, 2025

See the quote in https://www.w3.org/TR/sparql12-query/#sparqlTriplePatterns

SPARQL has a more general subject slot because of variables. This is SPARQL 1.0, before property paths.

{ :s :p ?Z. 
  ?Z :q :w }

It can't be a syntactic restriction.

(Predicates in syntax and in the algebra are restricted to IRI or variable, that was considered "helpful" - YMMV - but the variable there is not restricted.)

Even at 1.0, the idea of putting in an evaluation condition at every point to forbid literals as subjects, literals as predicates was daunting/unhelpful and maybe impossible. Instead, the spec notes that they don't match in triple patterns and CONSTRUCT rejects non-triples.
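
The two mechanisms mentioned here can be sketched as follows, assuming a toy evaluator: a variable bound to a literal simply finds no match when it is reused in subject position, and a CONSTRUCT-style step drops instantiated templates that are not well-formed triples. All names (`match_pattern`, `evaluate_bgp`, `construct`) and the sample data are hypothetical, and the code is a simplification rather than the SPARQL algebra.

```python
EX = "http://example.org/ns#"
S, X, W = ("iri", EX + "s"), ("iri", EX + "x"), ("iri", EX + "w")
P, Q, INVP = ("iri", EX + "p"), ("iri", EX + "q"), ("iri", EX + "invp")
V = ("literal", "v")

# Sample data (assumption): :s :p "v" . :s :p :x . :x :q :w .
graph = {(S, P, V), (S, P, X), (X, Q, W)}


def is_var(term):
    # Variables are plain strings such as "?Z"; RDF terms are tagged tuples.
    return isinstance(term, str) and term.startswith("?")


def match_pattern(graph, pattern, binding):
    """Yield extensions of `binding` under which `pattern` equals a graph triple."""
    for triple in graph:
        new = dict(binding)
        for pat_term, term in zip(pattern, triple):
            if is_var(pat_term) and pat_term in new:
                pat_term = new[pat_term]       # substitute an existing binding
            if is_var(pat_term):
                new[pat_term] = term           # bind a fresh variable
            elif pat_term != term:
                break                          # mismatch: try the next triple
        else:
            yield new


def evaluate_bgp(graph, patterns):
    """Join the patterns left to right, with no position checks on bound terms."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [ext for b in solutions for ext in match_pattern(graph, pattern, b)]
    return solutions


def construct(template, solutions):
    """Instantiate `template` per solution, skipping results that are not triples."""
    out = set()
    for b in solutions:
        s, p, o = (b.get(t, t) if is_var(t) else t for t in template)
        if is_var(s) or is_var(p) or is_var(o):
            continue                           # unbound variable
        if s[0] == "literal" or p[0] != "iri":
            continue                           # literal subject or non-IRI predicate
        out.add((s, p, o))
    return out


# { :s :p ?Z . ?Z :q :w } : ?Z = "v" survives the first pattern only.
joined = evaluate_bgp(graph, [(S, P, "?Z"), ("?Z", Q, W)])
# Template { ?Z :invp :s } over the solutions of { :s :p ?Z }.
built = construct(("?Z", INVP, S), evaluate_bgp(graph, [(S, P, "?Z")]))
```

In this sketch nothing forbids the literal binding up front. It drops out of `joined` because the data contains no triple with a literal subject, and it is filtered from `built` only at the point where a triple would be produced.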
