Re: [rdf4j-dev] sh:targetShape discussion on SHACL mailing list

Thanks Havard,

I haven't followed this in detail but my understanding of the discussion is that your proposed approach is actually less complex/expressive than having a full SPARQL query as the target specification. So I find it rather baffling that other SHACL devs would worry about the performance of this, and shoot it down on those grounds.

This sounds like one of those cases where it needs to be done in practice first and demonstrated as feasible before convergence can be reached. That's the polite way of saying: you've done your part in trying to build consensus, so I'm absolutely fine with us now just rolling our own.

Do you have any thoughts on how you want to document/publish this?

Cheers,

Jeen

On Thu, Jul 9, 2020, at 18:03, Håvard Ottestad wrote:
Hi everyone,

There have been some discussions about a new feature that RDF4J (and Ontotext) have proposed for the SHACL AF Note (the SHACL Advanced Features W3C Working Group Note, i.e. not the actual SHACL standard).

The feature allows users to much more precisely specify the targets that they want to validate with a constraint. Targets are the SHACL way of saying which nodes in the RDF graph should be validated against a constraint. Up until now users have only been able to use the following targets:

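 - an explicit list of nodes (sh:targetNode)
 - range, i.e. the objects of a given property (sh:targetObjectsOf)
 - domain, i.e. the subjects of a given property (sh:targetSubjectsOf)
 - instances of a class (sh:targetClass)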

The SHACL AF Note also allows users to specify SPARQL queries as targets, which adds a lot more flexibility and precision.

Our proposal was to complement SPARQL queries with the use of SHACL Shapes as target descriptions.

Example:

# 1. Norwegians must have exactly one norwegianID
ex:NorwegianShape a sh:NodeShape;
  sh:targetShape [sh:path ex:nationality; sh:hasValue ex:Norway];
  sh:property [sh:path ex:norwegianID; sh:minCount 1; sh:maxCount 1];
.
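To make the intent concrete, here is a small made-up data graph showing which nodes the shape above would pick up. The ex: namespace IRI, the node names and the ID values are assumptions for illustration only, and the behaviour described in the comments reflects the proposed reading of sh:targetShape (a node is a target if it conforms to the target shape), not anything in the SHACL standard.

```turtle
@prefix ex: <http://example.com/ns#> .

# Targeted (nationality is ex:Norway) and valid: exactly one norwegianID.
ex:kari ex:nationality ex:Norway ;
  ex:norwegianID "ID-1" .

# Targeted but invalid: no norwegianID, so sh:minCount 1 is violated.
ex:ola ex:nationality ex:Norway .

# Targeted but invalid: two norwegianID values, so sh:maxCount 1 is violated.
ex:nora ex:nationality ex:Norway ;
  ex:norwegianID "ID-2", "ID-3" .

# Not targeted: does not conform to the target shape, so the
# norwegianID constraint is never checked for this node.
ex:sven ex:nationality ex:Sweden .
```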

The biggest benefit of this approach is that our SHACL implementation is in charge of deciding how to use the target shape to retrieve the targets. This means that our transactional SHACL validation is able to analyse the changes in the transaction and only validate the subset of data that would be affected by said transaction.

SPARQL targets, on the other hand, are much harder to handle. To be able to do transactional validation we would have to implement a new SPARQL query engine that can run a query against both a database and a transaction to efficiently retrieve only those results that have now been modified or added. I think that is infeasible for us to do.
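For comparison, this is roughly how the same Norwegian target could be written as a SPARQL-based target under the SHACL AF Note. It is a sketch only: the shape name ex:NorwegianSparqlShape and the ex: namespace IRI are assumptions for illustration, and full IRIs are used inside the query so that no separate prefix declarations are needed.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.com/ns#> .

ex:NorwegianSparqlShape a sh:NodeShape ;
  # The target is an opaque query: every binding of ?this becomes a focus node.
  sh:target [
    a sh:SPARQLTarget ;
    sh:select """
      SELECT ?this
      WHERE {
        ?this <http://example.com/ns#nationality> <http://example.com/ns#Norway> .
      }
    """ ;
  ] ;
  sh:property [ sh:path ex:norwegianID ; sh:minCount 1 ; sh:maxCount 1 ] .
```

The constraint is the same as before. What changes is that the target is now a query string that a validator has to execute rather than a shape it can inspect, which is the part that is hard to combine with transactional validation.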

Our SHACL implementation, ShaclSail, is unique in offering transactional validation. We scale with the size of the transaction, instead of the size of the database. A small transaction takes the same amount of time to validate on a database with 100 triples as it would on one with 100 million triples. This means that we are the only implementation that can truly scale for users that have a transactional workload.

As this email is getting fairly long I'll get to the point.

Holger (from TopQuadrant) does not want to approve our proposal because he is unable to achieve adequate performance with sh:targetShape. This is not due to an inherent complexity in target shapes that limits their performance, as I have countered each of his slow examples with a fast and efficient implementation.

Nevertheless, we are now at a crossroads. No one from the SHACL community has voiced their support for our proposal, so it will not be added to the SHACL AF Note. Our only option, as I see it, is to create our own SHACL extension (much like TopQuadrant has done with their own extension, DASH).

Cheers,
Håvard




