Node Expressions : base definition and evaluation #322

Open
afs opened this issue Mar 12, 2025 · 18 comments

@afs
Contributor

afs commented Mar 12, 2025

The current (2025-03-12) definition of Node Expression in SHACL Core reads as:

Each node expression function has an IRI as its function name.
Node expression functions can declare one or more node expression parameters.
Each of these parameters has an IRI.

as if the only way to invoke a function is by its function name which isn't true (IRI expression, literal expression).

Alternative:

A node expression is one of

  • A function called by its function name
  • An IRI expression
  • A literal expression

and the definitions of "IRI expression" and "literal expression" need changing (they can be used as direct terms or via their function name).
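A minimal Python sketch of the three-way alternative above. The class names (`IRI`, `Literal`, `FunctionCall`) and the `kind` helper are hypothetical and only illustrate the proposed case split; they are not taken from the SHACL drafts.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class IRI:
    """An IRI used directly as a node expression (hypothetical model)."""
    value: str


@dataclass(frozen=True)
class Literal:
    """A literal used directly as a node expression (hypothetical model)."""
    lexical: str


@dataclass(frozen=True)
class FunctionCall:
    """A function invoked by its function name, which is an IRI."""
    function_name: IRI
    arguments: Tuple["NodeExpression", ...] = ()


NodeExpression = Union[IRI, Literal, FunctionCall]


def kind(expr: NodeExpression) -> str:
    """Return which of the three proposed alternatives an expression is."""
    if isinstance(expr, FunctionCall):
        return "function call"
    if isinstance(expr, IRI):
        return "IRI expression"
    if isinstance(expr, Literal):
        return "literal expression"
    raise TypeError(f"not a node expression: {expr!r}")
```

With this split, an IRI is either a direct term or the name of a function call, depending on where it appears in the structure.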

Evaluation is defined as eval(expr, activeGraph, scope)

How do parameters and positional arguments, e.g. [ fn:strlen ( someWayToGetInputFocusNode 2 3 ) ], get passed to the evaluation step?

  1. What goes in scope?
  2. Do default properties modify the active graph during evaluation?
  3. Positional arguments are not in the node expression definition.
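
One possible reading of eval(expr, activeGraph, scope), sketched in Python. Everything here is an assumption made for illustration: the tuple encoding of expressions, the `FUNCTIONS` registry, and the choice that scope is a plain mapping from variable names to terms that carries the focus node. The questions above remain open; this only shows how positional arguments could be evaluated recursively and handed to the function.

```python
from typing import Callable, Dict, List, Set, Tuple

Term = str                            # stand-in for an RDF term
Graph = Set[Tuple[Term, Term, Term]]  # stand-in for the active graph
Scope = Dict[str, Term]               # assumption: variable name -> term

# Hypothetical registry: function name -> callable over evaluated arguments.
FUNCTIONS: Dict[str, Callable[..., Term]] = {
    "fn:strlen": lambda s: str(len(s)),
}


def eval_expr(expr, active_graph: Graph, scope: Scope) -> List[Term]:
    """Evaluate an expression to a list of output terms."""
    if isinstance(expr, str):
        # A constant evaluates to itself.
        return [expr]
    if expr[0] == "var":
        # Assumption: values such as the focus node are looked up in scope.
        return [scope[expr[1]]]
    if expr[0] == "call":
        _, name, args = expr
        # Positional arguments are evaluated first, in order.
        evaluated = [eval_expr(a, active_graph, scope) for a in args]
        # Assumption: each argument yields exactly one term.
        terms = [values[0] for values in evaluated]
        return [FUNCTIONS[name](*terms)]
    raise TypeError(f"unknown expression: {expr!r}")
```

For example, `eval_expr(("call", "fn:strlen", [("var", "focusNode")]), set(), {"focusNode": "Aldi"})` passes the focus node in through scope. The active graph is threaded through but not consulted in this sketch.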
@afs afs changed the title Node Expressions : definition and evaluation Node Expressions : base definition and evaluation Mar 12, 2025
@afs afs added Core For SHACL 1.2 Core spec Needs Group Input/Decision Node Expressions For SHACL 1.2 Node Expressions labels Mar 12, 2025
@HolgerKnublauch
Contributor

We should also revisit the name "function". I think it would be better to reserve function to how the term is used in SPARQL, for a single function IRI with an ordered list of parameters. I think for the more general concept, the term "type" isn't that bad after all, because some node expressions COULD use it as rdf:type.

@afs
Contributor Author

afs commented Mar 12, 2025

That would be better as "SPARQL function". The word "function" is the mathematical concept of a mapping from domain to range, which implies that the arguments are evaluated before the function call is made, to get elements in the domain.

There are also "functional forms", e.g. ||, IF or BOUND(?x). Naming here is more varied; SPARQL uses "functional form".
Their arguments are not evaluated before the call (cf. macros).

Functions are completely defined by their arguments. Some things look like "functions" but aren't e.g. BNODE, UUID, EXISTS.
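
The difference between a function and a functional form can be sketched in Python with arguments modelled as thunks. The names `arg`, `call_function` and `coalesce` are hypothetical, and `None` is used as a stand-in for an unbound or erroring argument; this illustrates evaluation order only, not SPARQL's actual error semantics.

```python
from typing import Callable, List, Optional

Thunk = Callable[[], Optional[str]]

evaluated: List[str] = []  # records which arguments were actually evaluated


def arg(name: str, value: Optional[str]) -> Thunk:
    """Wrap an argument so that evaluating it is observable."""
    def thunk() -> Optional[str]:
        evaluated.append(name)
        return value
    return thunk


def call_function(fn: Callable[..., Optional[str]], *thunks: Thunk) -> Optional[str]:
    """Function: every argument is evaluated before the call is made."""
    values = [t() for t in thunks]
    return fn(*values)


def coalesce(*thunks: Thunk) -> Optional[str]:
    """Functional form: evaluate left to right, stop at the first usable value."""
    for t in thunks:
        value = t()
        if value is not None:
            return value
    return None
```

Calling `coalesce(arg("a", None), arg("b", "x"), arg("c", "y"))` never evaluates the third argument, whereas `call_function` evaluates all three before the wrapped function runs.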

@HolgerKnublauch
Contributor

Ok taking this further, what about these kinds of node expressions

  • Constants (literals, IRIs, triple terms?)
  • Functional expressions (blank node with a single predicate, which is the function IRI that points to the list of arguments). Functions can only ever output a single node. I don't think it's necessary to say that all arguments are evaluated beforehand, for example a COALESCE could evaluate on demand.
  • Non-functional expressions (blank node with a key parameter and other optional params). Can produce any list of output nodes.

Maybe there is a better term than non-functional, but then, why not.

Each node expression has a type, which for functions is the function IRI, for non-functions is the optional rdf:type.

@HolgerKnublauch HolgerKnublauch self-assigned this Mar 12, 2025
@tpluscode
Contributor

> We should also revisit the name "function"

FWIW, I do like the term function. It may not match its mathematical definition exactly, but in a programming sense they all have some inputs and produce nodes as output, correct? Pretty functional to me.

Yes, in triples an IRI or literal are valid expressions but that is only their representation. Their implementation is just as functional, being () => IRI or () => Literal, respectively.

@afs
Contributor Author

afs commented Mar 13, 2025

Constants can be considered as functions of zero arguments.
But if the syntax has them in natural form (which is the case at the moment and is the bridge to SHACL 1.0), that has to be put in specially.
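
A small Python sketch of that point: a constant behaves like a zero-argument function, but because the syntax allows bare terms, the evaluator needs one special case to lift them. The names `constant` and `lift` are hypothetical.

```python
from typing import Callable, Union

Term = str                 # stand-in for an RDF term
Expr = Callable[[], Term]  # an expression modelled as a zero-argument callable


def constant(term: Term) -> Expr:
    """A constant viewed as a function of zero arguments."""
    return lambda: term


def lift(node: Union[Term, Expr]) -> Expr:
    """The special case: a bare term in natural form is wrapped as a constant."""
    return node if callable(node) else constant(node)
```

After lifting, `lift("ex:Alice")()` and `lift(constant("ex:Alice"))()` are evaluated the same way.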

@afs
Contributor Author

afs commented Mar 13, 2025

>   • Functions .... for example a COALESCE could evaluate on demand.

COALESCE is not a function.
There's a list: https://www.w3.org/TR/sparql12-query/#func-forms

@afs
Contributor Author

afs commented Mar 13, 2025

> Ok taking this further, what about these kinds of node expressions
>
>   • Constants (literals, IRIs, triple terms?)
>   • Functional expressions (blank node with a single predicate, which is the function IRI that points to the list of arguments). Functions can only ever output a single node. I don't think it's necessary to say that all arguments are evaluated beforehand, for example a COALESCE could evaluate on demand.
>   • Non-functional expressions (blank node with a key parameter and other optional params). Can produce any list of output nodes.
>
> Maybe there is a better term than non-functional, but then, why not.
>
> Each node expression has a type, which for functions is the function IRI, for non-functions is the optional rdf:type.

It's probably better to have a special name for the function/one-return case. It's not necessary - make "tuple of list of RDF terms" the domain of functions; if one value is expected, it must be a list of one input - but that doesn't say how list-returns interact with use as NE expression arguments.

function ( NE-list1 NE-list2 )

Is that a cross-product? Pairing? Not allowed unless the function declares it takes 2 lists?
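
The two candidate semantics can be made concrete with a Python sketch. The helper names are hypothetical and the lists stand in for the outputs of NE-list1 and NE-list2; neither behaviour is specified anywhere yet.

```python
from itertools import product
from typing import Callable, List


def apply_cross_product(fn: Callable[[int, int], int],
                        xs: List[int], ys: List[int]) -> List[int]:
    """Apply fn to every combination of one element from each list."""
    return [fn(x, y) for x, y in product(xs, ys)]


def apply_pairing(fn: Callable[[int, int], int],
                  xs: List[int], ys: List[int]) -> List[int]:
    """Apply fn to elements at the same position; lengths must match."""
    if len(xs) != len(ys):
        raise ValueError("pairing requires lists of equal length")
    return [fn(x, y) for x, y in zip(xs, ys)]


def multiply(x: int, y: int) -> int:
    return x * y
```

For inputs `[1, 2]` and `[10, 20]` the cross-product reading yields four results and the pairing reading yields two, so the choice changes both the size and the content of the output list.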

  • Constants - these are called out because they have special syntax (it may avoid the need for a special quoting operator)
  • Term expressions (full name "term-valued expressions"), commonly just "expressions" - single output node
  • List expressions (full name "list-valued expressions") - list of output nodes
  • Aggregators (COUNT et al) - their argument is a list of nodes, (maybe a list of list of nodes if there is a GROUP_CONCAT like thing).

Functional forms look (syntax) like functions. They evaluate differently.
There could be a spec-fixed list of these (is there a UC for custom functional forms?) so that general evaluators can be written. Then the custom signatures only define custom term expressions.

@afs
Contributor Author

afs commented Mar 13, 2025

(Putting this comment in so all Node Expression issues get mentioned on this discussion)

#311 (comment)
@ajnelson-nist mentioned sh:bind

@nicholascar nicholascar added this to the Phase 1 milestone Mar 17, 2025
@HolgerKnublauch
Contributor

Some of us (Tom, Andy and I) will meet tomorrow at 12:00 Paris time to discuss various topics related to Node Expressions. Anyone from the WG is invited to join if interested. We will in any case report on the outcome, hopefully through some PR that the WG can review.

SHACL Node Expressions
Thursday, March 20 · 12:00 – 13:00
Time zone: Europe/Rome
Google Meet joining info
Video call link: https://meet.google.com/ddz-birz-hkt
Or dial: (FR) +33 1 73 08 31 91 PIN: 921 599 371 2331#
More phone numbers: https://tel.meet/ddz-birz-hkt?pin=9215993712331


General discussion on SHACL Node Expressions including

  • Naming of concepts such as functions
  • Node expression kinds (blank nodes with either a function-style triple or parameters)
  • Parameterized node expression templates

https://github.com/w3c/data-shapes/issues/322
https://github.com/w3c/data-shapes/issues/314
https://github.com/w3c/data-shapes/issues/279
https://github.com/w3c/data-shapes/discussions/316

@HolgerKnublauch
Contributor

Some (raw) Notes from the Node Expressions meeting on March 20.

Present: @afs @simonstey @tpluscode David Habgood, @HolgerKnublauch

We agreed to keep the division of blank node syntaxes for Node Expressions. Names of these two kinds may change:

  1. Named Parameter Expressions. sh:NamedParameterExpression

[
	a sh:SelectExpression ;
	sh:select "…" ;
	sh:prefixes ...
]

[
	sh:max [
		sh:select ""
	]
]

  2. Positional Argument Expressions. sh:PositionalArgumentExpression

[
	ex:strlen ( "Aldi" )
]

[
	ex:strlen ( [
		sh:select "" ;
	] )
]

We brainstormed about how shapes graph could define new node expression functions:

ex:multiply
	a sh:PositionalArgumentExpression ;
	sh:argument [  # Probably needs a different term
		sh:varName "op1" ;   # Use varName instead of sh:path
		sh:datatype xsd:decimal ;
		sh:order 0 ;
	] ;
	sh:argument [
		sh:varName "op2" ;   # Second operand, referenced as $op2 below
		sh:datatype xsd:decimal ;
		sh:order 1 ;
	] ;
	sh:expression [
		sh:eval "$op1 * $op2" ;
	] ;
	sh:returnType xsd:decimal .

ex:Shape sh:targetNode [ ex:multiply ( 40  2 ) ] .
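
How positional arguments might be bound to the declared variable names can be sketched in Python. This is an illustration under assumptions: the declaration is mirrored as a plain dictionary, the two parameters are taken to be op1 and op2 (as the sh:eval string suggests), and the expression body is a Python callable standing in for sh:eval, whose semantics were not defined in the meeting.

```python
from decimal import Decimal
from typing import Any, Dict, Sequence

# Hypothetical in-memory mirror of the ex:multiply declaration.
# The parameters are listed out of order on purpose: binding relies on "order".
MULTIPLY: Dict[str, Any] = {
    "parameters": [
        {"varName": "op2", "order": 1},
        {"varName": "op1", "order": 0},
    ],
    # Stand-in for sh:eval "$op1 * $op2".
    "body": lambda scope: scope["op1"] * scope["op2"],
}


def bind_positional(declaration: Dict[str, Any],
                    args: Sequence[Decimal]) -> Dict[str, Decimal]:
    """Map positional arguments to variable names using the declared order."""
    params = sorted(declaration["parameters"], key=lambda p: p["order"])
    if len(args) != len(params):
        raise ValueError(f"expected {len(params)} arguments, got {len(args)}")
    return {p["varName"]: value for p, value in zip(params, args)}


def call(declaration: Dict[str, Any], args: Sequence[Decimal]) -> Decimal:
    """Bind the arguments into a scope, then evaluate the body in it."""
    return declaration["body"](bind_positional(declaration, args))
```

`call(MULTIPLY, [Decimal(40), Decimal(2)])` corresponds to `[ ex:multiply ( 40  2 ) ]`: the first list element is bound to op1 and the second to op2 before the body runs.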

For named parameter expressions more work is needed:

ex:MultiplyList
	a sh:NamedParameterExpression ;
	sh:parameter [
		sh:path ex:list ;
	] ;
	sh:parameter [
		sh:path ex:factor ;
		sh:datatype xsd:decimal ;
	] ;
	sh:expression [
		# How to allow these to query the arguments if they are node expressions/lists
	] .

ex:Shape sh:targetNode [
	ex:MultiplyList [
		ex:list [ sh:select "…" ] ;
		ex:factor 2 ;
	]
] .

An interesting suggestion was that we could put all these declarations into the Node Expression document itself, without delaying SHACL Core or SPARQL. All we really need to say in Core is that Constant expressions exist and that the handling of blank nodes will be defined by the Node Expr document.

@HolgerKnublauch
Contributor

On further iteration on the names, what about

  • sh:NamedParameterExpression
  • sh:ListParameterExpression

and then use sh:parameter and sh:listParameter to declare them? This avoids the term "argument", and "list" describes what is going on in the syntax quite well?

Also, I have some concerns about introducing the intermediate predicate such as ex:MultiplyList in the example at the bottom of the post above. The problem here is that this syntax would look different from the built-in named parameter expressions. Over time, it is quite possible that many people will publish new libraries of node expressions, e.g. based on SPARQL design patterns for specific ontologies such as SKOS. It should be possible to use those consistently with the built-ins. So I suggest that the user-defined expression types also declare a key parameter.

@afs
Contributor Author

afs commented Mar 24, 2025

> On further iteration on the names, what about
>
>   • sh:NamedParameterExpression
>   • sh:ListParameterExpression

Looks good - or "...ParamExpr" if the user has to write the things quite frequently.

@afs
Contributor Author

afs commented Mar 24, 2025

> So I suggest that the user-defined expression types also declare a key parameter.

Could you show an example or two?

@tpluscode
Contributor

> if the user has to write the things quite frequently.

Thinking about it now, I don't expect users will have to write that at all. In fact, it would be best if they were completely optional, like sh:PropertyShape.

They are already unambiguous by using the sh:parameter vs sh:argument although I think this is unclear and will be confusing.

I stand by defining parameters according to the type of expression (arguments being the actual values, as appears to be widely accepted).

Positional: rdf:List, which would not require sh:order
Named: as above

ex:multiply
	a sh:PositionalArgumentExpression ; # optional
-	sh:argument [  # Probably needs a different term
+	sh:parameters ( [  # Might use plural
		sh:varName "op1" ;   # Use varName instead of sh:path
		sh:datatype xsd:decimal ;
-		sh:order 0 ;
-	] ;
-	sh:argument [
+	]
+	[
		sh:varName "op2" ;
		sh:datatype xsd:decimal ;
-		sh:order 1 ;
-	] ;
+	] ) ;
	sh:expression [
		sh:eval "$op1 * $op2" ;
	] ;
	sh:returnType xsd:decimal .

ex:Shape sh:targetNode [ ex:multiply ( 40  2 ) ] .

@HolgerKnublauch
Contributor

The new PR leaves the remaining details to the Node Expr document, including the syntax for user-defined node expression types. The only thing we may need to adjust is the exprEval function, but let's keep Core lean so that this isn't blocking progress and we remain flexible.

@afs
Contributor Author

afs commented Mar 30, 2025

> The new PR leaves the remaining details to the Node Expr document, including the syntax for user-defined node expression types.

@simonstey , @robert-david , @recalcitrantsupplant -- what do you think of this?

@simonstey
Contributor

I think that's a good thing as we wouldn't have to rush a half-baked spec. for NExpr out fast for phase 1, but can flesh it out in phase 2.

@recalcitrantsupplant

I think it's sensible. I like the direction of the changes so far but would like more time to play with examples.
