531 extend symbol predicates #997

Merged: 15 commits merged on Nov 8, 2023

Conversation

uscholdm
Contributor

@uscholdm uscholdm commented Oct 29, 2023

Fixes #531 .

  • Deprecated predicates: gist:unitSymbol, gist:unitSymbolHtml and gist:unitSymbolUnicode.
  • Added predicates: gist:symbol, gist:symbolHtml and gist:symbolUnicode.
  • Added symbol triples for existing units, anticipating the removal of gist:UnitSymbol.
  • Added nonConformingLabel triple for symbolUnicode in gistValidationAnnotations.ttl

Collaborator

@rjyounes rjyounes left a comment

I've suggested what I think are more precise definitions; they could still use more work. We should also settle on whether to put letters and symbols in single quotes or not in the annotations, and apply this consistently.

@@ -0,0 +1,5 @@
### Minor Updates

- Deprecated `gist:unitSymbol`, `gist:unitSymbolHtml` and `gist:unitSymbolUnicode`. Issue [#531](https://github.com/semanticarts/gist/issues/531).
Collaborator

Since these are all one issue and a set of interrelated changes, there should be one primary bullet point and sub-bullets.

Contributor Author

Good idea.

uscholdm and others added 7 commits October 30, 2023 09:24
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Co-authored-by: Rebecca Younes <rebecca.younes@semanticarts.com>
Contributor Author

@uscholdm uscholdm left a comment

Recommended changes made.

@uscholdm uscholdm requested a review from rjyounes October 30, 2023 13:47
@uscholdm
Contributor Author

uscholdm commented Oct 30, 2023

@rjyounes There is a failed check that I cannot track down. I looked in the ontologyShapes.ttl and property_type_construct.rq files and found no message text that matched. The property symbolUnicode is very much like symbol and symbolHtml. Could it be tripping over the Unicode symbols? That's the only thing that seems different enough to cause a problem.

image

Collaborator

@rjyounes rjyounes left a comment

I noticed that the rdfs:domain assertions didn't get removed from the new predicates.

gistCore.ttl Outdated

gist:symbolHtml
a owl:DatatypeProperty ;
rdfs:domain gist:UnitOfMeasure ;
Collaborator

Suggested change (delete this line):
rdfs:domain gist:UnitOfMeasure ;

gistCore.ttl Outdated

gist:symbolUnicode
a owl:DatatypeProperty ;
rdfs:domain gist:UnitOfMeasure ;
Collaborator

Suggested change (delete this line):
rdfs:domain gist:UnitOfMeasure ;

gistCore.ttl Outdated
@@ -3840,9 +3852,40 @@ gist:startDateTime
;
.

gist:symbol
a owl:DatatypeProperty ;
rdfs:domain gist:UnitOfMeasure ;
Collaborator

Suggested change (delete this line):
rdfs:domain gist:UnitOfMeasure ;

@rjyounes
Collaborator

@uscholdm The new symbol predicates don't conform to this shape:

gshapes:PropertyShape
	a sh:NodeShape ;
	skos:prefLabel "Property Shape"^^xsd:string ;
	sh:or (
		gshapes:LowerCase
		gshapes:NonConformingLabel
	) ;
	sh:property gshapes:MandatoryDefinition ;
	sh:targetClass
		owl:AnnotationProperty ,
		owl:DatatypeProperty ,
		owl:ObjectProperty
		;
	.

You should put these into the gistValidationAnnotations.ttl file; you'll see an example there.

BTW, I also noticed that the rdfs:domain assertions haven't yet been removed from the new predicates.

@uscholdm
Contributor Author

uscholdm commented Oct 30, 2023

The new symbol predicates don't conform to this shape:

gshapes:PropertyShape
	a sh:NodeShape ;
	skos:prefLabel "Property Shape"^^xsd:string ;
	sh:or (
		gshapes:LowerCase
		gshapes:NonConformingLabel
	) ;
	sh:property gshapes:MandatoryDefinition ;
	sh:targetClass
		owl:AnnotationProperty ,
		owl:DatatypeProperty ,
		owl:ObjectProperty
		;
	.

The property mentioned in the error message as being invalid is:

gist:symbolUnicode
	a owl:DatatypeProperty ;
	rdfs:range xsd:string ;
	skos:definition "The Unicode symbol for something."^^xsd:string ;
	skos:example "For square meter (ASCII 'm^2'), the Unicode symbol for 'm' followed by the Unicode symbol for superscript 2, i.e., 'U+006D U+00B2'"^^xsd:string ;
	skos:prefLabel "symbol Unicode"^^xsd:string ;
	gist:domainIncludes gist:UnitOfMeasure ;
	.
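As a side check on the skos:example above, the two code points it names can be reproduced with a short Python sketch. The variable names are illustrative only and are not part of gist.

```python
# Build the square-meter symbol from the two code points named in the example.
symbol = "\u006D\u00B2"  # 'm' followed by superscript two

# List each character's code point in the same 'U+XXXX' notation.
code_points = " ".join(f"U+{ord(ch):04X}" for ch in symbol)

print(symbol, code_points)
```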

The label is not non-conforming. I tested it against the regex for gshapes:LowerCase at Regex101 (see below). So, the sh:or clause is obeyed. There is also a skos:definition, so gshapes:MandatoryDefinition is obeyed. I am still not seeing the problem.

image

@rjyounes
Collaborator

rjyounes commented Oct 30, 2023

The regex in the gshapes:LowerCase constraint is:

^([a-z]+|[A-Z][A-Z]+)([- ]([a-z]+|[A-Z][A-Z]+|[0-9]+))*$

^ - beginning of string
[a-z]+ - 1 or more lowercase letters, or
[A-Z][A-Z]+ - 2 or more uppercase letters (i.e., an acronym like UNRWA)

([- ]([a-z]+|[A-Z][A-Z]+|[0-9]+))* =
0 or more of: (space or hyphen, followed by (1 or more lowercase letters, or 2 or more uppercase letters, or 1 or more digits))
This is meant to accommodate things like "ISBN-11".

$ - end of string

"symbol Unicode" doesn't conform. "symbol HTML" does conform.

In your regex tool, doesn't "no match" mean the input string does not match the regex? That is the result we expect.
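To try the labels from this thread against the pattern outside the SHACL check, here is a minimal Python sketch. It assumes Python's `re` module treats this simple pattern the same way the SHACL validator does; the helper name `conforms` is hypothetical.

```python
import re

# Pattern copied verbatim from the gshapes:LowerCase constraint quoted above.
LOWER_CASE_LABEL = re.compile(
    r"^([a-z]+|[A-Z][A-Z]+)([- ]([a-z]+|[A-Z][A-Z]+|[0-9]+))*$"
)


def conforms(label: str) -> bool:
    """Return True if the label matches the lowercase-label pattern."""
    return LOWER_CASE_LABEL.match(label) is not None


# Labels discussed in this thread, plus the ISBN-style case.
for label in ["symbol", "symbol HTML", "symbol Unicode", "ISBN-11"]:
    print(f"{label!r}: {conforms(label)}")
```

"symbol Unicode" fails because "Unicode" is neither all lowercase nor a run of two or more capitals, so under the sh:or in gshapes:PropertyShape it has to be covered by the non-conforming-label annotation instead.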

@uscholdm
Contributor Author

Thanks for that explanation. I misread the regex. Is the regex really what we want? Why rule out capitalized words? They are not that unusual.

@uscholdm uscholdm requested a review from rjyounes October 31, 2023 02:17
@rjyounes
Collaborator

rjyounes commented Nov 7, 2023

@uscholdm The PR has been approved but has conflicts with develop that must be resolved before merging.

@uscholdm uscholdm requested a review from rjyounes November 8, 2023 01:57
@rjyounes rjyounes merged commit 82c787c into develop Nov 8, 2023
@rjyounes rjyounes deleted the 531-extend-symbol-predicates branch November 8, 2023 14:08
Development

Successfully merging this pull request may close these issues.

Extend unit of measure symbol predicates to things other than units of measure
2 participants