Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ambiguous behaviour of additionalProperties when invalid #1172

Closed
ItsVeryWindy opened this issue Jan 13, 2022 · 38 comments · Fixed by #1203
Closed

ambiguous behaviour of additionalProperties when invalid #1172

ItsVeryWindy opened this issue Jan 13, 2022 · 38 comments · Fixed by #1203

Comments

@ItsVeryWindy
Copy link

I was testing a schema and getting an additionalProperties error that didn't make much sense to me as the name of the property was in the schema.

So I put together a smaller schema to reproduce it.

{
    "type": "object",
    "properties": {
        "test": {
            "type": "string"
        }
    },
    "additionalProperties": false
}

I tried testing it against this piece of json.

{
  "test": 1
}

I tried it against a few of the online validators and depending on which implementation I tried the errors that I would get are different, sometimes I would just get the error about test not being a string, other times I would also get an error about no additional properties.

Reading http://json-schema.org/understanding-json-schema/reference/object.html#additional-properties

Specifically around properties whose names are not listed in the properties keyword or match any of the regular expressions in the patternProperties keyword implies that only the property name is important, not if it's valid or not.

I know that some of these implementations use annotations in order to determine if a property is additional or not, and those annotations will get dropped if any properties are invalid as referenced here, which is probably why I'm seeing what I am.

#939

Is there any expectations as to how additionalProperties is meant to work in these instances?

@karenetheridge
Copy link
Member

karenetheridge commented Jan 13, 2022

That sounds like the issue raised in json-schema-org/community#57 and addressed for the next interim release in #1154.

@Relequestual
Copy link
Member

I’m wondering if we read json-schema-org/community#57 incorrectly.

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7.7.1.2

The annotation result of [the properties] keyword is the set of instance property names matched by this keyword.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.1

In addition to possibly defining annotation results of their own, applicator keywords aggregate the annotations collected in their subschema(s) or referenced schema(s).
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7.7.1.3

I read this as, the properties keyword makes its own annotations, which are not effected by the assertion results of the schema values of its object. The same is true for patternProperties.

@jviotti
Copy link
Member

jviotti commented Feb 24, 2022

@Relequestual If I understand your point correctly, you are saying that properties should emit annotations that additionalProperties can use for its own behavior even if some of the subschemas do not match?

Consider the following schema:

{
    "properties": {
        "foo": true,
        "bar": false
    }
}

That schema has the following subschemas:

  • #: The top schema
  • #/properties/foo: Always produces a true assertion
  • #/properties/bar: Always produces a false assertion

So at this point, based on the following:

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7.7.1.2

We infer that #/properties/foo produces an annotation result, but #/properties/bar does not. Whether # produces an annotation result depends on the input. For example:

  • {"foo":1}: The schema # validates successfully and the properties annotation is [ "foo" ]
  • {}: The schema # validates successfully and the properties annotation is []
  • {"bar":1}: The schema # DOES NOT validate successfully, so no annotations are emitted

Following the same logic:

  • {"foo": 1, "bar":1}: The schema # DOES NOT validate successfully, so no annotations are emitted

So one way to read this:

In addition to possibly defining annotation results of their own, applicator keywords aggregate the annotations collected in their subschema(s) or referenced schema(s).
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7.7.1.3

is the following: applicators may define extra annotations than the sum of their subschemas, but must emit nothing if the applicator does not validate successfully.

However, what about not? not is an applicator that by definition emits annotations DESPITE its subschema not validating successfully, which seems to violate the rule that says that "Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas."

Therefore, an applicator seems to have a way to aggregate annotations from their subschemas, even if some fail. It follows that properties could have a behavior that aggregates annotations despite some of their subschemas failing.

However, there is still a key difference between not and properties: not succeeds when it's subschema fails, while properties only succeeds if ALL its subschemas are successful.

@jviotti
Copy link
Member

jviotti commented Feb 24, 2022

I think oneOf is another interesting case to consider, as it is an applicator that emits annotations when one of its subschemas passes while the rest fail. anyOf also emits annotations when at least one of its subschemas pass while others may not.

@jviotti
Copy link
Member

jviotti commented Feb 24, 2022

@Relequestual Actually, I think the issue reported here is valid.

I read this as, the properties keyword makes its own annotations, which are not effected by the assertion results of the schema values of its object.

The properties keyword makes its own annotations that are not affected by the assertion results of its subschemas, but properties itself expresses an assertion, and so properties can't emit annotations at all if its own assertion failed, independently of its sub assertions.

@Relequestual
Copy link
Member

The properties keyword makes its own annotations that are not affected by the assertion results of its subschemas, but properties itself expresses an assertion, and so properties can't emit annotations at all if its own assertion failed, independently of its sub assertions.

properties is an applicator keyword. It does not produce any assertions. The assertions come from applying the schemas which are the values. properties just defines HOW to apply those schemas, if at all.

So, again I come back to...

Schema objects that produce a false assertion result MUST NOT produce any annotation results...

In the defintiion of properties...

"Validation succeeds if, for each name that appears in both the instance and as a name within this keyword's value, the child instance for that name successfully validates against the corresponding schema."

In contrast, here is allOf...

"An instance validates successfully against this keyword if it validates successfully against all schemas defined by this keyword's value."

It may seem like clutching at straws, and you COULD argue that properties combines the assertion results of the applications with logical AND, but I've never thought of it that way.

karenetheridge added a commit to karenetheridge/JSON-Schema-Modern that referenced this issue Feb 25, 2022
karenetheridge added a commit to karenetheridge/JSON-Schema-Modern that referenced this issue Feb 25, 2022
@jviotti
Copy link
Member

jviotti commented Feb 25, 2022

properties is an applicator keyword. It does not produce any assertions. The assertions come from applying the schemas which are the values. properties just defines HOW to apply those schemas, if at all.

This is what I'm a bit confused about. The spec says:

JSON Schema keywords fall into several general behavior categories. Assertions validate that an instance satisfies constraints, producing a boolean result. Annotations attach information that applications may use in any way they see fit. Applicators apply subschemas to parts of the instance and combine their results.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7

The properties keyword is an applicator that combines the assertions produced by its subschemas. The result of properties is still a boolean result, which based on the definition above, means that properties is also an assertion?

Furthermore, the section that defines properties says this:

The value of "properties" MUST be an object. Each value of this object MUST be a valid JSON Schema.
Validation succeeds if, for each name that appears in both the instance and as a name within this keyword's value, the child instance for that name successfully validates against the corresponding schema.
The annotation result of this keyword is the set of instance property names matched by this keyword.
Omitting this keyword has the same assertion behavior as an empty object.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.2.1

The above section mentions "validation succeeds if" (making it also sound like an assertion), and the final line says "Omitting this keyword has the same assertion behavior as an empty object" which implies that the presence of properties results in a different assertion, also making it sound like properties is an assertion?

I understand your point that properties just combines subschemas (in which case your premise about the current correctness of the spec holds), but I think there is a weird overlap between assertions and annotations that maybe is not well defined and leads to the confusion. There are many keywords that are "pure" annotations and do not interact with validation at all (like description) and "pure" assertions that evaluate to booleans such as type. However, keywords like properties and oneOf also evaluate to boolean results (so we can argue they perform validation/assertion) but still should not inherit the characteristics of other assertion keywords?

@gregsdennis
Copy link
Member

gregsdennis commented Feb 25, 2022

properties is an applicator keyword. It does not produce any assertions. - @Relequestual

I think the categorization of a keyword into a single bucket is wrong. I address this in my opening comment on json-schema-org/community#63 (section on annotations).

@jviotti
Copy link
Member

jviotti commented Feb 25, 2022

I think the spec explicitly mentions that keywords can fall within multiple categories:

Object properties that are applied to the instance are called keywords, or schema keywords. Broadly speaking, keywords fall into one of five categories:
...
Keywords may fall into multiple categories
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.4.3.1

@jdesrosiers
Copy link
Member

jdesrosiers commented Feb 27, 2022

properties is an applicator keyword. It does not produce any assertions.

I'm sorry @Relequestual, but that doesn't make sense. Applicators assert on the result of applying one or more schemas. For properties, if all schemas are applied successfully, then the properties assertion is true. Without that, I don't see how your new interpretation makes sense.

Here's an example. Let me know if I understand your interpretation correctly.

{
  "type": "object",
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "additionalProperties": false
}
{
  "foo": 42,
  "bar": "hello"
}

This schema has three keywords at the top level: type, properties, and additionalProperties. type evaluates to true. properties applies the "foo" schema and finds it false. properties asserts false, but produces annotation ["bar"]. Assuming annotation-based evaluation is used, additionalProperties then determines that "foo" is additional based on the properties annotation and asserts false. This makes one passing keyword and two failing keywords, so the schema asserts false and doesn't pass along any of the annotations it's keyword's produced.

Under the current interpretation, both "foo" and "bar" would both be considered additional instead of just "foo" because properties wouldn't produce any annotations. The end result is that same, one passing keyword and two failing ones, but the additionalProperties would report a little differently.

Under the draft-07 interpretation, nothing is considered additional and additionalProperties passes. The spec says that both annotation-based and draft-07 evaluation are valid, but annotation-based evaluation (both interpretations) produces two errors while draft-07 evaluation produces one error. The end result is the same, but there are two possible output results. If that's something we care about, this part of the spec will need an update of some kind even if we adopt the new interpretation.

I still have to think a little more about how the new interpretation would affect unevaluatedProperties.

@gregsdennis
Copy link
Member

gregsdennis commented Feb 27, 2022

properties asserts false, but produces annotation ["bar"] - @jdesrosiers

I think under the new interpretation, it would report foo as "evaluated" as well (even though its subschema failed, it was evaluated), so additionalProperties would still not see any properties, just like draft 7.

properties would still assert false.

@jviotti
Copy link
Member

jviotti commented Feb 27, 2022

@jdesrosiers

properties asserts false, but produces annotation ["bar"].

From what I understand from this thread, if we agree that properties is both an applicator and an assertion, then properties cannot produce any annotation while asserting false:

Schema objects that produce a false assertion result MUST NOT produce any annotation results, whether from their own keywords or from keywords in subschemas.
https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.7.7.1.2

@gregsdennis
Copy link
Member

gregsdennis commented Feb 27, 2022

With the new interpretation, all keywords produce annotations (the ones that produce them, anyway), whether asserting true or false, however a schema (specifically objects schema because boolean schemas don't annotate) drop annotations on validation failure. So the (sub)schema that contains properties wouldn't pass up annotations, but properties still does.

@karenetheridge
Copy link
Member

karenetheridge commented Feb 27, 2022

Edited transcription of my analysis and discoveries from slack:

At first I thought @Relequestual was wrong but now I'm not so sure :)

If this interpretation is (to be) correct, then additionalProperties behaves the same whether annotations are supported or not, at least.. and unevaluatedProperties will no longer evaluate at properties that were evaluated (but with an error in the subschema) via adjacent properties and patternProperties - which was the other part of my "I don't like how annotations work here" complaint, but required a change in behaviour for the next draft.
Overall, I think I prefer this interpretation. it is more consistent.

unevaluatedProperties will still evaluate at unsuccessful properties from other subschemas, however, which we can't avoid unless we redefine unevaluatedProperties to use something else other than annotation collection (I played for this a bit over the winter holidays and I failed to come up with anything not-horrible).

This would now mean that this:

{
  "properties": {
    "foo": false
  },
  "unevaluatedProperties": false
}

would now only produce one error instead of two. which is different than if we did:

{
  "$defs": {
    "foo": {
      "properties": {
        "foo": false
      }
    }
  },
  "$ref": "#/$defs/foo",
  "unevaluatedProperties": false
}

.. which is not a bad thing really - a reference being replaced with the subschema it represents
doesn't have to behave identically, it just has to produce the same pass/fail results. And this largely takes care of my complaint about unevaluatedProperties being really noisy (and given the way the OpenAPI schema is written, errors from it should be much much quieter).

I ran the acceptance tests against updated code -- there is one change in behaviour, at unevaluatedItems.json: "unevaluatedItems depends on adjacent contains" - "contains passes, second item is not evaluated". This is where there is a "contains" keyword that does not validate against an item. the test is written presuming that unevaluatedItems will evaluate at this location (and fail); if we change the use of an adjacent contains to prevent unevaluatedItems from doing anything, then the overall validation result is different.

An equivalent test for "unevaluatedProperties" isn't possible because there is no equivalent for "contains" -- if "properties" produced a true result, then it annotates, and unevaluatedProperties would not evaluate there; if it produced a false result, the overall result is still false whether or not unevaluatedProperties does anything. Once again "contains" is a problem child! :)

The diff is here (it's small!) -- https://github.com/karenetheridge/JSON-Schema-Modern/compare/ether/unevaluated-respects-local-evaluation?expand=1

..later..

hmm, actually that's only part of the necessary change. I'm not yet accounting for annotations from adjacent applicators that failed overall but still produced some annotations (and all applicators need to be updated here - allOf,anyOf,oneOf, not (!!!), too.

And making this change actually simplifies a lot of code. e.g. previously in "allOf", I had to save the new annotations (from subschemas) to a different list and then decide whether to append them onto the master list or not. now I can just append them directly, and let the next higher scope decide whether to copy what it received and append to its master list

..later..

ok, second commit added, that was straightforward.

I'm still in favour of this change. even though it requires changing one test in the acceptance suite 🙂 the test suite is not authoritative; the spec is; we have found and fixed errors in the tests before.

@gregsdennis
Copy link
Member

gregsdennis commented Feb 27, 2022

I think the change to unevaluated* is what Henry meant when he would say that it can "look inside" adjacent subschemas. This way, it kind of acts as if it's still receiving annotations through the schema boundary.

(Grasping at straws here)

@jdesrosiers
Copy link
Member

jdesrosiers commented Feb 27, 2022

@gregsdennis

I think under the new interpretation, it would report foo as "evaluated" as well (even though its subschema failed, it was evaluated), so additionalProperties would still not see any properties, just like draft 7.

Why? The spec is very clear that "evaluated" means "successfully evaluated".

The spec says that the annotation result for properties is the property names that were "matched". It's not entirely clear what "matched" means, but unevaluatedProperties is clear that it deals with not-successfully-evaluated properties. The only way that definition can hold with the new interpretation is if "matched" only includes the successfully evaluated properties.

@jviotti

From what I understand from this thread, if we agree that properties is both an applicator and an assertion, then properties cannot produce any annotation while asserting false

Either you're missing the point or I am. As I understand it, in the current interpretation, if the keyword fails, it doesn't produce annotations. The new interpretation is that the keyword can produce annotations even if it fails, but the schema that keyword is part of won't pass those annotations along because the schema fails. That means keywords such as additionalProperties can use the annotations provided by failing properties, but those annotations won't be passed further along the chain.

Consider these two schemas.

{
  "properties": {
    "foo": { "type": "string" },
    "bar": { "type": "string" }
  },
  "unevaluatedProperties": false
}
{
  "allOf": [
    {
      "properties": {
        "foo": { "type": "string" },
        "bar": { "type": "string" }
      }
    }
  ],
  "unevaluatedProperties": false
}

With instance

{
  "foo": 42,
  "bar": "hello"
}

Under the current interpretation, both schemas assert that "foo" and "bar" are unevaluated. The successful evaluation of "bar" is dropped because the properties keyword fails and doesn't produce annotations. Under the new interpretation, the first schema will assert that just "foo" is unevaluated an the second schema will assert that both "foo" and "bar" are unevaluated. In the first schema, unevaluatedProperties can see properties's annotations because they're in the same schema. In the second schema, unevaluatedProperties can't see properties's annotations because allOf won't forward annotations from it's failed schemas.

@karenetheridge

there is one change in behaviour, at unevaluatedItems.json

I don't see why this behavior should change with the new interpretation. The contains annotation should be the indexes of the schemas that successfully validate against the schema. Index 2 validates against the contains, index 0 validates against prefixItems, and index 1 is unevaluated. Assuming I'm missing something and the new interpretation does result in that test evaluating to false, doesn't that show that the new interpretation is inconsistent with the spec for unevaluatedItems? The spec is very clear that "unevaluated" means not-successfully-validated.

@Relequestual
From my understanding so far, the new interpretation is at best problematic. If the identified behavior changes are correct, it's also contradictory to the definitions of unevaluatedProperties and unevaluatedItems. Unless I'm missing something, it appears that the new interpretation is not what was intended.

@karenetheridge
Copy link
Member

With the new interpretation, all keywords produce annotations (the ones that produce them, anyway), whether asserting true or false, however a schema (specifically objects schema because boolean schemas don't annotate) drop annotations on validation failure. So the (sub)schema that contains properties wouldn't pass up annotations, but properties still does.

Yes. This actually let me remove a fair bit of code (the second commit referenced in my diff above) because applicator keywords no longer need to accumulate their subschemas' annotations in a separate list (previously they would keep the new annotations separate, until after all subschemas were evaluated and we'd know if the keyword was going to produce a valid or invalid result). Now we only need to maintain separate annotation accrual at the overall subschema evaluation level - after all keywords in that subschema have evaluated, then we decide whether to throw those new annotations away or forward them on to the previous level of evaluation recursion.

@karenetheridge
Copy link
Member

karenetheridge commented Feb 27, 2022

Why? The spec is very clear that "evaluated" means "successfully evaluated".

The important difference in interpretation is whether an invalid keyword allows adjacent keywords to see the annotations produced by its subschemas or not. If annotations are not wiped out until after all adjacent keywords have been evaluated, then those annotations can be used by other keywords when considering whether to evaluate at a certain property/item or not.

I just realized there will still be a difference in "do we evaluate at this property or not" in the case of boolean schemas -- draft7 additionalProperties would not evaluate a property foo in the presence of "properties": { "foo": false } but in draft2020-12 it would, if we only used annotations as the deciding factor. So PR #1154 is still relevant to resolve that ambiguity there.

there is one change in behaviour, at unevaluatedItems.json

I don't see why this behavior should change with the new interpretation.

It depends on whether we are just changing the way annotations are preserved within subschemas, or also changing the interpretation of "evaluated". I made the "evaluated" change first, and then the annotation-preservation change second; given the second change, the first change is rendered somewhat redundant, except for contains, and in the case of boolean schemas I described above.

Let's look more closely at the test case... The schema is:

        {
            "prefixItems": [true],
            "contains": {"type": "string"},
            "unevaluatedItems": false
        },

The data is: [ 1, 2, "foo" ]

  • prefixItems evaluates at item 0, producing an annotation of [ 0 ] and a result of true.
  • contains evaluates against all items, with evaluation results of [ false, false, true ] which results in an annotation of [ 2 ] (the indexes of all successful evaluations) and an overall result of true.
  • previously (under the popular interpretation), unevaluatedItems would only evaluate at item 1 (items 0 and 2 were already evaluated, as per the annotations produced by prefixItems and contains), and evaluate to false, rendering an overall result at this schema of false.

Now, under an interpretation of "evaluated means a subschema was applied at this location, no matter whether it returned a valid or invalid result", unevaluatedItems would not evaluate at index 1, because contains already evaluated there, producing an overall result of true. However, this means that unevaluatedItems never has an effect when used in combination with contains (whether adjacent to it in the same schema, or in another schema elsewhere that is applied to the same data location). I can see arguments both ways as to whether this is useful or not:

If we only alter our interpretation to allow for how annotations are handled in subschemas, and do not change our interpretation of "evaluated means evaluated even if not successfully", then:

  • the contains test above doesn't change (and indeed no tests in the test suite change, because no overall true/false result changes and only the number of errors produced on invalid results will change, and we don't have tests for that in the test suite yet),
  • the problem I described in resolve contradictions in the described behaviour of "additionalProperties" and "items" #1154 is still relevant, as there was still a change in how applicator keywords generated errors between draft7->draft2019-09, as it's relevant whether annotations are used in the implementation or we just look at the existence of adjacent keywords.

On the other hand, if we also alter our interpretation of evaluated to mean "evaluated means evaluated even if not successfully", then the changes I propose in #1154 are moot, and also this one contains test will change behaviour.

I'm not yet sure which one of these I prefer. I believe that option #1 could be done now, in the interim draft revision, but #2 is a bigger change and would might a new draft, as it produces actual changes in overall validation results (as shown in the contains example above, although I don't think this extends to any other keywords). I lean towards wanting the #2 change as well, for the reasons I describe in #1154 (all the unevaluated\* errors are redundant and generally unhelpful, and when there are a lot of them it clutters the results to the point of obfuscation and confusion).

@jdesrosiers
Copy link
Member

It depends on whether we are just changing the way annotations are preserved within subschemas, or also changing the interpretation of "evaluated".

A far as I can tell, @Relequestual is only suggesting that the spec says something different about annotation collection than we thought it did. I don't think he's also suggesting that we misinterpreted what "evaluated" means. Given the wording in the spec, I think the former is a reasonable question to ask, but the latter is clearly wrong.

I think the last example I presented clearly shows that the new interpretation is problematic and not what was intended. I think that for the patch release it's reasonable to change the spec to reflect the intention and what everyone interpreted it to mean.

If we want to change what "evaluated" means, we can do that in the next full release, but we need to decide what to do for the patch release first.

@gregsdennis
Copy link
Member

gregsdennis commented Feb 27, 2022

Now, under an interpretation of "evaluated means a subschema was applied at this location, no matter whether it returned a valid or invalid result", unevaluatedItems would not evaluate at index 1, because contains already evaluated there, producing an overall result of true. - @karenetheridge

This isn't quite right.

contains says

This keyword produces an annotation value which is an array of the indexes to which this keyword validates successfully when applying its subschema

contains explicitly produces annotations only for successful validations, and so it would not produce a [1] annotation and unevaluatedItems would catch that item.

properties, on the other hand, says

The annotation result of this keyword is the set of instance property names matched by this keyword.

which could be interpreted to mean "the properties it declares." With this, unevaluatedProperties would skip over all properties listed in in properties whether their subschemas passed or failed, which is what users seem to expect.

The spec is very clear that "evaluated" means "successfully evaluated". - @jdesrosiers

Where is this? I can't find it. There are, however, many places where the spec explicitly states "successfully evaluated" or "evaluated successfully," implying that "evaluated" just means "looked at." Moreover, these phrases are concentrated around the section for the unevaluated* keywords.

I think that we're good with this new interpretation without any changes. If anything, I'd just want further clarification around stopping annotations at the schema boundary and retaining all annotations within the context of a subschema (regardless of pass/fail of a particular keyword) until that subschema has completed evaluation.

@jdesrosiers
Copy link
Member

@gregsdennis

There are, however, many places where the spec explicitly states "successfully evaluated" or "evaluated successfully," implying that "evaluated" just means "looked at."

I don't know how you come to this conclusion. If "evaluated" just meant "looked at", why would it need to say "successfully"? "Evaluate" is used as a more generic term for "validate". I can say that with authority because I came up with that word. So, you should read it as "successfully evaluated" and "validated successfully".

I'm surprised this point is controversial. It was discussed in depth. It was a counterintuitive concept, but Henry successfully convinced almost everyone that that was how unevaluated* should be defined.

@gregsdennis
Copy link
Member

gregsdennis commented Feb 28, 2022

That may be, but it doesn't change that the spec is full of places where it explicitly states successful or unsuccessful evaluation/validation. Reading the spec, it's not defined with a successful outcome as the default.

If "evaluated" just meant "looked at", why would it need to say "successfully"?

Because if that was what "evaluated' meant, the spec would need to specify the result, which it does. If "evaluated" means "successfully validated," then "evaluated successfully" is redundant.

Regardless, I think this is tangential given the definitions I listed above.

@gregsdennis
Copy link
Member

gregsdennis commented Mar 3, 2022

I updated my library (still haven't published yet) to always add annotations at the keyword level, whether the keyword itself passes or fails. It made no difference to the overall outcome: I still passed the test suite.

It did, however, have an impact on the set of errors I returned. additional* and unevaluated* no longer give extra errors in the direct case (with out the $ref) from @karenetheridge's comment.

This is the error set that my users expect to see.

The superfluous unevaluatedProperties error is still present for the schema that defines a property within a $ref, as predicted.

@karenetheridge
Copy link
Member

karenetheridge commented Mar 14, 2022

On the call, @jdesrosiers suggested the minimum we could do right now for the patch release is to acknowledge and accept that two approaches and results are allowable.

I'm a bit confused by this, because I thought in this thread that we brought clarity to what is being said in the specification (and more than one of us in this thread were wrong with their original understanding of what it said).

Specifically, what we found is this:

  • I (ether) originally thought that there was a discrepancy in behaviour if one implemented additionalProperties in one of twi dfiferent ways: using annotations (similarly to unevaluatedProperties) or by simply looking at sibling keywords and re-applying the same matching rules for properties and patternProperties. I wrote this issue up to propose a way of resolving this discrepancy.

  • Through discussion, we discovered that these two different methods of implementation are in fact identical, in both valid/invalid behaviour and annotation behaviour, given the reading that 1. properties, additionalProperties (etc) produce annotations on all properties whose subschemas were evaluated (regardless of valid/invalid results), and 2. applicator keywords always propagate their subschemas' annotations to sibling keywords, even if the applicator keyword itself produced a false validation result. (e.g. a properties keyword with multiple matching properties, some of which produced valid results with annotations and one of which produced an invalid result with an error: the properties keyword itself will generate an error "not all properties matched", but the errors and annotations from subschemas will continue to be visible for the duration of the containing schema's evaluation.

Therefore the original premise for this (#1172) issue is moot, and we can either do nothing, or insert some clarifying wording in the documentation regarding when annotations are propagated/not propagated to containing schemas, and specifically pointing out that applicator keywords can produce/propagate both errors and annotations at the same time.

@jdesrosiers
Copy link
Member

I'm wondering if the entire concept of annotations should be revised to something different

This is the direction I'd like to go. In which case, none of this matters.

@gregsdennis
Copy link
Member

I'm happy to do that, too, but not in a patch to this version of the spec. Let's open a new issue to discuss that for the next one.

@Relequestual
Copy link
Member

On the call, @jdesrosiers suggested the minimum we could do right now for the patch release is to acknowledge and accept that two approaches and results are allowable.

I'm a bit confused by this, because I thought in this thread that we brought clarity to what is being said in the specification (and more than one of us in this thread were wrong with their original understanding of what it said).
...
Therefore the original premise for this (#1172) issue is moot, and we can either do nothing, or insert some clarifying wording in the documentation regarding when annotations are propagated/not propagated to containing schemas, and specifically pointing out that applicator keywords can produce/propagate both errors and annotations at the same time.
- @karenetheridge

I mostly agree.

I suggested an alternate interpritation to understanding annotation propagation. However I'm not convinced that if we DO take that interpritation, that we won't be inadvertintly creating problems elsewhere.

We can ask at least two implementations (@karenetheridge and @gregsdennis 's) to check in terms of validaiton. We can also ask to confirm if there are implications for the annotation and output format results (aka, does running the test suite result in different output based on the changed understanding).

But, what concerns me is that we don't have an output test suite. It's possible there are changes in annotations and output that we wouldn't see by running the validaiton test suite alone. I'm caucious about allowing through a change when I don't feel we can truly assess the impact.

Given we plan to re-evaluate the annotations process as a whole for the next version, it feels a little counter producitve to spend cycles to allow us to confidently make a change which we plan to make moot soon after.

...we can either do nothing

Indeed. Nothing in terms of the spec, but we must at a minimum add a CREF acknowledging the problem.

...or insert some clarifying wording in the documentation regarding when annotations are propagated/not propagated to containing schemas

...If we choose to accept the new interpritation of annotation propogation. As I said, I don't feel we can fully evaluate the impact. (I'd love to be convinced I'm wrong about this).

and specifically pointing out that applicator keywords can produce/propagate both errors and annotations at the same time

Again, if we choose to accept the new interpritation.

As earlier...

...The minimum we could do right now for the patch release is to acknowledge and accept that two approaches and results are allowable - @Relequestual quoting @jdesrosiers

We can do this within a CREF. Doing so would explicitly not change the specification. You, and others, would be free to make an interpritation of your choosing regards to annotation propagation because the spec is lacking (Hence why we're still on this issue 😅).


As an aside: Looking at items, the reading supports the new interpritation of "evaluated". I don't know how we got so lost on this.

If the "items" subschema is applied to any positions within the instance array, it produces an annotation result of boolean true, indicating that all remaining array elements have been evaluated against this keyword's subschema. - https://json-schema.org/draft/2020-12/json-schema-core.html#rfc.section.10.3.1.2

"...subschema is applied... produces an annotation result... indicating all items have been evaluated against items"
The more I re-read the spec, the more I'm sure this was the intent, and we somehow overloaded "evaluated" to mean successfully.

My concern remains though, that as we missed the intent, we may have allowed changes which are impacted by the different interpritations.

@Relequestual
Copy link
Member

After discussions on Slack...

I'm happy with Jason's suggestion to revisit annotations, but in the next version. I don't think we'll reach consensus on editing the current version, so the only option is to leave as is and allow implementors to follow whichever interpretation they choose. - @gregsdennis

I agree, we were not reading the spec carefully enough before. I can live with just adding a CREF I guess - @karenetheridge

As such, I'll move forward with a PR to do just that (adding a CREF) looking to resolve this issue.

Relequestual added a commit to Relequestual/json-schema-spec that referenced this issue Mar 30, 2022
Fixes json-schema-org#1172
Must see new issue relating to the behaviour of annotation collection for resolving in the next draft.
@gregsdennis
Copy link
Member

We can ask at least two implementations (@karenetheridge and @gregsdennis 's) to check in terms of validaiton. We can also ask to confirm if there are implications for the annotation and output format results (aka, does running the test suite result in different output based on the changed understanding).

JsonSchema.Net (testable on https://json-everything.net/json-schema) has been updated to use the new interpretation.

I had four separate issues opened with people requesting this adjustment, but I had declined saying that keywords that failed validation don't pass along annotations. This reinterpretation has allowed me to fulfill their requests.

Relequestual added a commit to Relequestual/json-schema-spec that referenced this issue Apr 14, 2022
Fixes json-schema-org#1172
Must see new issue relating to the behaviour of annotation collection for resolving in the next draft.
Relequestual added a commit that referenced this issue Apr 14, 2022
Fixes #1172
Must see new issue relating to the behaviour of annotation collection for resolving in the next draft.
@Relequestual
Copy link
Member

Given that:

  • This issue has resulted in a PR merged which answers the question
  • We accept more review of the annotation system is needed
  • We shouldn't do said review as part of the 2020-12 patch release
  • There's a new issue related to this issue to review annotations for the next release

I'll close this issue as resolved.
New issue for tracking these concerns: #1215
Mentioned PR: #1203

Relequestual added a commit that referenced this issue Jun 16, 2022
Fixes #1172
Must see new issue relating to the behaviour of annotation collection for resolving in the next draft.
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Nov 16, 2023
This cuts down on the number of errors when `UnevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Nov 28, 2023
This cuts down on the number of errors when `UnevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Dec 10, 2023
This cuts down on the number of errors when `unevaluatedProperties`
fails.

Related:
- #157
- json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Dec 11, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Dec 11, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Dec 16, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Dec 16, 2023
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this issue Feb 25, 2024
This uses all adjacent (same schema) results to calculate unevaluated
keys/items, whether the keywords validated successfully or not.
Previously, it only considered valid results, which causes confusing
errors:

```ruby
schemer = JSONSchemer.schema({
  'properties' => {
    'x' => {
      'type' => 'integer'
    }
  },
  'unevaluatedProperties' => false
})
schemer.validate({ 'x' => 'invalid' }).map { _1.fetch_values('schema_pointer', 'error') }
 # =>
 # [["/properties/x", "value at `/x` is not an integer"],
 #  ["/unevaluatedProperties", "value at `/x` does not match schema"]]
```

The overall validation result shouldn't be affected, since the only
additional keywords that it considers are failed ones (meaning the
entire schema fails regardless of the unevaluated keys/items).
Duplicate/unhelpful error messages are reduced, though, which is the
main reason for making this change.

Generally, this interpretation doesn't align with my reading of the
spec, but there's been a lot of [discussion][0] around it and I think it
makes sense from a user experience perspective. Hopefully it will get
clarified in a future draft.

Closes: #157

Related:
- json-schema-org/json-schema-spec#1172
- https://github.com/orgs/json-schema-org/discussions/67
- https://github.com/orgs/json-schema-org/discussions/57
- https://github.com/json-schema-org/json-schema-spec/blob/2cb7c7447f9b795c9940710bf0eda966a92c937f/adr/2022-04-08-cref-for-ambiguity-and-fix-later-gh-spec-issue-1172.md

[0]: json-schema-org/json-schema-spec#1172
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants