Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v6 validation: propertyNames #70

Closed
epoberezkin opened this issue Oct 3, 2016 · 22 comments
Closed

v6 validation: propertyNames #70

epoberezkin opened this issue Oct 3, 2016 · 22 comments

Comments

@epoberezkin
Copy link
Member

epoberezkin commented Oct 3, 2016

Definition

The value of this keyword is a JSON-schema. The keyword only applies to objects. If the instance is not an object, validation succeeds. For the instance to be valid all property names of the object should be valid according to this schema. Possible variant: only apply it to property names not explicitly mentioned in properties keyword - there are pros and cons for it.

Motivation

Generally the current spec is more suited for the data stored in values with static keys. The only exception to that is patternProperties.

There are two disadvantages of patternProperties:

  1. You cannot define any other restriction apart from, well, pattern
  2. Even if pattern is sufficient but you need to use the same property names pattern in multiple objects (but with different schemas for values, so you can't use $ref), you have to repeat the same pattern over and over again. It is difficult to maintain and to ensure they are the same.

propertyNames addresses both issues. So instead of having:

{
  "patternProperties": {
    "^(\\d{1,3}\\.){3}(\\d{1,3})$": { "$ref": "hostData.json" }
  },
  "additionalProperties": false
}

you can have:

{
  "propertyNames": { "$ref": "#/definitions/host" },
  "additionalProperties": { "$ref": "hostData.json" },
  "definitions": {
    "host": { "pattern": "^(\\d{1,3}\\.){3}(\\d{1,3})$" }
  }
}

or even

{
  "propertyNames": { "$ref": "#/definitions/host" },
  "additionalProperties": { "$ref": "hostData.json" },
  "definitions": {
    "host": {
      "anyOf": [
        { "format": "hostname" },
        { "format": "ipv4" }
      ]
    }
  }
}

You can see how using propertyNames allows to have the schema for property names that is:

  1. more advanced than simple pattern matching
  2. can be reused and updated in a single place.

We use this keyword a lot, it is defined as a custom keyword for Ajv. It seems to be so useful and simple that maybe it's worth considering adding it to the spec.

@erosb
Copy link

erosb commented Oct 3, 2016

I'm a bit confused about this proposal. I don't see the propertyNames keyword used in any of the examples you provided.
But I can see the additionalProperties , which already exists, and means exactly what you propose.

@epoberezkin
Copy link
Member Author

epoberezkin commented Oct 3, 2016

@erosb it's all there. The second and the third examples have this keyword. It's value is the schema { "$ref": "#/definitions/host" } in both cases. additionalProperties validates property values. propertyNames validates property names themselves (the keys as the strings rather than values).

E.g. in this object:

{
  "192.168.0.1": { "interval": "15s" },
  "8.8.8.8": { "interval": "86400s" }
}

propertyNames would validate the strings "192.168.0.1" and "8.8.8.8"; additionalProperties would validate the objects { "interval": "15s" } and { "interval": "86400s" }.

I hope it makes sense.

@erosb
Copy link

erosb commented Oct 4, 2016

Ohh sorry I'm blind.
+1 for this proposal.

@awwright
Copy link
Member

awwright commented Oct 8, 2016

I had a property that did exactly this in one of my software applications, for the purpose of UI generation.

Any property that deals with strings would be acceptable here... "minLength", "maxLength", and maybe even "links" and "readOnly" and similar hyper-schema things.

@handrews
Copy link
Contributor

handrews commented Oct 8, 2016

OK so "propertyNames" validates names, but does not impose any schema on the names that validate against propertyNames? The schemas all still need to be supplied in one of "properties", "patternProperties", or "additionalProperties"?

What if I want entries keyed by IPv4, IPv6, and hostname to have slightly different schemas?

@dmikov
Copy link

dmikov commented Oct 9, 2016

+1
The other disadvantage of pattern properties (and this might be more of an implementation) is that they are evaluated first. So if you have 5 properties and 1 pattern property - unless you specify all 5 as an exclusion the pattern property will catch them and will try to impose it's schema.

@handrews
Copy link
Contributor

handrews commented Oct 9, 2016

@dmikov

The other disadvantage of pattern properties (and this might be more of an implementation) is that they are evaluated first.

This is not quite correct, in that there is no order to the validation against all of the matching schemas. If a property name is both present in properties and matches a regex in patternProperties, that is treated as if those two child schemas were collected in an allOf. Only additionalProperties is ever ignored if the property already matched somewhere else.

@dmikov
Copy link

dmikov commented Oct 9, 2016

@handrews I don't think it's right at all. How you can collect string and object in the allOf? Makes no logical sense. Plus that's not what happening anyway, the validators will be happy if I put the object instead of string. So it's ONLY what pattern property defines.

@handrews
Copy link
Contributor

@dmikov The only point I was tying to make was that the validation is order-independent, just like with an allOf. In other words, I just meant that given a schema like:

{
    "type": "object",
    "properties": {
        "foo": {"type": "number"},
        "fee": {"type": "string", "pattern": "^z"},
    },
    "patternProperties": {
        "^f": {"type": "string", "pattern": "bar$"}
    }
}```

Then if the instance has a property `foo` then validation of foo's value will always fail, because this is _effectively equivalent to_ validating foo's value against the following:

```JSON
{
    "allOf": [
        {"type": "number"},
        {"type": "string", "pattern": "bar$"}
    ]
}

which is an impossible schema because no value can be both a number and a string.
NOTE: I am not saying that that's actually how any implementation can or should do it, just in terms of schema algebra (for lack of a better term), those have the same effect.

On the other hand, if the instance has a property "fee", then it would effectively be the same as validating fee's value against:

{
    "allOf": [
        {"type": "string", "pattern": "^z"},
        {"type": "string", "pattern": "bar$"},
    ]
}

So an instance of {"fee": "z_bar"} would validate, while neither {"fee": "z_other"} nor {"fee": "x_bar"} would validate.

Does that make more sense?

@dmikov
Copy link

dmikov commented Oct 10, 2016

I am aware of how it currently acts (wasn't aware that it's both, but the fact that it didn't use static definition if regex matched was pretty obvious), that's why I put it here as another reason to have propertyNames. Current implementation of dictionary of objects in json schema is just to prone to errors and awkward now. That's what I was saying.

@epoberezkin
Copy link
Member Author

epoberezkin commented Oct 15, 2016

@dmikov properties/patternProperties behaviour is another issue that you've submitted.

What if I want entries keyed by IPv4, IPv6, and hostname to have slightly different schemas?

I thought about it and it's very easy to construct the keyword that would do it. I don't think we should cater for this use case for two reasons:

  • the keyword will be verbose and difficult to reason about
  • it encourages bad data design where you mix values of different "types" in the same object.

Mixing different things is fine when the keys are static, your data is essentially like a db "record". If you treat your object as a map where both keys and values are data it's generally a bad idea to mix something else in the same object. For the same reason, I am not that concerned about properties/patternProperties issue - I don't remember ever using multiple patterns in the same object, leave alone mixing patternProperties with properties. A much better design is to have a separate object with a single pattern (or schema in case of propertyNames) for the keys. But each to its own I guess...

@epoberezkin
Copy link
Member Author

epoberezkin commented Oct 16, 2016

So if you have 5 properties and 1 pattern property - unless you specify all 5 as an exclusion the pattern property will catch them and will try to impose it's schema.

@dmikov The question is why would you have such data structure? It seems a better design to have this variable key as a value for another static property - i.e. have 7 static properties instead of 5 static properties and 1 pattern property. It's easier to both validate and to process.

@Relequestual
Copy link
Member

Relequestual commented Nov 1, 2016

This proposal has my support =]

@handrews
Copy link
Contributor

handrews commented Nov 4, 2016

@epoberezkin while I don't think JSON Schema should be all that opinionated about data design, at this point I'm on board with this proposal as it stands. I've got pull request #111 outstanding that impacts how all of the object validation keywords are described so you might want to wait until that is resolved before doing a pull request for this.

@Relequestual
Copy link
Member

#111 has been merged

@epoberezkin
Copy link
Member Author

epoberezkin commented Nov 8, 2016

If we want to support more complex rules about schemas for keys leading to different data validation schemas we can extend properties keyword to accept array like:

{
  "properties": [
    { "name": { "$ref": "schema1" }, "value": { "$ref": "schema2" } }
  ]
}

It would mean that if the property name is valid according to the schema1, the value of the property must be valid according to schema2.

But I won't push it :)

@Relequestual
Copy link
Member

I've had a few people on irc the last week or so ask how to do this exact thing. I directed them to pattern properties.

Looks like consensus is we should do this. Awaiting RP.

@epoberezkin
Copy link
Member Author

@Relequestual do what :)? propertyNames or extend properties as above? or both?

@handrews
Copy link
Contributor

handrews commented Nov 21, 2016

@epoberezkin not speaking for @Relequestual but since your extension of properties was a last-minute comment and you said "but I won't push it", I'm going to go out on a limb and say he meant "propertyNames". That's the subject of this issue, anyway.

If you want to pursue extending "properties", I suggest a separate issue to keep things clear.

@awwright
Copy link
Member

@epoberezkin @Relequestual et al., Does #172 seem to have the features you're looking for?

@handrews
Copy link
Contributor

handrews commented Jan 5, 2017

@epoberezkin it's been over a month, I'm assuming you would have said something by now if you thought this needed anything more than PR #172. Obviously please re-open if we missed something.

@handrews handrews closed this as completed Jan 5, 2017
@epoberezkin
Copy link
Member Author

@handrews thank you. I have some ideas :) but it's another issue and after draft 6 is published. This is good as is indeed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants