validation: format[Exclusive](Minimum|Maximum) #116

Closed
epoberezkin opened this issue Oct 28, 2016 · 15 comments

Comments

@epoberezkin
Member

Originally added as wiki: https://github.com/json-schema/json-schema/wiki/formatMinimum-%28v5-proposal%29

Proposed keywords

  • formatMinimum
  • formatMaximum
  • formatExclusiveMinimum
  • formatExclusiveMaximum

Purpose

Currently, numeric values can be constrained using minimum/maximum.

However, non-numeric formatted data cannot have such constraints, even when the format has a clear ordering. These additional keywords would allow for minimum/maximum constraints on such data.

Values

The values of formatMinimum/formatMaximum may be any JSON value. Particular values of format may imply a particular shape of data.

The values of formatExclusiveMinimum/formatExclusiveMaximum are booleans.

Validation

Validation is very similar to minimum/maximum/exclusiveMinimum/exclusiveMaximum. The only differences are:

  • All data types are constrained (not just numbers)
  • Instead of numerical ordering, an alternative ordering is used, selected based on the value of format.
  • Validation of these keywords is completely optional, exactly like format. Support for one of these keywords does not imply partial or complete support for any of the others.
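As a rough illustration of the three rules above, here is a minimal Python sketch (Python 3.7+). The `ORDERINGS` registry and the `check_format_bounds` function are hypothetical names invented for this example, not part of any JSON Schema implementation. It assumes the boolean form of the exclusive keywords as proposed here, and it ignores the keywords when the format is unknown, mirroring how `format` itself is optional.

```python
from datetime import date, datetime


def _parse_date_time(value):
    """Map a date-time string to a timezone-aware datetime (the ordering key)."""
    if not isinstance(value, str):
        raise TypeError("date-time values must be strings")
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        raise ValueError("date-time needs a UTC offset")
    return parsed


# Hypothetical registry: format name -> function producing a comparable key.
ORDERINGS = {
    "date-time": _parse_date_time,
    "date": date.fromisoformat,
}


def check_format_bounds(instance, schema):
    """Return False only if the instance violates a format bound."""
    key = ORDERINGS.get(schema.get("format"))
    if key is None:
        return True  # unknown or unordered format: keywords are ignored
    try:
        value = key(instance)
    except (TypeError, ValueError):
        return True  # not a valid instance of the format; left to "format"

    if "formatMinimum" in schema:
        lower = key(schema["formatMinimum"])
        exclusive = schema.get("formatExclusiveMinimum", False)
        if value < lower or (exclusive and value == lower):
            return False
    if "formatMaximum" in schema:
        upper = key(schema["formatMaximum"])
        exclusive = schema.get("formatExclusiveMaximum", False)
        if value > upper or (exclusive and value == upper):
            return False
    return True
```

Keeping the ordering in a per-format registry is one possible design; it means support for a new format only requires one more key function.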

Example

{
    "type": "string",
    "format": "date-time",
    "formatMinimum": "2013-01-01T00:00Z"
}

In this example, the schema specifies that the string should be an ISO 8601 date-time at or after the start of New Year's Day 2013 (UTC).

Note that because ISO 8601 date-times can specify a timezone (and timezone differences should be accounted for in comparisons), there is no dictionary-ordering of strings that would correctly enforce this constraint - this constraint can only be expressed in a format-aware way.
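The point about dictionary ordering can be checked with a short Python sketch (Python 3.7+). The instance value and the `parse_date_time` helper are made up for illustration; the minimum is the one from the example schema.

```python
from datetime import datetime


def parse_date_time(text):
    """Parse an ISO 8601 date-time with an offset into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


minimum = "2013-01-01T00:00Z"
# 23:30 at UTC-5 on New Year's Eve is 04:30 UTC on 2013-01-01.
instance = "2012-12-31T23:30:00-05:00"

# Plain string comparison sees "2012..." < "2013..." and rejects the value.
lexical_ok = instance >= minimum

# Format-aware comparison normalises the offsets first and accepts it.
format_aware_ok = parse_date_time(instance) >= parse_date_time(minimum)

print(lexical_ok, format_aware_ok)  # prints: False True
```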

Concerns

Using format-specific ordering is deliberately designed to avoid the localisation/"collation" issues that would be presented by string orderings, so we dodge a bullet there.

format values mentioned in the spec should also mention an expected ordering, if applicable.

This still does not provide a canonical way to specify localised dictionary orderings for strings. A format pattern like "text-de" might help, but should we specify that, or leave it up to schema authors/consumers to agree on something?

@handrews
Contributor

A date-time-specific version of this was previously filed as #99

@epoberezkin
Member Author

@handrews Yes, I also added a comment in #99. This idea seems more generic.

@handrews
Contributor

From #99, we should also consider formatStep.

@epoberezkin
Member Author

@handrews or formatMultipleOf to be consistent with numbers.

@handrews changed the title from "v6 validation: format[Exclusive](Minimum|Maximum)" to "validation: format[Exclusive](Minimum|Maximum)" on Dec 27, 2016
@handrews modified the milestone: draft-07 (wright-*-02) on May 16, 2017
@handrews
Contributor

Since exclusiveMinimum and exclusiveMaximum have since become numeric rather than booleans, I assume formatExclusiveMinimum and formatExclusiveMaximum would now also be values (of the same sort as described by the format) rather than booleans.

If we agree on that, given that no one has spoken against this in the past ten months, I think we can move on to a PR introducing all of these (including formatMultipleOf). Now is the time to speak up if anyone forgot about this :-)
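For illustration, the value-based form could behave as in the Python sketch below (Python 3.7+), where each of the four keywords carries its own bound of the kind described by the format. The function names are hypothetical and only "date-time" is handled; this is an assumption about how such keywords might work, not an agreed design.

```python
from datetime import datetime


def parse_date_time(text):
    """Parse an ISO 8601 date-time with an offset into an aware datetime."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat().
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    return datetime.fromisoformat(text)


def within_bounds(instance, schema):
    """Check a date-time string against value-style format bounds."""
    value = parse_date_time(instance)
    checks = {
        "formatMinimum": lambda bound: value >= bound,
        "formatExclusiveMinimum": lambda bound: value > bound,
        "formatMaximum": lambda bound: value <= bound,
        "formatExclusiveMaximum": lambda bound: value < bound,
    }
    return all(
        check(parse_date_time(schema[keyword]))
        for keyword, check in checks.items()
        if keyword in schema
    )


schema = {
    "type": "string",
    "format": "date-time",
    "formatExclusiveMinimum": "2013-01-01T00:00:00Z",
}
print(within_bounds("2013-01-01T00:00:00Z", schema))  # prints: False
print(within_bounds("2013-01-01T00:00:01Z", schema))  # prints: True
```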

@epoberezkin
Member Author

@handrews formatMultipleOf for the next release? It complicates formats more than min/max...

@handrews
Contributor

@epoberezkin good point. For "date-time" you would need a time-delta type. I'll write up min/max as a PR and file multipleOf for future consideration.

@handrews
Contributor

Given the amount of work needed on hyper-schema, I am not sure I will have time to write a PR for this. Anyone who wants to see this in draft-07 is encouraged to submit a PR.

@handrews
Contributor

@epoberezkin why do we need new keywords for this at all? Why not require the existing keywords to apply to both numbers and any other type that has a total ordering imposed by a format? Individual formats SHOULD specify their ordering, even if it seems obvious, and SHOULD describe whether and how multipleOf is to be supported.

One thing to consider is what happens if we support numeric formats. For instance, a fixed-point format that applies to numbers, which means that no matter the JSON representation, the in-memory representation should be handled as fixed point. Regular min/max/multOf would behave one way in terms of the JSON representation, but multipleOf would have subtle differences.

It would seem better to me in such a case that format control not only the interpretation of the number itself, but the interpretation of min/max/multOf.

Another possibility is a date that can be represented as either an integer (seconds since the epoch) or a string (ISO format):

{
    "type": ["integer", "string"],
    "format": "date-time"
}

So 1505148455 as an integer and "2017-09-11T16:47:35Z" as a string are equivalent.
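The equivalence of the two representations can be confirmed with a small Python sketch (Python 3.7+). The `to_instant` helper is a hypothetical name; it assumes integers are seconds since the Unix epoch, as described above.

```python
from datetime import datetime, timezone


def to_instant(value):
    """Normalise an epoch-seconds integer or a date-time string to UTC."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python but is not a JSON integer.
        raise TypeError("booleans are not date-times")
    if isinstance(value, int):
        return datetime.fromtimestamp(value, tz=timezone.utc)
    if isinstance(value, str):
        # Older Python versions do not accept a trailing "Z".
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        return datetime.fromisoformat(value).astimezone(timezone.utc)
    raise TypeError("expected an integer or a string")


print(to_instant(1505148455) == to_instant("2017-09-11T16:47:35Z"))  # prints: True
```

Once both representations map to the same key, a single ordering can serve minimum and maximum checks regardless of which JSON type the instance uses.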

min/max/multOf could be done in an RFC 3339 duration format, or as an integer in seconds.

I feel like the fact that the current set of formats only apply to strings has caused us to think about these things as more different than they actually are. It is clear that format is not only intended for strings, or else individual formats would not declare their applicability.

I'm leaning towards punting this out of draft-07 as I think there are some significant alternatives in terms of implementation that need considering.

@epoberezkin
Member Author

I need to think about it. I agree that it is better to remove from draft 7 than to rush.

@kenisteward

@handrews are there any thoughts on putting this in draft-09? I know there hasn't been any discussion here in over a year, but it turns out this would be a valuable feature.

Or would we do this instead with $vocabulary and the future external/internal data references in https://github.com/json-schema-org/json-schema-spec/issues/549?

My current use case: we want dateOfBirth to have a maximum of 18 years ago and a minimum of 1900, and employment start dates to be today or later.

In the meantime, would you recommend implementing this as an integer with numeric min/max, as a string with a pattern, or waiting for this feature?

@handrews
Contributor

@kenisteward I would not wait. It's definitely not going to make it into draft-08, and I don't know exactly what the timeline for draft-09 might be. We may do a small update quickly with small features, but I also plan to re-evaluate a lot of proposals given that $vocabulary and friends make it so much easier to build and share new keywords.

Also, I'm not sure how the concept of "today" would work. That's obviously a useful concept for this, but we haven't addressed relative values ("durations" in RFC 3339 terms) at all yet.
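Until relative values are addressed in the schema language, one workaround is to compute the bounds in application code and check them outside of the schema. The Python sketch below (Python 3.7+) does this for the use case described above; all function names are hypothetical, the dates are assumed to be full-date strings such as "2006-02-28", and "today" is passed in explicitly so the check stays testable.

```python
from datetime import date


def years_ago(today, years):
    """Return the same calendar day `years` earlier."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return today.replace(year=today.year - years, day=28)


def date_of_birth_ok(value, today):
    """Born in 1900 or later, and at least 18 years before `today`."""
    born = date.fromisoformat(value)
    return date(1900, 1, 1) <= born <= years_ago(today, 18)


def start_date_ok(value, today):
    """Employment start date is today or in the future."""
    return date.fromisoformat(value) >= today


today = date(2024, 2, 29)
print(date_of_birth_ok("2006-02-28", today))  # prints: True
print(date_of_birth_ok("2006-03-01", today))  # prints: False
print(start_date_ok("2024-02-28", today))     # prints: False
```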

@handrews
Contributor

I would now recommend using modular vocabularies to accomplish this.

I am, in fact, hoping that format gets deprecated in favor of a suite of vocabularies (e.g. why does the same keyword handle email addresses and date-time strings? The level of difficulty and related concerns such as ordering are completely different).

I'm going to close this as I feel it no longer fits the current direction of JSON Schema w.r.t. format.

@Vampire

Vampire commented Jun 30, 2020

So does the label Status: Available apply here?
As far as I understand, this is not going to be implemented, or is there another way?
@handrews can you maybe either link to the appropriate doc or give an example here, e.g. how to define a minimum date?

@handrews
Contributor

handrews commented Jul 1, 2020

@Vampire no, sorry, forgot to remove that when I closed it.

We have added support for modular, re-usable vocabularies in the most recent draft. That is, you can declare that you're using a vocabulary of keywords, and an implementation can load a plugin to support them.

This is the sort of thing we hope to see done in new vocabularies, such as a vocabulary focusing on date-time keywords (we want to dump format because it's a mess, but we won't do so unless the community clearly favors an alternative).

We're encouraging people to use https://github.com/json-schema-org/json-schema-vocabularies/issues as a holding ground for ideas. I think there are a few date-time issues filed there, not sure if this is covered or not. Feel free to file it. We (the JSON Schema project proper) will not be working on those, but we will hopefully be able to help folks looking for ideas to implement find the ideas that are needed by JSON Schema users.


5 participants