better support for decimals encoded as strings #45
Comments
@faassen I think this would be best handled as a format. Note that there is a proposal for this.
@faassen I would be happy to add this to the next draft if you (or someone familiar with the issues involved) can either write up a PR or cover exactly what is needed here. The main thing that comes to mind is whether you need to be able to support the maximum digits concept, or if it's sufficient to just state that applications respecting the format SHOULD handle such values as fixed-point numbers, with whatever precision is apparent in the string form.
This is a use case we have, so here is sort of a summary of what I'd expect/want. A few things that are hopefully not so controversial:
I come across two broad categories of fixed point. One is common in financial software and some databases and involves storage as strings, or as an integer multiplied up by a fixed power of ten.

Another type uses a fixed number of bits with the point at a given fixed bit position. For example, you may use 16-bit unsigned storage with 3 bits after the point (giving 13 bits before it). This representation is often used in games because the bit shifts are much faster than the divisions needed for working in base 10. Because these values are always exact in binary, they should be safe enough in a numeric field in JSON, at least so long as they fit in the roughly 53 bits of integer precision a double provides.

Both of these can be multiplied up by a fixed constant to make them integers, but for various reasons this is not typically attractive for applications to do. Transporting the numbers in strings tends to be well supported.

I'll talk to our teams who use json-schema for validation and see what would have helped them for it.
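The binary fixed-point scheme described above can be sketched in Python; the helper names and the 3-bit split are illustrative, matching the example in the comment rather than any real library:

```python
# Sketch of binary fixed-point as described above: 16-bit unsigned storage
# with 3 fractional bits (13 integer bits). Helper names are hypothetical.
FRAC_BITS = 3
SCALE = 1 << FRAC_BITS  # multiplying by 8 via a bit shift

def to_fixed(x: float) -> int:
    """Encode a value into 16-bit unsigned fixed point."""
    raw = round(x * SCALE)
    if not 0 <= raw < (1 << 16):
        raise ValueError("out of range for 16-bit unsigned fixed point")
    return raw

def from_fixed(raw: int) -> float:
    """Decode back; the value is exact in binary, so this round-trips."""
    return raw / SCALE

print(to_fixed(2.625))   # 21
print(from_fixed(21))    # 2.625
```

Because both encode and decode are exact powers of two, such values survive a trip through a JSON number unchanged, as the comment notes.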
Since my last comment some things have changed. JavaScript can now use other number formats as well, but this requires either a non-native number implementation (https://github.com/GoogleChromeLabs/jsbi) or Google Chrome 67+, Opera 54+, or Node.js v10.4+ (Firefox support is coming too, as soon as they are able to figure out https://bugzilla.mozilla.org/show_bug.cgi?id=1366287).
Many JSON serializers (e.g., C#, Java/Kotlin, Swift, JS, etc.) already serialize their respective decimal types to numbers, and currently parse numbers as either some form of integer or floating-point type. Some JSON deserializers already support deserializing their decimal types out of the box. Moreover, several (e.g., C#, Kotlin) also allow numbers to be "parsed" as strings as part of the deserialization process, making it somewhat trivial to deserialize directly to a base-10/decimal number. For all these reasons, having decimal as a format for number, rather than string, seems like the better alternative.
Precision and/or scale are not part of the JSON spec, so different JavaScript engines will behave differently when working with decimals; some will truncate or round. Hence, I suggest that the json-schema spec use strings, which are more reliable in this case, while also introducing a new dynamic format value.
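As a rough illustration of what a string-based decimal format could check, here is a stdlib-only Python sketch; the function name and keyword parameters (`minimum`, `maximum`, `max_digits`) are hypothetical, not part of any spec or implementation:

```python
from decimal import Decimal, InvalidOperation

def validate_decimal_string(value, minimum=None, maximum=None, max_digits=None):
    """Hypothetical check for a string-encoded decimal 'format'."""
    try:
        d = Decimal(value)
    except InvalidOperation:
        return False
    if not d.is_finite():  # Decimal("NaN") parses without raising
        return False
    digits = len(d.as_tuple().digits)  # total significant digits
    if max_digits is not None and digits > max_digits:
        return False
    if minimum is not None and d < Decimal(minimum):
        return False
    if maximum is not None and d > Decimal(maximum):
        return False
    return True

print(validate_decimal_string("12.50", minimum="0", maximum="100", max_digits=4))  # True
print(validate_decimal_string("0.123456", max_digits=4))                           # False
```

Note that the bounds are themselves passed as strings here, so the comparison never touches binary floating point.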
(This issue seems like the best place to comment, given that closed issues point here.) I would expect a way to represent a string that contains int64 data, for example: "1587627537231057".
I don't know the best way to represent a field that is a string containing int64 data. I'd be interested in proposing this format value if that is required. Related issue: googleapis/google-cloudevents-go#39
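One way such an int64-in-a-string check could work, sketched in Python (the function name and regex are illustrative, not an existing format definition): a pattern alone can restrict the shape, but the 64-bit range check still needs to parse the value.

```python
import re

INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1

def is_int64_string(value: str) -> bool:
    """Hypothetical check for a string holding an int64 value."""
    if not re.fullmatch(r"-?\d+", value):
        return False
    return INT64_MIN <= int(value) <= INT64_MAX

print(is_int64_string("1587627537231057"))      # True
print(is_int64_string("99999999999999999999"))  # False: exceeds int64
print(is_int64_string("12.5"))                  # False: not an integer string
```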
The preferred way to solve this kind of problem in 2019-09 and later is now extension vocabularies (not format).
Representing any numeric value as a string is a hack to get around poor language support. JSON is specifically designed to support arbitrarily large and precise numeric values. There should be no need to put a number in a string. At a previous job, someone did a language survey of support in this area. The results showed that the vast majority of parsers encounter a number and automatically parse it as an IEEE 64-bit floating-point type (e.g. C#'s double).
This isn't (exactly) right according to our findings. Most parsers go directly to IEEE 64-bit and then cast to the target type. The correct approach would be to parse the original token directly into the desired numeric type.
To get around this shortcoming, developers started storing numbers as strings so that the original representation wouldn't be lost, and the appropriate numeric type can be parsed when it's needed. However, for reasons explained above, this caused other problems.
@gregsdennis while it is something of a hack, it is common in areas like finance where exact precision is important and even IEEE can't guarantee you won't end up with weird behavior. It's the only approach that guarantees predictable behavior, sadly.

As for parsers, whatever non-standard behavior some JSON parsers might offer is definitely outside the scope of JSON Schema. We don't include a decimal type in our data model (and shouldn't, as we need to stick close to JSON's data model).
JSON Schema is not the problem, but that doesn't mean it couldn't offer a solution. I think what is really happening is that implementations use the JSON string type to embed additional types. What people are asking for is to apply validation directly to that embedded type, not to its string representation.
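The idea of validating the embedded value rather than its string form could look roughly like this in Python. The keyword names deliberately mirror JSON Schema's numeric keywords, but this is an illustrative sketch, not an implemented extension:

```python
from decimal import Decimal, InvalidOperation

def validate_embedded_decimal(value, schema):
    """Parse the string as a decimal, then apply numeric keywords to the
    parsed value instead of the string representation."""
    try:
        d = Decimal(value)
    except InvalidOperation:
        return False
    if "minimum" in schema and d < Decimal(str(schema["minimum"])):
        return False
    if "maximum" in schema and d > Decimal(str(schema["maximum"])):
        return False
    if "multipleOf" in schema and d % Decimal(str(schema["multipleOf"])) != 0:
        return False
    return True

print(validate_embedded_decimal("19.99", {"minimum": 0, "multipleOf": "0.01"}))  # True
print(validate_embedded_decimal("19.995", {"multipleOf": "0.01"}))               # False
```

Because the arithmetic happens in Decimal, keywords like multipleOf behave predictably, which is exactly where binary floating point causes trouble.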
A common way to represent decimals in JSON is to serialize them as a string. This side-steps floating-point precision issues during transport and validation, such as those mentioned in json-schema-org/json-schema-spec#312. A deserializer can eventually transform the string into a language-specific decimal type such as Python's Decimal.
See for instance DecimalField in Django REST Framework
http://www.django-rest-framework.org/api-guide/fields/#decimalfield
or the Marshmallow serialization library:
https://marshmallow.readthedocs.io/en/latest/api_reference.html#marshmallow.fields.Decimal
which includes extra notes on how to handle such precision issues.
This case isn't very well supported by JSON Schema. multipleOf is insufficient, as there are no guarantees around the representation of numbers in JSON.
You can define a pattern with a regex that restricts string input to decimals, but the implementer needs to create this regex, the error messages aren't very pretty, it's hard to restrict the input to a total number of digits, and it's not possible to use minimum or maximum.
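For illustration, a hand-written regex of the kind described above can limit digit counts but still cannot bound the numeric value. The pattern here is a made-up example allowing up to 4 integer digits and 2 fraction digits:

```python
import re

# Hypothetical hand-rolled pattern: up to 4 integer digits, optional
# fraction of up to 2 digits. It bounds digit counts, not numeric range.
DECIMAL_RE = re.compile(r"^-?\d{1,4}(\.\d{1,2})?$")

print(bool(DECIMAL_RE.match("123.45")))   # True
print(bool(DECIMAL_RE.match("12345.6")))  # False: too many integer digits
print(bool(DECIMAL_RE.match("150.00")))   # True, even if we wanted maximum 100
```

The last case shows the gap: no pattern can cleanly express "maximum 100", which is why minimum/maximum support for decimals-as-strings keeps coming up.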
Should JSON Schema support this use case? Similar to how "number" and "integer" deal with the same underlying JSON type, we could have a "decimal" type that validates strings specifically and allows the minimum & maximum logic. It would make the implementation of validators more complex in languages that don't have a decimal/fixed-point type, but it does seem to be a common use case.
(I myself ran into this when writing code that converts a Django REST Framework serializer DecimalField into a JSON Schema representation, but it's not really possible to support max_digits or min_value/max_value.)