This issue was moved to a discussion.
You can continue the conversation there.
Add decimal place constraint to number fields #641
@ezwelty Thanks for reporting. If I understand correctly, are you trying to constrain the raw data structure or the resulting number?
@rufuspollock Fundamentally, the number as it is stored in the text file. Once it is read in, the concept of decimal places may be lost (for example, "1.10" could become 1.1 with no knowledge that it was parsed from "1.10"). The original dataset designers wanted to ensure that even if data (e.g. GPS coordinates) were submitted by contributors with absurd numbers of decimal places, they were published rounded to a reasonable number of decimal places.
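A minimal sketch of the information loss described above: once a raw string like "1.10" is cast to a float, nothing records that the source had two decimal places, whereas `decimal.Decimal` preserves the lexical exponent. (Illustrative only; not taken from any Table Schema implementation.)

```python
# "1.10" cast to float loses the trailing zero; Decimal keeps the
# original exponent, so decimal places are only recoverable from the
# raw string (or a Decimal built from it) before casting.
from decimal import Decimal

raw = "1.10"
print(float(raw))                         # 1.1 -- trailing zero is gone
print(Decimal(raw))                       # 1.10 -- lexical form preserved
print(-Decimal(raw).as_tuple().exponent)  # 2 -- decimal places recoverable
```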
OK, clear: you want constraints applied to the parsed input before casting. Hmm, that seems like it would need something new, right? Do you have a suggestion on this that is generic?
The simplest I can think of is to allow the `pattern` constraint to apply to the raw (pre-cast) field values. So for example, percentages with up to one decimal place (e.g. "95.2%" and "95%"):

```json
{
  "type": "number",
  "decimalChar": ".",
  "bareNumber": false,
  "constraints": {
    "minimum": 0
  }
}
```

could be more specifically constrained by:

```json
{
  "type": "number",
  "constraints": {
    "pattern": "^[0-9]+(\\.[0-9]{1})?%$"
  }
}
```

and integer geopoints in the eastern hemisphere stored as a string parsable as a JSON array (e.g. "[90, 45]"):

```json
{
  "type": "geopoint",
  "format": "array"
}
```

could be more specifically constrained by:

```json
{
  "type": "geopoint",
  "format": "array",
  "constraints": {
    "pattern": "^\\[[0-9]{1,3}, \\-?[0-9]{1,2}\\]$"
  }
}
```

The one caveat I can think of – the one brought up by @pwalsh in #428 – is how to deal with JSON data stored as JSON. I presume that JSON values, arrays, and objects could be either read as raw strings or converted back to strings for pattern testing? JSON objects quickly get unwieldy for pattern testing, but it can still be done. For example, integer geopoints in the eastern hemisphere stored as a JSON object:

```json
{
  "type": "geopoint",
  "format": "object",
  "constraints": {
    "pattern": "^\\{\"lon\":\\s*[0-9]{1,3},\\s*\"lat\":\\s*\\-?[0-9]{1,2}\\}$"
  }
}
```
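The pattern constraints above can be exercised directly with Python's `re` module; a quick sketch (note the geopoint-object pattern is written here with a single `[` before `0-9`, correcting a doubled bracket):

```python
# Exercise the three proposed raw-value patterns against sample inputs.
import re

percent = re.compile(r"^[0-9]+(\.[0-9]{1})?%$")
geo_array = re.compile(r"^\[[0-9]{1,3}, -?[0-9]{1,2}\]$")
geo_object = re.compile(r'^\{"lon":\s*[0-9]{1,3},\s*"lat":\s*-?[0-9]{1,2}\}$')

assert percent.match("95.2%") and percent.match("95%")
assert not percent.match("95.25%")   # more than one decimal place rejected
assert geo_array.match("[90, 45]")
assert geo_object.match('{"lon": 90, "lat": 45}')
```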
As discussed in #879, I think this is important, but using regex on numbers seems very wrong.
I am rewriting and publishing an existing dataset as a Data Package (https://gitlab.com/ezwelty/glathida), and it includes decimal place limits on several numeric fields. For now, I have enforced this using a `pattern` constraint. Unfortunately, this practice violates the schema, which currently insists that `pattern` only apply to post-cast values of `string` fields (#428). I understand the complexity that is avoided by that decision, but also regret the huge potential for specificity that is lost. I wish `pattern` were applied to the field values as stored in the text file (CSV, JSON, or otherwise). Otherwise, I see no option other than adding a specific decimal place constraint for `number` fields.