Eliminate template pre-processing #142
Paging known hyper-schema users @jdesrosiers, @slurmulon, @Anthropic :-)
Forgot to add: this approach would apply to any current template keyword, so there would be a |
@awwright : Note that if jumping straight to this approach is too much for folks, it's entirely possible to support both templating and this mapping approach. We just declare that template processing happens first, and then the mapping rules apply (which only makes any difference if I personally have no intention of implementing template processing in my tools, because it's extremely complicated for very little benefit. But if the main concern is not changing things too quickly or drastically, that would be an option. |
Updating with thoughts from an IRC discussion with @awwright :
Relative pointer alternative: Dotted URI Template variables
@awwright has expressed doubts about Relative JSON Pointer, and indicated a preference for using dot-separated URI Template variable names to reference nested properties and array elements. Here are my thoughts. Warning: I'm not even pretending to be objective here :-)
While I'm not specifically attached to Relative JSON Pointers, I don't see how dot-separated variables can come close to their functionality. The URI Template specification, RFC 6570, defines variables as:
varname = varchar *( ["."] varchar )
varchar = ALPHA / DIGIT / "_" / pct-encoded
Pros of dot-separated template variables
Cons of dot-separated template variables
To me, the last two are deal-breakers. It's not clear to me how dotted variables could be defined to handle those: it would likely have to involve more preprocessing of the kind we are trying to move away from. Or it would require an additional separate solution, and I haven't (yet) heard a good reason why we should have two approaches to complex instance data referencing when there is a very simple and consistent approach that has been successfully implemented "in the wild".
@handrews we get away for the most part with dot and bracket notation, as we process any key internally into an array-based key anyway. The objectpath lib can transform any a.b[c].d into the key ['a', 'b', c, 'd'] and back to a['b'][c]['d'] for use in JS. Potentially some form of functional symbolism may be easier to work with, e.g. "parent(3).x[path(parent(3).y.ref.id)].z", for processing with regexes into something interpretable for each language, while still offering simplicity for simple keys, e.g. "parent(2).array[3].property". parent/self would only be accepted as the first element in a path, and path can identify a key for an index using another path. I don't have a strong opinion on the end format, so long as it is easy to translate and covers every possible scenario in the simplest way possible ;)
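As a rough illustration of the array-based key idea in the comment above, here is a minimal Python sketch. It is not the objectpath library's API: the function names parse_path, to_bracket and to_json_pointer are hypothetical, and the sketch assumes that purely numeric segments are array indices and that keys contain no dots or brackets.

```python
import re


def parse_path(path):
    """Split a dotted/bracketed path such as "a.b[3].d" into a list of keys.

    Assumption: a purely numeric segment is an array index and becomes an int.
    """
    segments = re.findall(r"[^.\[\]]+", path)
    return [int(s) if s.isdigit() else s for s in segments]


def to_bracket(keys):
    """Render a key list back into bracket notation, e.g. a['b'][3]['d']."""
    head, *rest = keys
    parts = [f"[{k}]" if isinstance(k, int) else f"['{k}']" for k in rest]
    return str(head) + "".join(parts)


def to_json_pointer(keys):
    """Render the same key list as a JSON Pointer (RFC 6901 escaping)."""
    escaped = (str(k).replace("~", "~0").replace("/", "~1") for k in keys)
    return "".join("/" + token for token in escaped)


keys = parse_path("a.b[3].d")
print(keys)                   # ['a', 'b', 3, 'd']
print(to_bracket(keys))       # a['b'][3]['d']
print(to_json_pointer(keys))  # /a/b/3/d
```

Once a path has been reduced to a key list, the choice of surface syntax matters less: the same list can be rendered as bracket notation or as a JSON Pointer.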
@Anthropic While I can see building a useful syntax in that way, not all languages will be able to handle it as easily as JavaScript, which means that it adds a relatively complex parsing requirement to the system. Additionally, we end up with yet more preprocessing and escaping needs: those strings would not be valid variables in URI Templates as-is. So we're back to layering a macro language on top of URI Templates. Ick.
I also really like that, since we're using JSON Pointer elsewhere, relative pointers are easily applied to JSON Pointers to produce new JSON Pointers. JSON Schema already uses JSON Pointer, so implementations already need to know how to apply JSON Pointers to instances. This means that using a relative pointer is extremely simple:
Applying the relative pointer to an existing JSON Pointer is intentionally a very simple process both in terms of parsing and executing. You only need to parse and make use of the leading term (before the first "/") of the relative pointer, after which you can just blindly tack the rest of the relative pointer onto the intermediate result. Relative pointers are also visually consistent with JSON Pointers that appear elsewhere such as in "$ref" URIs. Finally, other proposals like So far no one has explained why relative pointers are bad. In particular, why they are worse than implementing our own mini-language within URI Templates. If the dot notation alone worked for us (as it does when used with some data type that does not have containing/sibling data) I would be all for it. But it does not. |
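The pointer arithmetic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the function name apply_relative_pointer is hypothetical, and the relative pointer is assumed to be a non-negative integer optionally followed by a JSON Pointer (other forms, such as a trailing "#", are out of scope).

```python
def apply_relative_pointer(base, relative):
    """Combine an absolute JSON Pointer with a relative pointer.

    Only the leading term (before the first "/") is interpreted: it says how
    many levels to climb from `base`. The rest is appended unchanged.
    """
    prefix, slash, rest = relative.partition("/")
    if not (prefix.isascii() and prefix.isdigit()):
        raise ValueError(f"unsupported relative pointer: {relative!r}")
    levels = int(prefix)

    # "/a/1/b" -> ["a", "1", "b"]; the empty pointer "" is the document root.
    tokens = base.split("/")[1:] if base else []
    if levels > len(tokens):
        raise ValueError("relative pointer climbs above the document root")

    kept = tokens[: len(tokens) - levels]
    return "".join("/" + token for token in kept) + slash + rest


print(apply_relative_pointer("/a/1/b", "0"))      # /a/1/b
print(apply_relative_pointer("/a/1/b", "1/x"))    # /a/1/x
print(apply_relative_pointer("/a/1/b", "3/x/y"))  # /x/y
```

The result is an ordinary JSON Pointer, so any existing JSON Pointer resolver can be used for the final lookup.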
@handrews I agree with you by the way, just considering all options. Well, I don't agree it would be significantly harder to parse in different languages, but I certainly agree that mappings are easier to handle than inline anyway. I would go so far as to say json-pointers themselves need the same mapping ability...
{
  "title": "My name is {name}",
  "$substitute": { "name": "/model/name" }
}
By "harder" I mean I would find it more irritating ;-)
Interesting. I think that could get very complicated very quickly, but on the other hand I'm fairly sure something at least kind of like that was proposed somewhere. Can't find it now. I'm not sure I follow about json-pointers themselves needing the same mapping ability, though. |
@handrews I could use substitution to assign indexes dynamically to array pointers "obj/array/{i1}/arr/{i2}/property" etc... |
So this would basically be like "$data" except instead of loading the instance value from the pointer into the schema to replace the "$data" object, it replaces the substitution variables wherever they appear in the schema? Interesting, although I see all sorts of complications with scoping and references. If you want to pursue "$substitute", please file it as a separate issue. I don't think we need to solve it in order to resolve this issue. |
Well it depends, if it was worth considering elsewhere then it would impact on the use of
{
  "links": [{
    "rel": "item",
    "href": "/{foo}/{x}/{z}",
    "$subst": {
      "foo": "1/foo",
      "z": "0/y/z",
      "x": "3/x"
    }
  }]
}
NOTE: I'm splitting this out from #52 in order to get more attention on this specific point and the related pull request #129 . I'll probably split other stuff out from #52 and then close it. In hindsight, it was not a good idea to dump everything together there.
URI Template pre-processing is confusing, significantly complicates implementation, and does not address all limitations in the current rules for filling out templates with instance data.
The current approach tries to circumvent the URI Template spec's variable name limitations within the actual string used to express the URI Template. A simpler yet more powerful approach is to make the URI Template strings normal URI Templates, and use a mapping object to translate legal URI Template names to expressions that can identify any part of the instance. The proposed keyword for the map is hrefVars.
Relative JSON Pointers come closest to meeting the necessary requirements: starting from any point in the instance (specifically, the point from which the LDO including the template is defined), they can identify nearly any other point in the instance (see #115 for limitations with respect to arrays). See #126 for a discussion of whether Relative JSON Pointers should be a separate I-D or should start as part of JSON Schema. Either would work fine for this proposal.
To preserve the current behavior when preprocessing is not needed, if a template variable "x" does not appear in hrefVars, it may be considered present with a relative pointer of "0/x".
Examples based on current pre-processing features
Here is a subset of the table of examples for pre-processing, followed by a schema showing links using these variable names.
Note that making use of $ in the "self" case with the "bar" link requires using the URI Template "+" operator to allow percent-encoded sequences, as the $ is replaced by a percent-encoded sequence during pre-processing. This is particularly confusing since, without pre-processing, the "+" would make the "$" a literal dollar sign that did not need to be percent-encoded.
Given this instance:
the "foo" link would expand to "/x/y/z"
Given an instance of [1, 2, 3, 4], the "bar" link would expand to "/1/2/3/4" (the "*" suffix is the URI Template "explode" modifier, which interprets each list element as a path component).
Here is what a mapping approach might look like, which would produce the same results when applied to the same two example instances.
The hrefVars keyword defines the map for href. Note that we no longer need the "+" operator at all. URI Template modifiers such as the "*" suffix are not part of the template variable name and therefore do not appear in the map.
Examples that are impossible with current pre-processing
Given this schema:
Applying it to this instance:
would produce an item link of "/oof/42/true" for the first array element, and "/oof/0/false" for the second. Note that since "x" was not in hrefVars it is treated as if mapped to "0/x", which produces the same behavior as the current specification for non-preprocessed variables.
This demonstrates the support for referencing data in enclosing and nested instances.
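To make the mapping rules concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the instance, the hrefVars map and the function names are hypothetical, chosen only so that the output matches the two links quoted above. It handles plain {name} variables only, not the full RFC 6570 syntax, and its relative pointers are limited to an integer followed by an optional JSON Pointer.

```python
import json
import re
from urllib.parse import quote


def apply_relative_pointer(base, relative):
    """Climb `levels` tokens up from `base`, then append the remainder."""
    prefix, slash, rest = relative.partition("/")
    levels = int(prefix)
    tokens = base.split("/")[1:] if base else []
    if levels > len(tokens):
        raise ValueError("relative pointer climbs above the document root")
    kept = tokens[: len(tokens) - levels]
    return "".join("/" + token for token in kept) + slash + rest


def resolve_pointer(document, pointer):
    """Evaluate an absolute JSON Pointer (RFC 6901) against `document`."""
    value = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        value = value[int(token)] if isinstance(value, list) else value[token]
    return value


def expand_href(href, href_vars, document, base):
    """Fill each {name} in `href`, looking names up through `href_vars`.

    A name missing from the map defaults to the relative pointer "0/<name>".
    """
    def substitute(match):
        name = match.group(1)
        relative = href_vars.get(name, "0/" + name)
        value = resolve_pointer(document, apply_relative_pointer(base, relative))
        text = value if isinstance(value, str) else json.dumps(value)
        return quote(text, safe="")

    return re.sub(r"\{([A-Za-z0-9_.]+)\}", substitute, href)


# Hypothetical instance and map, not taken from the proposal itself.
instance = {
    "foo": "oof",
    "things": [
        {"x": 42, "y": {"z": True}},
        {"x": 0, "y": {"z": False}},
    ],
}
href_vars = {"foo": "2/foo", "z": "0/y/z"}  # "x" falls back to "0/x"

for index in range(len(instance["things"])):
    print(expand_href("/{foo}/{x}/{z}", href_vars, instance, f"/things/{index}"))
# /oof/42/true
# /oof/0/false
```

Here "2/foo" climbs from the array element to the enclosing object, "0/y/z" descends into a nested object, and the unmapped "x" resolves within the element itself.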
In conclusion
Mapping is much easier to implement and much easier to read than pre-processing. As proposed, it offers a superset of the current functionality. The main implementation concern would be introducing Relative JSON Pointers, but they are required for several proposals currently under consideration.
In my experience on past projects, schema authors found variable mapping easy to work with. It was not a significant source of confusion or bugs, either in schemas or in the code that processed the schemas using this feature.