[JSONSelection] Track location information for parsed structures using `Ranged<T>` trait and `WithRange<T>` wrapper #5987
i didn't go over this with a fine-toothed comb, but the overall structure looks sound and the tests look good. i'm inclined to merge this early, probably right after our next release, so we can iterate on diagnostics for the language server. thank you for this!
Now that not every T::parse method returns an IResult<Span, Parsed<T>>, it makes sense to name the wrapper struct WithRange<T> to reflect why only some AST structures need to be wrapped (that is, they would have difficulty storing or otherwise deriving their own ranges).
Note that this commit also changes the hash to include error.range (along with error.message and error.path), which practically guarantees no two ApplyToError objects will ever be deduplicated, so we might not even need to implement Hash if we move away from using IndexSet to accumulate the errors. #5987 (comment)
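To illustrate the deduplication point above, here is a minimal sketch. It assumes a simplified, hypothetical `ApplyToError` (the real struct's field types may differ) and uses `std::collections::HashSet` as a stand-in for `IndexSet`; `count_unique` is a helper invented for this example.

```rust
use std::collections::HashSet;

// Assumption: OffsetRange is a (start, end) pair of offsets, or None when unknown.
type OffsetRange = Option<(usize, usize)>;

// Hypothetical, simplified stand-in for the real ApplyToError struct.
// Deriving Hash and Eq means all three fields participate, including range.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ApplyToError {
    message: String,
    path: Vec<String>,
    range: OffsetRange,
}

// Counts how many distinct errors survive set-based deduplication.
fn count_unique(errors: &[ApplyToError]) -> usize {
    errors.iter().collect::<HashSet<_>>().len()
}
```

With `range` in the hash, two errors that share a message and path but come from different source locations are kept as separate entries, so set-based accumulation would rarely collapse anything.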
…ath. This method isn't actually used anywhere yet, but changing the return type seems like a good idea: #5987 (comment)
Previously, the parser would always consume both leading and trailing spaces and comments when parsing (most) AST structures. This was a convenient convention because it meant the caller of a given ::parse method typically did not have to worry about consuming trailing spaces immediately after calling the child ::parse method, since they would already have been consumed.

Now that we are tracking location information with the help of nom_locate, it's relatively easy to determine the start offset of a given AST structure (typically input.location_offset()), but it's trickier to determine the end offset, because the input text represents the entire remaining suffix of the original input, with no helpful indication where the end of the current AST node might be.

If we shift our conventions to stop consuming trailing spaces and comments, then the caller of a given ::parse method can rely on the end offset returned by the child ::parse method to be the end of the child AST node, and can typically use that information to determine the end offset of the parent AST node, with help from merge_ranges.

This change also allows us to store useful range information for the PathList::Empty variant, which previously always had `range: None`. Even though these ranges are empty (start == end), we can simplify the full_range calculation in PathList::parse_with_depth if we don't have to check whether matches!(rest.node(), PathList::Empty) and handle that case specially. Instead, we can typically compute full_range from merge_ranges(first_child.range(), rest.range()) without having to fall back to other ranges when rest is PathList::Empty (which previously meant rest.range() would be None). This empty range now captures the location where parsing of the previous AST nodes actually stopped, so rest.range() is always usable (not None) and accurate (no trailing spaces_or_comments included).
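The convention described above can be sketched without nom. The `merge_ranges` body below is an assumption about its behavior (smallest range covering both inputs), and `parse_identifier` is a hypothetical std-only token parser that skips leading spaces but leaves trailing spaces unconsumed; neither is taken from the PR's code.

```rust
// Assumption: OffsetRange is a (start, end) pair of byte offsets, or None.
type OffsetRange = Option<(usize, usize)>;

// Assumed behavior: return the smallest range covering both inputs,
// deferring to whichever side is present when the other is None.
fn merge_ranges(left: OffsetRange, right: OffsetRange) -> OffsetRange {
    match (left, right) {
        (Some((ls, le)), Some((rs, re))) => Some((ls.min(rs), le.max(re))),
        (Some(r), None) | (None, Some(r)) => Some(r),
        (None, None) => None,
    }
}

// Hypothetical token parser: skips leading whitespace starting at `offset`,
// then matches identifier characters. Trailing whitespace is NOT consumed,
// so the returned end offset is exactly where the token ends.
fn parse_identifier(source: &str, offset: usize) -> OffsetRange {
    let rest = &source[offset..];
    let start = offset + (rest.len() - rest.trim_start().len());
    let len = source[start..]
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(source.len() - start);
    if len == 0 {
        None
    } else {
        Some((start, start + len))
    }
}
```

Because each child's end offset is trustworthy, a parent can take the end of its last child as the next parse position and merge the children's ranges to get its own. An empty trailing range such as `Some((5, 5))` merges harmlessly, which is the property that lets an empty-rest case avoid special handling.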
Thanks for working through all of the comments! The only thing left is if you want to add a quick comment about the trailing space. Otherwise, feel free to squash and merge!
The inclusion of "array" and "string" in the error message felt confusing to me, since it's not obvious that the array/string in question is the input value that came before the `->`. #5987 (comment)
In order to provide more helpful parsing and error messages for JSONSelection, and to enable IDE/LSP features like syntax highlighting and tooltip hovering in the future, the JSONSelection parser needs to track where each parsed substructure came from within the original source string.

The `nom_locate` crate was helpful for replacing raw `&str` inputs with the `type Span<'a> = LocatedSpan<&'a str>` type, which `nom` (mostly) seamlessly accepts everywhere `&str` was previously expected. However, while the `LocatedSpan` type makes it easy to determine the starting offset of a given AST structure, the ending offset typically needs to be derived from parsing the structure's children. To bootstrap this parsing from the bottom up, I found it helpful to add a `parsed_span` combinator to match specific concrete tokens and capture their range information.

This would all have been considerably easier if every AST structure were a `struct`, because then I could simply have added a `range: Option<(usize, usize)>` (aka `range: OffsetRange`) field to each struct, and then made sure that field gets populated during parsing. Where possible, that's what I did, but there are a number of AST structures that make more sense as `enum` types, and we also need to keep track of location information for primitive/string types that cannot store their own ranges.

To work with this variety of AST structures, I introduced a `Ranged<T>` trait, which is implemented by all the structures that can provide (or cheaply derive) their own ranges. For those that cannot, there's a `WithRange<T>` wrapper struct that implements `Ranged<T>`. I considered using `WithRange<T>` (note: fka `Parsed<T>`) for everything, but that proved cumbersome and redundant, so the later commits in this PR walk back the use of `WithRange<T>` to just the types that need it, leaving the rest to implement `Ranged<T>` directly, without a wrapper.

To make this `range` information immediately useful, I am now including it as one of the fields of the `ApplyToError` struct, making runtime `apply_to_path` errors easier to debug. I would also like to include range information (along with more descriptive error messages) in parse errors generated by `nom`, but I haven't found a good way to replace the opaque `nom::error::Error` objects (which only store the input string and an error code) with errors that allow custom messages and range information, so that will have to wait for a subsequent PR.

You can review this PR commit-by-commit if you like, but I changed my mind a few times (e.g. renaming `Parsed<T>` to `WithRange<T>` after adding `Ranged<T>`), so it might be easier to review the whole set of changes together.
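For readers skimming the description, the trait-plus-wrapper split might look roughly like the sketch below. The method names `node()` and `range()` follow their use in the commit messages, but the exact signatures, the `WithRange::new` constructor, and the `Alias` struct are assumptions made for illustration, not the PR's actual definitions.

```rust
// Assumption: OffsetRange is a (start, end) pair of offsets, or None when unknown.
type OffsetRange = Option<(usize, usize)>;

// Assumed shape of the trait: access to the underlying node and its range.
trait Ranged<T> {
    fn node(&self) -> &T;
    fn range(&self) -> OffsetRange;
}

// Wrapper for values that cannot store their own range (strings, primitives, enums).
#[derive(Debug, Clone, PartialEq)]
struct WithRange<T> {
    node: T,
    range: OffsetRange,
}

impl<T> WithRange<T> {
    // Hypothetical constructor for this sketch.
    fn new(node: T, range: OffsetRange) -> Self {
        Self { node, range }
    }
}

impl<T> Ranged<T> for WithRange<T> {
    fn node(&self) -> &T {
        &self.node
    }
    fn range(&self) -> OffsetRange {
        self.range
    }
}

// Hypothetical struct that owns a range field and so implements Ranged
// directly, with no wrapper. Its name is a String, which needs WithRange.
#[derive(Debug, Clone, PartialEq)]
struct Alias {
    name: WithRange<String>,
    range: OffsetRange,
}

impl Ranged<Alias> for Alias {
    fn node(&self) -> &Alias {
        self
    }
    fn range(&self) -> OffsetRange {
        self.range
    }
}
```

The design trade-off is that structs pay one extra field, while only the types that genuinely cannot hold a range pay for a wrapper layer.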