-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Compiler loses location information before calling macros (sometimes) #43081
Comments
This partly resolves the `FIXME` located in `src/libproc_macro/lib.rs` when interpreting interpolated tokens. All instances of `ast::Item` which have a list of tokens attached to them now use that list of tokens to losslessly get converted into a `TokenTree` instead of going through stringification and losing span information. cc rust-lang#43081
Implement tokenization for some items in proc_macro This PR is a partial implementation of #43081 targeted towards preserving span information in attribute-like procedural macros. Currently all attribute-like macros will lose span information with the input token stream if it's iterated over due to the inability of the compiler to losslessly tokenize an AST node. This PR takes a strategy of saving off a list of tokens in particular AST nodes to return a lossless tokenized version. There's a few limitations with this PR, however, so the old fallback remains in place.
This commit updates the `Mac_` AST structure to keep track of the delimiters that it originally had for its invocation. This allows us to faithfully pretty-print macro invocations not using parentheses (e.g. `vec![...]`). This in turn helps procedural macros due to rust-lang#43081. Closes rust-lang#50840
rustc: Correctly pretty-print macro delimiters This commit updates the `Mac_` AST structure to keep track of the delimiters that it originally had for its invocation. This allows us to faithfully pretty-print macro invocations not using parentheses (e.g. `vec![...]`). This in turn helps procedural macros due to #43081. Closes #50840
…nd, r=petrochenkov Fix recursive nonterminal expansion during pretty-print/reparse check Makes progress towards rust-lang#43081 In PR rust-lang#73084, we started recursively expanded nonterminals during the pretty-print/reparse check, allowing them to be properly compared against the reparsed tokenstream. Unfortunately, the recursive logic in that PR only handles the case where a nonterminal appears inside a `TokenTree::Delimited`. If a nonterminal appears directly in the expanded tokens of another nonterminal, the inner nonterminal will not be expanded. This PR fixes the recursive expansion of nonterminals, ensuring that they are expanded wherever they occur.
Fixes rust-lang#74616 Makes progress towards rust-lang#43081 Unblocks PR rust-lang#76130 When pretty-printing an AST node, we may insert additional parenthesis to ensure that precedence is properly preserved in code we output. However, the proc macro implementation relies on comparing a pretty-printed AST node to the captured `TokenStream`. Inserting extra parenthesis changes the structure of the reparsed `TokenStream`, making the comparison fail. This PR refactors the AST pretty-printing code to allow skipping the insertion of additional parenthesis. Several freestanding methods are moved to trait methods on `PrintState`, which keep track of an internal `insert_extra_parens` flag. This flag is normally `true`, but we expose a public method which allows pretty-printing a nonterminal with `insert_extra_parens = false`. To avoid changing the public interface of `rustc_ast_pretty`, the freestanding `_to_string` methods are changed to delegate to a newly-crated `State`. The main pretty-printing code is moved to a new `state` module to ensure that it does not accidentally call any of these public helper functions (instead, the internal functions with the same name should be used).
…ochenkov Refactor AST pretty-printing to allow skipping insertion of extra parens Fixes rust-lang#75734 Makes progress towards rust-lang#43081 Unblocks PR rust-lang#76130 When pretty-printing an AST node, we may insert additional parenthesis to ensure that precedence is properly preserved in code we output. However, the proc macro implementation relies on comparing a pretty-printed AST node to the captured `TokenStream`. Inserting extra parenthesis changes the structure of the reparsed `TokenStream`, making the comparison fail. This PR refactors the AST pretty-printing code to allow skipping the insertion of additional parenthesis. Several freestanding methods are moved to trait methods on `PrintState`, which keep track of an internal `insert_extra_parens` flag. This flag is normally `true`, but we expose a public method which allows pretty-printing a nonterminal with `insert_extra_parens = false`. To avoid changing the public interface of `rustc_ast_pretty`, the freestanding `_to_string` methods are changed to delegate to a newly-crated `State`. The main pretty-printing code is moved to a new `state` module to ensure that it does not accidentally call any of these public helper functions (instead, the internal functions with the same name should be used).
some question on this issue:
Thanks all |
@thiolliere: Summary of the current progress:
Once all of the pieces of #76130 are merged, any code that compiles on stable should have all spans available (except for any additional pretty-printer issues that may get discovered). Using proc-macros as inner attributes will still cause spans to be lost, but such usage is nightly-only. |
…k, r=petrochenkov Replace pretty-print/compare/retokenize hack with targeted workarounds Based on rust-lang#78296 cc rust-lang#43081 The 'pretty-print/compare/retokenize' hack is used to try to avoid passing an outdated `TokenStream` to a proc-macro when the underlying AST is modified in some way (e.g. cfg-stripping before derives). Unfortunately, retokenizing throws away spans (including hygiene information), which causes issues of its own. Every improvement to the accuracy of the pretty-print/retokenize comparison has resulted in non-trivial ecosystem breakage due to hygiene changes. In extreme cases, users deliberately wrote unhygienic `macro_rules!` macros (likely because they did not realize that the compiler's behavior was a bug). Additionaly, the comparison between the original and pretty-printed/retoknized token streams comes at a non-trivial runtime cost, as shown by rust-lang#79338 This PR removes the pretty-print/compare/retokenize logic from `nt_to_tokenstream`. We only discard the original `TokenStream` under two circumstances: * Inner attributes are used (detected by examining the AST) * `cfg`/`cfg_attr` processing modifies the AST. This is detected by making the visitor update a flag when it performs a modification, instead of trying to detect the modification after-the-fact. Note that a 'matching' `cfg` (e.g. `#[cfg(not(FALSE)]`) does not actually get removed from the AST, allowing us to preserve the original `TokenStream`. In all other cases, we preserve the original `TokenStream`. This could use a bit of refactoring/renaming - opening for a Crater run. r? `@ghost`
Progress update: The pretty-print/retokenize check was removed in #79472. We now only lose spans when we explicitly remove the original
PR #80689 completely resolves this issue. However, it is quite large, so I'm in the process of splitting it into independent pieces for easier review. |
PR #82608 has been merged! We now preserve spans for all stable uses of macros (e.g. custom derives), and nearly all unstable usages (e.g. inner attrs). If anyone encounters a loss of location information, please open a new issue for the specific problem you're having. |
537: Add workaround for a rustc bug (rust/43081) r=halzy a=toasteater This works around a compiler bug which can sometimes cause E0425 when the `#[methods]` macro is used. The error, when triggered, would show up as `error[E0425]: cannot find value `builder` in this scope` without any span information. This is probably an instance of rust-lang/rust#43081 Co-authored-by: toasteater <48371905+toasteater@users.noreply.github.com>
Updated description
First implemented in #43230 the compiler can now tokenize a few AST nodes when necessary losslessly from the original token stream. Currently, however, this is somewhat buggy:
The "real bug" here is that we haven't actually implemented converting an AST to a token tree. The initial implementation in #43230 was effectively just a heuristic, and not even a great one! As a result we still need the ability basically to losslessly tokenize an AST node back to its original set of tokens.
Some bugs that arise from this are:
#[cfg]
modifies the AST but doesn't invalidate the cache -- Macros 2.0: #[cfg_attr] makes .to_string() and TokenStream disagree #48644macro_rules!
and procedural macros don't play well together -- Inconsistent tokenization of procedural macros with macro_rules #49846Invoking a macro with brackets loses span information -- Procedural macros stringify inputs with macro invocations too aggressively #50840Original Description
There's an associated FIXME in the code right now, and to fix this we'll need to implement tokenization of an AST node. Right now the thinking of how to implement this is to save all
TokenStream
instances adjacent to an AST node, and use that instead of converting back into a token streamcc @dtolnay, @nrc, @jseyfried
The text was updated successfully, but these errors were encountered: