Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change guarded string reserved tokens to #", ##", ### #133924

Closed
wants to merge 1 commit into from

Conversation

pitaj
Copy link
Contributor

@pitaj pitaj commented Dec 5, 2024

reserving `##` broke the peg proc macro
@rustbot
Copy link
Collaborator

rustbot commented Dec 5, 2024

r? @cjgillot

rustbot has assigned @cjgillot.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 5, 2024
@traviscross
Copy link
Contributor

@rustbot labels +A-edition-2024 +T-lang +I-lang-nominated +I-lang-easy-decision

After compiler review, please assign this to me. This is a reduction in what we're deciding to reserve, so we'll cover it in a lang meeting, but it should be an easy call.

@rustbot rustbot added A-edition-2024 Area: The 2024 edition I-lang-easy-decision Issue: The decision needed by the team is conjectured to be easy; this does not imply nomination I-lang-nominated Nominated for discussion during a lang team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Dec 5, 2024
@workingjubilee
Copy link
Member

workingjubilee commented Dec 5, 2024

Is there something that makes ##'s use in proc macros different from a hypothetical use of ###, or is this just "we thought ## wasn't used by anyone, and we're pretty sure ### isn't used by anyone"?

@pitaj
Copy link
Contributor Author

pitaj commented Dec 6, 2024

@workingjubilee speaking only for myself, the second thing. I did a search trying to find ## in Rust code on GitHub and didn't find anything.

It does bring up a good point, though: we could go all the way to #### if necessary or desired.

@traviscross
Copy link
Contributor

In particular, we thought that even if it were to be used by someone, that it could be fixed by adding a space at the call site, but here, with peg, the library author specifically checks to ensure that a space is not there.

@workingjubilee
Copy link
Member

workingjubilee commented Dec 6, 2024

My concern is more (heh) procedural: I am not certain if any sequence of tokens is significantly different from another when considering the fact that a proc macro1 can use that specific joined sequence as syntax. You can peer into Rust source strings if you want and implement logic that conditions on that within a proc macro. And more! You have all the magical powers of an impure, imperative language at your disposal.

The input of function proc macros is not required to parse as Rust, only tokenize. So, accepting this PR kinda implies a precedent of extending our stability/migration policies to arbitrary syntax extensions to Rust. Meanwhile, the very purpose of editions is to allow some things to simply break, without migration:

Editions are used to introduce changes into the language that would otherwise have the potential to break existing code, such as the introduction of a new keyword.

So, the correct decision may be to simply allow the "break", then think a bit harder in the future about how to do "API evolution" for everyone's microcompilers. Instead of this PR as it stands, we would want a tweak of the rustfix and diagnostics... I think it should be possible to recognize we're in a span that is input to a function macro? Then maybe we would consider an RFC or something on a policy concerning stability expectations for proc macros2?

Or not! Like, the vibe could go either way. This is a fairly harmless change in the reservation to make. I care more about bringing attention to "this decision seems like it has different premise from most stability/migration-related decisions" than what specific decision is made here: y'all can flip a coin if ya want.

Footnotes

  1. I believe the way this works is entirely different for decl macros where "just insert a space" would in fact work, normally?

  2. I think right now we don't even promise a specific expansion order for macros on a single item[^3], which does have real impacts on reasoning about real-world (highly unsafe!) code, like in zerocopy and bytemuck.

@Noratrieb
Copy link
Member

Noratrieb commented Dec 6, 2024

The input of function proc macros is not required to parse as Rust, only tokenize. So, accepting this PR kinda implies a precedent of extending our stability/migration policies to arbitrary syntax extensions to Rust.

The problem here is not the addition of new syntax, but rather the removal of syntax (by reserving it). ## lexed successfully as one thing before, but now no longer does.
I agree that it might be best to let it break (but am definitely deferring final judgement to y'all who have more stakes in this), but I don't think there is that much precedent here other than for "break valid lexes", which are only a tiny case of syntax extensions.

@cjgillot cjgillot assigned workingjubilee and unassigned cjgillot Dec 7, 2024
@workingjubilee
Copy link
Member

r? compiler

@rustbot rustbot assigned estebank and unassigned workingjubilee Dec 8, 2024
Copy link
Contributor

@estebank estebank left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this need to be backported in order to make it to the 2024 ed release? I'm concerned about making these kind of changes to a brittle part of the lexer in a hurry :(

I'm not seeing any reasons this won't work as expected, but I'm feeling some unspecified unease that I can't put my finger on. It could be anything from my subconscious waiting for my brain to catch up, to the cold I'm recovering from, so I can't provide good feedback on an alternative.

@@ -415,8 +415,16 @@ impl Cursor<'_> {
}

// Guarded string literal prefix: `#"` or `##`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Will do

Comment on lines +418 to +421
'#' if matches!(
(self.first(), self.second()),
// #" ##" ###
('"', _) | ('#', '"') | ('#', '#')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Miiiildly concerned about the implicit interaction between parser and lexer with these now, given the new reserved syntax, but that goes beyond this PR and is more about the feature as a whole, and I'm not going to block that.

Comment on lines -807 to -808
debug_assert!(self.prev() != '#');

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be replaced with a check of str_before in maybe_report_guarded_str?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this check should exist at all, and I'm surprised it hasn't blown up yet. It will blow up on #### if running in debug.

@traviscross
Copy link
Contributor

Will this need to be backported in order to make it to the 2024 ed release?

No; not right now at least. The edition will ship with Rust 1.85 which branches 2025-01-03.

@traviscross
Copy link
Contributor

traviscross commented Dec 11, 2024

We talked about this in the lang call today. Our inclination was to keep the reservation for ## for reasons roughly along the lines of what @workingjubilee outlined here. The purpose of editions is to make changes that would otherwise be breaking, and proc macros present a kind of infinite surface area for doing things that will present special migration challenges.

Niko raised an interesting idea, which is that similar to what we do with r#, we could imagine having some (ugly) syntax for "just put this exact token sequence into the source code". Then we could use that in migrations in a similar way to how we use r#. That's of course not a solution for this edition, but it's something we may keep in mind.

We're of course influenced here also by the fact that this feature does not appear to be documented or widely used.

All that said, we wanted to touch base with the author of the peg crate, @kevinmehall, to see whether there's anything more we should know here, and we'll check back in on this next week.

See:

@kevinmehall
Copy link
Contributor

As I explain in more detail on the rust-peg issue, the peg ## syntax was intended to be temporary and was never documented. I want to replace it anyway, and if this only breaks when the proc-macro is invoked from an edition 2024 crate, I don't think it's a big deal to make anyone who may be using it update peg and switch to the new syntax when updating their Rust edition.

Definitely doesn't seem worth the effort adding workarounds for this if peg is the only proc-macro using the ## syntax, but I appreciate how thorough you all are in considering breaking changes even across an edition boundary here!

@traviscross traviscross removed the I-lang-easy-decision Issue: The decision needed by the team is conjectured to be easy; this does not imply nomination label Dec 12, 2024
@bors
Copy link
Contributor

bors commented Dec 17, 2024

☔ The latest upstream changes (presumably #134414) made this pull request unmergeable. Please resolve the merge conflicts.

@joshtriplett
Copy link
Member

@kevinmehall Thank you very much for your response!

We try (to some degree) to think about macro breaking changes across edition boundaries, because one of the most complex parts of the edition mechanism is the ability to apply macros from one crate to code from another crate that may be using a different edition. As observed here, it isn't guaranteed (because proc macros in particular can do arbitrary things), but we try to watch for cases like this and coordinate.

@joshtriplett
Copy link
Member

Closing this based on the feedback suggesting that we don't need to worry about peg here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-edition-2024 Area: The 2024 edition I-lang-nominated Nominated for discussion during a lang team meeting. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2024 reserved ## breaks some proc-macros
10 participants