Tracking Issue: Procedural Macro Diagnostics (RFC 1566) #54140

Open
3 of 6 tasks
SergioBenitez opened this issue Sep 11, 2018 · 103 comments
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC I-lang-nominated Nominated for discussion during a lang team meeting. Libs-Tracked Libs issues that are tracked on the team's project board. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@SergioBenitez
Contributor

SergioBenitez commented Sep 11, 2018

This is a tracking issue for diagnostics for procedural macros spawned off from #38356.

Overview

Current Status

  • Implemented under feature(proc_macro_diagnostic)
  • In use by Rocket, Diesel, Maud

Next Steps

Summary

The initial API was implemented in #44125 and is being used by crates like Rocket and Diesel to emit user-friendly diagnostics. Apart from thorough documentation, I see two blockers for stabilization:

  1. Multi-Span Support

    At present, it is not possible to create/emit a diagnostic via proc_macro that points to more than one Span. The internal diagnostics API makes this possible, and we should expose this as well.

    The changes necessary to support this are fairly minor: a Diagnostic should encapsulate a Vec<Span> as opposed to a Span, and the span_ methods should be made generic such that either a Span or a Vec<Span> (ideally also a &[Span]) can be passed in. This makes it possible for a user to pass in an empty Vec, but this case can be handled as if no Span was explicitly set.

  2. Lint-Associated Warnings

    At present, if a proc_macro emits a warning, it is unconditional as it is not associated with a lint: the user can never silence the warning. I propose that we require proc-macro authors to associate every warning with a lint-level so that the consumer can turn it off.

    No API has been formally proposed for this feature. I informally proposed that we allow proc-macros to create lint-levels in an ad-hoc manner; this differs from what happens internally, where all lint-levels have to be known a priori. In code, such an API might look like:

    val.span.warning(lint!(unknown_media_type), "unknown media type");

    The lint! macro might check for uniqueness and generate a (hidden) structure for internal use. Alternatively, the proc-macro author could simply pass in a string: "unknown_media_type".
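Both blockers can be sketched together outside the compiler. The following is only an illustration under stated assumptions: proc_macro types are usable only inside a proc-macro crate, so Span here is a stand-in struct, and IntoSpans, Warning, has_span and is_silenced are hypothetical names, not part of any proposed or existing API.

```rust
/// Stand-in for `proc_macro::Span`; the real type is opaque.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub lo: usize,
    pub hi: usize,
}

/// Hypothetical conversion trait: anything that can become a list of spans.
pub trait IntoSpans {
    fn into_spans(self) -> Vec<Span>;
}

impl IntoSpans for Span {
    fn into_spans(self) -> Vec<Span> {
        vec![self]
    }
}

impl IntoSpans for Vec<Span> {
    fn into_spans(self) -> Vec<Span> {
        self
    }
}

impl<'a> IntoSpans for &'a [Span] {
    fn into_spans(self) -> Vec<Span> {
        self.to_vec()
    }
}

/// Stand-in warning that stores several spans plus the lint it belongs to.
#[derive(Debug)]
pub struct Warning {
    pub lint: &'static str,
    pub message: String,
    pub spans: Vec<Span>,
}

impl Warning {
    /// Generic over the span argument: a `Span`, a `Vec<Span>` or a `&[Span]`.
    pub fn new<S: IntoSpans>(spans: S, lint: &'static str, message: &str) -> Self {
        Warning {
            lint,
            message: message.to_string(),
            spans: spans.into_spans(),
        }
    }

    /// An empty span list is treated as if no span had been set.
    pub fn has_span(&self) -> bool {
        !self.spans.is_empty()
    }

    /// Models the consumer's allow list: a warning is silenced when its
    /// lint name appears in `allowed`.
    pub fn is_silenced(&self, allowed: &[&str]) -> bool {
        allowed.contains(&self.lint)
    }
}
```

With this shape, a call such as Warning::new(vec![a, b], "unknown_media_type", "unknown media type") points at two spans at once, and the consumer can suppress it by listing the lint name.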

@sgrif
Contributor

sgrif commented Sep 11, 2018

I'd argue that support for associating warnings with lints should be a separate RFC, and shouldn't block moving forward with unsilenceable warnings (with the expectation that anything to associate warnings with lints would need to be an additional API)

@sgrif
Contributor

sgrif commented Sep 11, 2018

Similarly, I'm not sure we actually need to address multi-span support before this API can be stabilized. The proposed change there involves changing some methods to be generic, which is considered a minor change under RFC #1105. It could also be done by changing Span itself to be an enum, rather than having a separate MultiSpan type

@SergioBenitez
Contributor Author

SergioBenitez commented Sep 11, 2018

@sgrif I suppose the question is: should unsilenceable warnings be allowed at all? I can't think of a reason to remove this control from the end-user. And, if we agree that they shouldn't be allowed, then we should fix the API before stabilizing anything to account for this. I'd really rather not have four warning methods: warning, span_warning, lint_warning, lint_span_warning.

Similarly, I'm not sure we actually need to address multi-span support before this API can be stabilized.

Sure, but the change is so minor, so why not do it now? What's more, as much as I want this to stabilize as soon as possible, I don't think enough experience has been had with the current API to merit its stabilization. I think we should implement these two features, announce them broadly so others can play with them, gather feedback, and then stabilize.

It could also be done by changing Span itself to be an enum, rather than having a separate MultiSpan type.

Right, that works too.

@sgrif
Contributor

sgrif commented Sep 11, 2018

I suppose the question is: should unsilenceable warnings be allowed at all? I can't think of a reason to remove this control from the end-user.

I think "having a thing we can ship" is a decent reason, but I also think an API that only supports error/help/note, but not warnings, is sufficiently useful to ship even without warnings. I'd support doing that if it meant we didn't block this on yet another API. Mostly I just want to avoid having perfect be the enemy of good here.

Sure, but the change is so minor, so why not do it now?

Because we have a perfectly workable API that's being used in the wild right now that we could focus on stabilizing instead. Typically we always trend towards the more conservative option on this sort of thing, shipping an MVP that's forward compatible with extensions we might want in the future.

I don't think enough experience has been had with the current API to merit its stabilization.

So what needs to happen for that? Should we do a public call for testing? Definitely adding more docs is huge. I suppose it'd be good to see what serde looks like using this API as well.

@SergioBenitez
Contributor Author

So what needs to happen for that? Should we do a public call for testing? Definitely adding more docs is huge. I suppose it'd be good to see what serde looks like using this API as well.

Yeah, that's exactly what I was thinking.

Because we have a perfectly workable API that's being used in the wild right now that we could focus on stabilizing instead. Typically we always trend towards the more conservative option on this sort of thing, shipping an MVP that's forward compatible with extensions we might want in the future.

I don't think this is an eccentric proposal in any way. When folks play with this, they should have this feature. In any case, I'll be implementing this soon, unless someone beats me to it, as Rocket needs it.

@sgrif
Contributor

sgrif commented Sep 11, 2018 via email

@zackmdavis
Member

Maybe it's too late (I'm lacking context here), but is there any hope of unifying proc-macro diagnostics with those emitted by the compiler itself? It seems sad and unmotivated to have two parallel implementations of diagnostics. (Rustc's diagnostics also have a suggestions API (albeit still somewhat in flux) that harbors a lot of promise given the new cargo fix subcommand that it would be nice for the Rocket/Diesel/&c. proc-macro world to also benefit from.)

@Havvy Havvy added A-diagnostics Area: Messages for errors, warnings, and lints A-macros-1.2 Area: Declarative macros 1.2 C-tracking-issue Category: An issue tracking the progress of sth. like the implementation of an RFC labels Sep 12, 2018
@SergioBenitez
Contributor Author

SergioBenitez commented Sep 12, 2018

@zackmdavis The API being exposed by the diagnostics API in proc_macro today is a refinement of the internal API; they're already quite unified, with minor differences to account for the context in which they are used. The implementation is a thin shell over the internal implementation.

In general, rustc's evolving needs and the proc_macro diagnostics API's aim for stability prohibit the two from being identical. This is a good thing, however: rustc can experiment with unstable APIs as much as it wants without being concerned about stability while proc_macro authors can have a stable, concrete API to build with. Eventually, features from the former can make their way into the latter.

@lambda-fairy
Contributor

Maud also uses the diagnostic API. It would benefit from both features described in the summary:

  1. Multi-span support – Currently, the duplicate attribute check emits two separate diagnostics for each error. It would be cleaner to emit a single diagnostic instead.

  2. Lint-associated warnings – We want to warn on non-standard HTML elements and attributes. But we also want to let the user silence this warning, for forward compatibility with future additions to HTML.

@macpp

macpp commented Sep 21, 2018

Is there any way to emit a warning for an arbitrary file? It could be useful for macros that read additional data from external files (like derive(Template) in https://github.com/djc/askama).

If it's not possible, how problematic would it be to add to Diagnostic something equivalent to:
fn new_raw<T: Into<String>>(start: LineColumn, end: LineColumn, file: &path::Path, level: Level, message: T) -> Diagnostic ?

@alexcrichton alexcrichton added B-unstable Blocker: Implemented in the nightly compiler and unstable. T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Oct 1, 2018
@dhardy
Contributor

dhardy commented Oct 31, 2018

Something I find confusing about the current nightly API: does Diagnostic::emit() return? (It appears to do so sometimes but not others; even for errors.)

Currently I must use unreachable!() after cases where I think emit should not return... and sometimes this results in internal error: entered unreachable code while at other times it does not (I can't spot a functional difference between the two cases, except for different spans being used).

In my opinion:

  • either procedural macro functions should be revised to return a Result (probably difficult to do now)
  • or there should be a documented, reliable way of aborting (i.e. fn() -> !) with an error code (via panic!, emit() or whatever; currently panic! is reliable but does not give a nice error message)

@SergioBenitez
Contributor Author

@dhardy Diagnostic::emit() should always return.

@dhardy
Contributor

dhardy commented Nov 1, 2018

Okay, fair enough; not sure why I get panic messages sometimes and not others.

Then we could do with something that doesn't return; maybe Diagnostic::abort()?

I guess syn::parse::Error::to_compile_error and std::compile_error are what I was looking for.

@lambda-fairy
Contributor

Personally I'd prefer not exposing a Diagnostic::abort() method, because it encourages macro authors to bail out at the first error instead of collecting as many errors as possible.

The compiler will not proceed with type checking if your macro emits at least one error, so you can get away with returning nonsense (like TokenStream::empty()) in that case.
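The collect-then-bail pattern described above can be sketched without the proc_macro crate. This is a minimal model under stated assumptions: FieldError stands in for an emitted Diagnostic, a String stands in for the TokenStream, and check_fields and expand are hypothetical helpers invented for the illustration.

```rust
use std::collections::HashSet;

/// Stand-in for an emitted error; a real macro would build a `Diagnostic`
/// and call `emit()` on it instead of pushing into a list.
#[derive(Debug, PartialEq)]
pub struct FieldError {
    pub field: String,
    pub message: String,
}

/// Hypothetical validation pass: reports every problem it finds rather than
/// stopping at the first one.
pub fn check_fields(fields: &[&str]) -> Vec<FieldError> {
    let mut errors = Vec::new();
    let mut seen = HashSet::new();
    for &field in fields {
        if field.is_empty() {
            errors.push(FieldError {
                field: field.to_string(),
                message: "empty field name".to_string(),
            });
            continue;
        }
        // `insert` returns false when the name was already present.
        if !seen.insert(field) {
            errors.push(FieldError {
                field: field.to_string(),
                message: "duplicate field".to_string(),
            });
        }
    }
    errors
}

/// Returns the generated code, or an empty output plus all collected errors.
pub fn expand(fields: &[&str]) -> (String, Vec<FieldError>) {
    let errors = check_fields(fields);
    if !errors.is_empty() {
        // Mirrors returning an empty token stream after emitting errors.
        return (String::new(), errors);
    }
    (format!("struct Generated {{ {} }}", fields.join(", ")), errors)
}
```

The point of the shape is that the user sees every problem in one compile, while the macro still has a cheap way out once at least one error has been reported.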

@SergioBenitez
Contributor Author

SergioBenitez commented Nov 4, 2018

@macpp It's not possible today. I agree that something like this should exist, but much thought needs to be given to the API that's exposed. The API you propose, for instance, puts the onus of tracking line and column information on the user, and makes it possible to produce a Diagnostic where Rust would make it impossible to do so due to Span restrictions.

The API that I've considered is having a mechanism by which a TokenStream can be retrieved given a file name:

impl TokenStream {
    fn from_source_file<P: AsRef<Path>>(path: P) -> Result<TokenStream>;
}

This would make it possible to get arbitrary (but "well-formed") spans anywhere in the file. As a bonus, it means you can reuse libraries that work with TokenStream already, and implementing this function is particularly easy given that Rust uses something like it internally already. The downside is that you can only use this when the source contains valid Rust tokens, which in practice means that delimiters must be balanced, which seems like a sane restriction.

@macpp

macpp commented Nov 4, 2018

Well, of course if you deal with .rs files, then TokenStream::from_source_file is a much better solution. However, I want to note that this works only with valid Rust files. To clarify, what I wanted to propose is an API that allows me to emit a warning for any kind of file. For example, if my procedural macro reads some configuration from config.toml, then I want to be able to do something better than

panic!("bad configuration in config.toml file: line X column Y")

Unfortunately, I forgot about Span restrictions, so the API that I proposed earlier is simply wrong for this use case :/

@SergioBenitez
Contributor Author

SergioBenitez commented Nov 4, 2018

@macpp No, that would work for almost any kind of text file. A TokenStream is only beholden to matching delimiters; it need not necessarily be Rust. If it can be used as input to a macro, it would be parseable in this way. It would work just fine for Askama, for instance.

@Arnavion

Arnavion commented Nov 4, 2018

The downside is that you can only use this when the source contains valid Rust tokens, which in practice means that delimiters must be balanced, which seems like a sane restriction.

And that strings inside single quotes must contain a single codepoint, which eliminates a lot more things (including TOML files) that have arbitrary strings in single quotes.

Even the "delimiters must be balanced" requirement assumes that the file uses the same idea of where a delimiter is and isn't valid. For example, the source file must agree that a ") sequence does not end a previous (, because the Rust parser will treat the ) as part of a string.

In short, I don't have a problem with having a TokenStream::from_source_file, but I would not push this as a solution to the "diagnostics for arbitrary non-Rust source files" problem.
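The disagreement about the ") sequence can be shown with two toy balance checkers. This is a deliberately simplified sketch and not the rustc lexer: it only knows parentheses and double-quoted strings without escapes, and both function names are made up for the illustration.

```rust
/// Naive check: counts parentheses and ignores quoting entirely, as a
/// format that gives quotes no special meaning might.
pub fn balanced_naive(src: &str) -> bool {
    let mut depth: i32 = 0;
    for c in src.chars() {
        match c {
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            _ => {}
        }
    }
    depth == 0
}

/// Rust-like check: parentheses inside "..." do not count as delimiters,
/// and an unterminated string makes the input invalid.
pub fn balanced_rust_like(src: &str) -> bool {
    let mut depth: i32 = 0;
    let mut in_string = false;
    for c in src.chars() {
        if in_string {
            if c == '"' {
                in_string = false;
            }
            continue;
        }
        match c {
            '"' => in_string = true,
            '(' => depth += 1,
            ')' => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            _ => {}
        }
    }
    depth == 0 && !in_string
}
```

On the three-character input (") the naive checker sees a balanced pair, while the Rust-like checker sees an open parenthesis followed by an unterminated string, so the two readers disagree about whether the file is well-formed.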

@joshtriplett joshtriplett added the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Dec 20, 2024
@joshtriplett
Member

Marking this as libs-api nominated to evaluate the current state of this API and consider stabilizing some or all of it.

Currently (to the best of my knowledge), proc macros don't have any way to emit warnings. They can kinda emit errors, via panic, but they can't emit multiple errors that way.

I do understand that proc_macro is something we have to maintain a stable API for, forever. (We can add a new crate, but we can't remove the old one.)

However, I think it's safe to say in general that the concept of "emit an error" or "emit a warning" is to some degree something we'll always want in some form, for as long as we have proc macros. We may wish to improve it in the future, but the core concept is fundamental to a compiler.

That doesn't mean we need to stabilize all of this API, exactly as it currently is. But, by way of example, I think the following bits could be safely stabilized as a very simple subset:

  • Span::error and Span::warning (leaving note and help unstable for now)
  • Diagnostic and Diagnostic::emit (but not the rest of Diagnostic, for now)

That would be just enough to emit errors and warnings, using a single Span. There's a lot more that proc macros may want in the future, but I think the above subset should be safe to commit to.

Thoughts?

@jhpratt
Member

jhpratt commented Dec 20, 2024

cc @rust-lang/wg-macros ^^

They can kinda emit errors, via panic

It is possible to emit compile_error!("foo") with the spans being set in a specific way to control what it points to. It's a bit of a pain, admittedly.

As to the API, I think it would be worth revisiting the proposal from a few years back that I had a PR open to implement (that languished). Even if not stabilizing a large API all at once, there should at least be a general vision rather than simply stabilizing what happened to be submitted as a PR first. That's not to say that the API is bad, just that it should have more thought put into it.

@joshtriplett
Member

It is possible to emit compile_error!("foo") with the spans being set in a specific way to control what it points to. It's a bit of a pain, admittedly.

Fair enough. It sounds like there'd still be value in having a simple error/warning mechanism though.

As to the API, I think it would be worth revisiting the proposal from a few years back that I had a PR open to implement (that languished). Even if not stabilizing a large API all at once, there should at least be a general vision rather than simply stabilizing what happened to be submitted as a PR first. That's not to say that the API is bad, just that it should have more thought put into it.

Do you have a summary of that proposal's API surface?

@jhpratt
Member

jhpratt commented Dec 20, 2024

Do you have a summary of that proposal's API surface?

Buried in this issue. See #54140 (comment) and the subsequent few comments. What Sergio proposed is what was implemented (but not merged) in #83363.

As far as I am aware, there hasn't been any meaningful discussion about the diagnostics API since then (the early 2023 comments).

@joshtriplett
Member

joshtriplett commented Dec 20, 2024

@jhpratt Sergio's proposal seems reasonable to me, but it does have a substantial API surface. The followup version from Mara has a much smaller surface area, though I agree with Sergio's comment that we'd need to at least specify what order things get emitted in. (I would propose "in order of creation", for simplicity.)

I do think that deferring the Spanned concept seems reasonable to reduce API surface area. I agree that it'd be more ergonomic to not have to call .span(), but let's start with an MVP that we can get to the point of stabilization.

I also think it may make sense to make these functions on Span rather than (or in addition to) global functions in proc_macro. You have to have a span to call them, and I think on balance it's easier to write expr.span().error(...) rather than proc_macro::error(expr.span(), ...). @estebank had some concerns with that approach, though, and I'd like to know what those concerns were.

That said, I'd happily sign off on either approach, in the same spirit of starting with an MVP.

@joshtriplett
Member

joshtriplett commented Dec 20, 2024

A copy of Mara's proposed API surface, for reference:

pub fn error(span: impl Spans, message: impl Message) -> DiagnosticHandle;
pub fn warning(span: impl Spans, message: impl Message) -> DiagnosticHandle;
pub fn note(span: impl Spans, message: impl Message) -> DiagnosticHandle;

impl Message for String;
impl Message for &str;
impl Spans for Span;
impl<I: IntoIterator<Item = Span>> Spans for I;

impl Copy for DiagnosticHandle;
impl !Send for DiagnosticHandle;
impl !Sync for DiagnosticHandle;

I'd personally propose Diagnostic instead of DiagnosticHandle, and I'd like to have helper methods on either Span or Spans for the common case, but on balance I think that API would work as an MVP. 👍 for adding it, along with documentation stating that diagnostics are emitted in creation order.
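To make "emitted in creation order" concrete, here is a small model of that API surface. It is a sketch under assumptions, not the proposed implementation: the real functions would be free functions in proc_macro backed by the compiler, whereas here a hypothetical Session struct records diagnostics in a Vec, Span is a stand-in, and Spans is implemented only for Span and Vec<Span> rather than for every IntoIterator.

```rust
/// Stand-in for `proc_macro::Span`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span(pub usize);

/// Simplified version of the `Spans` trait from the listing above.
pub trait Spans {
    fn into_vec(self) -> Vec<Span>;
}

impl Spans for Span {
    fn into_vec(self) -> Vec<Span> {
        vec![self]
    }
}

impl Spans for Vec<Span> {
    fn into_vec(self) -> Vec<Span> {
        self
    }
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Level {
    Error,
    Warning,
    Note,
}

/// Cheap, copyable handle that identifies a diagnostic by creation index.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct DiagnosticHandle(usize);

/// Hypothetical sink that stores diagnostics in the order they are created.
#[derive(Debug, Default)]
pub struct Session {
    emitted: Vec<(Level, Vec<Span>, String)>,
}

impl Session {
    fn push(&mut self, level: Level, spans: impl Spans, message: impl Into<String>) -> DiagnosticHandle {
        self.emitted.push((level, spans.into_vec(), message.into()));
        DiagnosticHandle(self.emitted.len() - 1)
    }

    pub fn error(&mut self, spans: impl Spans, message: impl Into<String>) -> DiagnosticHandle {
        self.push(Level::Error, spans, message)
    }

    pub fn warning(&mut self, spans: impl Spans, message: impl Into<String>) -> DiagnosticHandle {
        self.push(Level::Warning, spans, message)
    }

    pub fn note(&mut self, spans: impl Spans, message: impl Into<String>) -> DiagnosticHandle {
        self.push(Level::Note, spans, message)
    }

    /// Messages in emission order, which is the creation order.
    pub fn messages(&self) -> Vec<&str> {
        self.emitted.iter().map(|(_, _, m)| m.as_str()).collect()
    }
}
```

Because the handle is just an index, it can be Copy, and the emission order needs no extra bookkeeping beyond the order of the calls.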

@dhardy
Contributor

dhardy commented Dec 20, 2024

Personally, I agree with @SergioBenitez's comments on Mara's proposal, and prefer the API @SergioBenitez proposed (although I care less about whether or not it is required to call emit() explicitly). (Also, I suspect the lint arg to Diagnostic::warning was not supposed to be there.)

There is one point of @SergioBenitez's proposal I dislike however: his Spanned trait is used for both single and multiple spans. Personally I would simplify it to something like syn::Spanned:

// alternative name: ToSpan
pub trait Spanned {
    fn span(&self) -> Span;
}

fn mark_all would thus become:

impl Diagnostic {
    pub fn mark_all(self, item: impl Iterator<Item = impl Spanned>) -> Self;
}

This leaves one gap in the API: joining spans. So let's address that separately:

impl Span {
    pub fn join(&self, other: impl Spanned) -> Option<Span>; // existing unstable method with modified arg type
    pub fn join_all(iter: impl Iterator<Item = impl Spanned>) -> Option<Span>;
    pub fn extend(&self, iter: impl Iterator<Item = impl Spanned>) -> Option<Span>; // optional extra
}

Note: the new Span methods could use Item = Span, though this seems unnecessarily restrictive.
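A sketch of how join and join_all could behave, assuming spans are byte ranges within a numbered file. The Span layout, the Ident type and the rule that joining across files yields None are assumptions made for the illustration; the real proc_macro::Span is opaque.

```rust
/// Stand-in span: a byte range inside a numbered source file.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub file: u32,
    pub lo: usize,
    pub hi: usize,
}

/// Single-span trait in the style of `syn::spanned::Spanned`.
pub trait Spanned {
    fn span(&self) -> Span;
}

impl Spanned for Span {
    fn span(&self) -> Span {
        *self
    }
}

/// Hypothetical token type that carries a span.
pub struct Ident {
    pub name: &'static str,
    pub span: Span,
}

impl Spanned for Ident {
    fn span(&self) -> Span {
        self.span
    }
}

impl Span {
    /// Smallest span covering both inputs, or `None` across files.
    pub fn join(&self, other: impl Spanned) -> Option<Span> {
        let other = other.span();
        if self.file != other.file {
            return None;
        }
        Some(Span {
            file: self.file,
            lo: self.lo.min(other.lo),
            hi: self.hi.max(other.hi),
        })
    }

    /// Joins every item; `None` for an empty iterator or mixed files.
    pub fn join_all(mut iter: impl Iterator<Item = impl Spanned>) -> Option<Span> {
        let first = iter.next()?.span();
        iter.try_fold(first, |acc, item| acc.join(item))
    }
}
```

Accepting impl Spanned lets callers pass tokens directly, for example Span::join_all(idents.into_iter()), without mapping each one through .span() first.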


Mostly though, I agree with @joshtriplett that it would be good to get some traction on this. The plan to stabilise only Diagnostic, Diagnostic::emit, Span::warning and Span::error makes sense, and is not incompatible with @SergioBenitez's proposal.

@Qix-

Qix- commented Dec 20, 2024

/bikeshedding

ToSpan would imply a fn(self) (consuming) method on first read, which I doubt is what you'd want.

@weiznich
Contributor

I would be really happy to see progress on this, as the current way to report errors and warnings for proc-macros is really not good.

I would like to add a few points as input from user of proc-macros:

proc macros don't have any way to emit warnings.

That's currently not true. You can generate a warning by using eprintln!(), but that one is not attached to a span. You could also use something like #[deprecated] on some internal generated type to get a custom message out, although that is then technically displayed as a deprecation lint warning rather than as a proc macro warning. That said, I would see emitting warnings as the feature that would be enabled by exposing such an API. Emitting good errors is already possible, even if a bit complicated, while emitting good warnings is just impossible.

That brings me to another point: generating warnings from proc macros is likely very useful for a lot of use cases. Nevertheless, people will likely want to use #[allow]/#[deny] for these warnings, so maybe put these warnings by default into a "lint" group per proc-macro crate? So for diesel-derives you could write, at some level (crate level, item level), #[allow(proc_macro::warnings::diesel_derives)] or something like that. This would allow downstream users to ignore or deny these kinds of warnings for whatever reason.

I do think that deferring the Spanned concept seems reasonable to reduce API surface area. I agree that it'd be more ergonomic to not have to call .span(), but let's start with an MVP that we can get to the point of stabilization.

An important point to keep in mind here is that most crates are built on top of proc-macro2 and not on top of the plain proc-macro crate. The former provides its own proc_macro2::Span type. By accepting impl Spanned instead of just proc_macro::Span, it becomes much easier to pass in that span type instead.

@joshtriplett
Member

joshtriplett commented Dec 22, 2024

When I suggested deferring Spanned/Spans, I meant that we can seal it for now and defer any implementations of it other than for Span. We should still keep the trait abstraction, so that we can extend it later.

Lints, lint groups, and allow/deny/forbid/etc for proc macros would be useful as well, but very much something we can defer from the initial implementation.

And yes, there are ways to get output to the user. I mean that there's no current way to get a well-integrated warning that's associated with the user's code.

@estebank
Contributor

I also think it may make sense to make these functions on Span rather than (or in addition to) global functions in proc_macro. You have to have a span to call them, and I think on balance it's easier to write expr.span().error(...) rather than proc_macro::error(expr.span(), ...). @estebank had some concerns with that approach, though, and I'd like to know what those concerns were.

I don't immediately recall what my concerns might have been then, but I'd guess it was about stabilizing an overly constrictive API that doesn't provide as many natural points of expansion (what would the API look like for using a MultiSpan or Vec<Span> + multiple notes + suggestion?). At the same time, I feel like token.span().error(""); is likely to fulfill the needs of most proc-macro developers, and it's constrained enough that we can commit to maintaining it in perpetuity, even if we provide a more comprehensive 2.0 version in the future.

@joshtriplett
Member

We discussed this in today's @rust-lang/libs-api meeting.

Several people in the meeting said that they didn't want to ship a design without a lint-style mechanism to associate warnings with a name (and namespace that name), allowing for the possibility of suppressing it in the future. (The suppression mechanism doesn't have to be finished before shipping, but the namespacing mechanism does.) Some in the meeting felt that warnings should require such a name or ID, rather than having it be optional.

We discussed two possible designs for naming and namespacing, in the meeting.

  1. Declare one or more names at compile-time in the proc macro (e.g. as part of the proc_macro_attribute attribute or similar). Possibly allow another crate to re-export those names (e.g. foo re-exporting foo_macros::my_lint as foo::my_lint). Support allow(foo::my_lint). Require the name in the warning API (and error if using an un-declared name).

  2. Handle names at runtime. Have the proc macro make a runtime call to set up warnings, giving a list of warning IDs. Note that this would require some mechanism to namespace the IDs; we didn't have an immediate idea for that in the meeting. (The issue was that people may not want to reference these by the specific proc macro name or crate name, because the proc macro name or crate name may be an implementation detail of some higher-level crate, e.g. foo vs foo_macros.)

Either way, the only path forward that had consensus in the meeting was to have such an ID mechanism for lints, with a namespacing mechanism that ensured two different proc macros wouldn't have conflicting names.

@traviscross
Contributor

traviscross commented Jan 7, 2025

@rustbot labels +T-lang +I-lang-nominated

In the libs-api call today, it was discussed how this feature may overlap with matters that concern lang. Certainly some particular designs, e.g. those that would add namespacing within allow(..) attributes, would be of direct concern to us.

But more broadly, we've been thinking about a number of seemingly-related namespacing concerns, e.g. how to namespace attributes applied to fields for derive macros, the tooling namespace, etc. We may want to think holistically about this, or to encourage designs that fall within whatever direction we take here. And of course, many matters of the proc macro API are inherently lang concerns, as they affect the specification of the language and fall outside of what could otherwise be written in stable Rust.

In the meeting, @dtolnay in particular mentioned too that he'd like to see lang involved here, and that the interesting question is perhaps who would be driving this to then present something to both teams.

So let's tag this (and any follow-on stabilization or other significant PRs) for lang along with libs-api, and then let's nominate it so we can discuss briefly to build context on this.

cc @rust-lang/lang

@rustbot rustbot added I-lang-nominated Nominated for discussion during a lang team meeting. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Jan 7, 2025
@dtolnay
Member

dtolnay commented Jan 7, 2025

Here is a skeleton of design 1 and design 2 from #54140 (comment). (The semantics are more interesting than the exact spelling used.)

Static at compile-time of proc macro crate

Re-exportable in macro namespace (or a fourth one?), similar to the name resolution of macros. Documentable by rustdoc.

// foo_macros
#[proc_macro_warning]
static ambiguous_thing;

#[proc_macro_warning]
static ambiguities = [crate::ambiguous_thing];

#[proc_macro_derive(Foo)]
pub fn derive_foo(input: TokenStream) -> TokenStream {
    if ... {
        proc_macro::warning(crate::ambiguous_thing, span, "...");
    }
}


// foo
pub use foo_macros::{ambiguities, ambiguous_thing, Foo};
pub trait Foo {...}


// downstream code
#![allow(foo::ambiguous_thing)]

use foo::Foo;

#[derive(Foo)]
...

Static at compile-time of downstream crate

Not re-exportable, similar to the name resolution of inert attributes. Documented manually in crate-level markdown.

// foo_macros
#[proc_macro_derive(Foo)]
pub fn derive_foo(input: TokenStream) -> TokenStream {
    setup_warnings();
    if ... {
        proc_macro::warning("foo::ambiguous_thing", span, "...");
    }
}

fn setup_warnings() {
    proc_macro::register_warning("foo::ambiguous_thing");
    proc_macro::register_warning_group("foo::ambiguities", ["foo::ambiguous_thing"]);
}


// foo
pub use foo_macros::Foo;
pub trait Foo {...}


// downstream code (same as above)
#![allow(foo::ambiguous_thing)]

use foo::Foo;

#[derive(Foo)]
...

@dtolnay dtolnay removed the I-libs-api-nominated Nominated for discussion during a libs-api team meeting. label Jan 7, 2025
@dhardy
Contributor

dhardy commented Jan 7, 2025

#[proc_macro_warning]
static ambiguous_thing;

#[proc_macro_warning]
static ambiguities = [crate::ambiguous_thing];

This static ambiguous_thing is a static with type LintId and a fresh value? No, that wouldn't work for static compile-time analysis.

Or is this a type-level construction? But then static ambiguities makes no sense (unless it's supposed to have type [&dyn Lint], but then we're back to run-time).

Or something else entirely distinct from Rust's usual type model? This gets confusing.


fn setup_warnings() {
    proc_macro::register_warning("foo::ambiguous_thing");
    proc_macro::register_warning_group("foo::ambiguities", ["foo::ambiguous_thing"]);
}

Why bother with this — why not auto-detect lint names? Is it only for lint groups?

There could be an automatic lint group for the proc-macro crate.


This seems like a considerable amount of complexity for a feature I would not expect to have much usage. Proc-macro warnings are only applicable to macro input which is somehow valid yet incorrect, which surely doesn't occur all that often. Further, I wouldn't expect such lints to have frequent false positives (like, say, Clippy lints).

@dhedey

dhedey commented Jan 11, 2025

Just a few quick reflections... If someone adds #![allow(xxx::yyy)], how do they distinguish between two different xxx namespaces with the same name? I can't see a nice way to do this which doesn't add lots of complication.

So I tend to feel that having runtime lint names and no sense of namespace identity or collision mitigation (behind string matching) is likely the simplest for the user, the smallest API and probably the most flexible option.

I also don't immediately see any practical problems with allowing different crates to have a namespace collision:

  • I think in practice, accidental collisions would be quite unlikely.
  • Trolling crates could try to impersonate other lints, but that would just harm their users. And if you've convinced a user to install your crate, you can troll them in much more malicious ways than impersonating a lint.
  • The ability for multiple crates to share a namespace actually seems useful. For example, it would be nice if two different but related macro crates could share the same lint namespace. I've actually got a few use-cases in mind for this in one of my work projects, where we have three different proc macro projects for different specializations of the same idea, and would naturally all want to use the same namespace.

This could give some very simple APIs:

// (A) Simplest
proc_macro::warn(
    "foo::bar",
    span,
    "My warning message",
);

// (B) Slightly more type-safe, but effectively still just a string.
// The lints could be defined centrally in a crate as const or static to avoid repetition.
proc_macro::warn(
    proc_macro::LintId::new("foo::bar"),
    span,
    "My warning message",
);

As for grouping, it wouldn't really make sense to define groups statically in this model, because they could clash...

But you could possibly support adding groups at emit time to this specific lint, by saying "emit foo::bar, which should also be considered part of groups foo::group1 and foo::group2":

// (C) Option supporting groups
proc_macro::warn(
    proc_macro::LintId::new("foo::specific").in_group("foo::group_1").in_group("foo::group_2"),
    span,
    "My warning message",
);
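
A minimal sketch of how option (C) might behave, modelled in plain Rust rather than against proc_macro itself. The `LintId` type, `in_group`, and `is_silenced_by` here are hypothetical stand-ins for illustration only; nothing like them exists in proc_macro today, and how rustc would really match allow/warn/deny attributes is an assumption.

```rust
/// Hypothetical stand-in for the `proc_macro::LintId` floated above.
/// It is effectively just a string plus the groups attached at emit time.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LintId {
    name: String,
    groups: Vec<String>,
}

impl LintId {
    /// Creates a lint id with no groups attached.
    pub fn new(name: &str) -> Self {
        LintId { name: name.to_string(), groups: Vec::new() }
    }

    /// Builder-style method: also file this lint under `group`.
    pub fn in_group(mut self, group: &str) -> Self {
        self.groups.push(group.to_string());
        self
    }

    /// Returns true if `allowed` names this lint or any of its groups.
    /// `allowed` models the names a user wrote in `#[allow(...)]`.
    pub fn is_silenced_by(&self, allowed: &[&str]) -> bool {
        allowed.contains(&self.name.as_str())
            || self.groups.iter().any(|g| allowed.contains(&g.as_str()))
    }
}
```

Under this model, allowing either `foo::specific` or one of the groups silences the warning, and no group has to be declared ahead of time.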

And in terms of use-cases: I'm currently working on a library called preinterpret with a toolkit of simple, composable commands for code-generation (intended for use on its own or inside a declarative macro, and supports e.g. quote and paste functionality)... eventually I have aspirations that it could even be a go-to tool for writing declarative macros, but that's a little way off. In any case, I have been really interested in this topic - and also the span join API - for better error diagnostics.

For what it's worth, runtime lint name resolution would also work better with preinterpret, as users using preinterpret to build their declarative macros could use their own lint names... Something like this:

// This is just example preinterpret syntax, not anything I'm proposing for procedural macro APIs
macro_rules! emit_my_app_warning {
    ($message:literal) => {preinterpret!{
        [!warning! { lint: "my_app::warning", message: $message, spans: [$message], sub_diagnostics: [] }]
    }};
}

@dtolnay
Member

dtolnay commented Jan 11, 2025

Simultaneously responding to the previous 2 comments:

Why bother with this — why not auto-detect lint names? Is it only for lint groups?

I tend to feel that having runtime lint names […] is likely the simplest for the user, the smallest API and probably the most flexible option.

The issue with the approach @dhedey has shown, and the reason #54140 (comment) looks different from that, is that you've only informed the compiler about a particular lint name if that lint is ever triggered.

Imagine the caller writes:

#![deny(foo::ambigous_thing)]

use foo::Foo;

#[derive(Foo)]
...

Did you catch the typo?

The compiler needs to know whether an arbitrary scoped lint name refers to a valid thing without it being triggered, in order to report unknown_lints.

In my 2 designs (#54140 (comment)), the first one conveys perfect information about what lint names exist. The second one conveys perfect information (that's the reason for register_warning, @dhardy) if both of the following are true:

  1. The macro author correctly registers all their lints from the top of all their macros, including lints not reachable by that specific macro's implementation.
  2. The caller calls at least one macro. It would be silly to depend on a proc macro crate and not call any of its macros, but this can happen in awkward cfg situations where Cargo's [target.'cfg(…)'.dependencies] is underpowered. In such cases, rustc would need to report your allow/warn/deny attribute under unknown_lints, and you would be required to switch to #[cfg_attr(…, deny(foo::…))], matching whatever condition corresponds to at least one of that crate's macros being called.
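
The unknown_lints point can be illustrated with a small model in plain Rust. `LintRegistry` and its methods are hypothetical and only stand in for "the set of lint names the compiler has been told about"; the real rustc bookkeeping is not shown here.

```rust
use std::collections::BTreeSet;

/// Hypothetical model of the lint names the compiler currently knows about.
#[derive(Default)]
pub struct LintRegistry {
    known: BTreeSet<String>,
}

impl LintRegistry {
    /// Records a lint name, as an up-front registration step would.
    pub fn register(&mut self, name: &str) {
        self.known.insert(name.to_string());
    }

    /// Returns every name from the user's allow/warn/deny attributes
    /// that does not match a registered lint.
    pub fn unknown_lints<'a>(&self, attrs: &[&'a str]) -> Vec<&'a str> {
        attrs
            .iter()
            .copied()
            .filter(|name| !self.known.contains(*name))
            .collect()
    }
}
```

With up-front registration the misspelled `foo::ambigous_thing` is reported as unknown. If names only become known when a lint fires, the registry is empty in the common case, so a correctly spelled name and a typo look the same.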

This static ambiguous_thing is a static with type LintId and a fresh value? No, that wouldn't work for static compile-time analysis.

That's right. You would need to tell me more about why this can't work.

This:

#[proc_macro_warning]
static ambiguous_thing: LintId;

#[proc_macro_derive(Foo)]
pub fn derive_foo(input: TokenStream) -> TokenStream {...}

would expand to something like:

static ambiguous_thing: LintId = proc_macro::LintId::__new("ambiguous_thing");

const _: () = {
    #[rustc_proc_macro_decls]
    #[used]
    static _DECLS: &[proc_macro::bridge::client::ProcMacro] = &[
        proc_macro::bridge::client::ProcMacro::custom_derive("Foo", crate::derive_foo), // same as today
        proc_macro::bridge::client::ProcMacro::warning(crate::ambiguous_thing),
    ];
};

@dhardy
Contributor

dhardy commented Jan 11, 2025

This static ambiguous_thing is a static with type LintId and a fresh value? No, that wouldn't work for static compile-time analysis.

That's right. You would need to tell me more about why this can't work.

Resolving static ambiguous_thing as an item at compile-time is possible.

Resolving the contents of static ambiguities = [/* omitted */]; may be too, but only because it is immutable. I don't think it actually matters though, since from what I understand all usage happens after the proc-macro is compiled.


@dtolnay mentioned above that the semantics were more important than "the exact spelling", but I would have found the example a little more comprehensible had more attention been paid to typing above. Thus, I'd propose the following to keep within Rust's existing type model:

// The attribute macro here supplies a definition (and, if necessary, registers the lint identifier with the compiler).
#[proc_macro_warning]
static ambiguous_thing: LintId;

// The attribute macro *might* be required to register the lint group identifier, but it might also be completely unnecessary here.
#[proc_macro_warning]
static ambiguities: &[LintId] = &[crate::ambiguous_thing];

The group might be better typed as follows if we had support for [LintId; _] (see #85077):

#[proc_macro_warning]
static ambiguities: [LintId; _] = [crate::ambiguous_thing];

In particular, I don't think #[proc_macro_warning] should supply the typing information, even if this would be less to write.

@bjorn3
Member

bjorn3 commented Jan 11, 2025

// (A) Simplest
proc_macro::warn(
    "foo::bar",
    span,
    "My warning message",
);

I think we should omit the name of the proc_macro from the lint id (so bar rather than foo::bar here). Crates can be renamed on use and in that case I would expect the lint namespace to be renamed as well.

@dtolnay
Member

dtolnay commented Jan 11, 2025

@bjorn3 I do not necessarily agree with that. It works that way in design 1 from #54140 (comment) (you refer to the warning through your local name of its crate) but not in design 2 (you refer to the warning through some hardcoded path declared by its crate).

Consider that inert attributes also do not get renamed depending on how you import a crate locally.

// [dependencies]
// my-serde = { package = "serde" }

#[derive(my_serde::Serialize)]
#[serde(deny_unknown_fields)]  // not #[my_serde(...)]
struct Example {}

I do prefer design 1 but both are good IMO.

For the purpose of unblocking a first round of stabilization (#54140 (comment)), as soon as someone sends a PR to implement the following API then this can move forward, even if LintId is currently just ignored and is not usable downstream in allow/warn/deny. The important thing is that we do not allow the creation of a macro-generated warning with no id attached.

use proc_macro::{TokenStream, WarningId};

#[proc_macro_warning]
static ambiguous_thing: WarningId;  // LintId?

#[proc_macro_derive(Foo)]
pub fn derive_foo(input: TokenStream) -> TokenStream {
    if ... {
        // pass crate::ambiguous_thing as an argument to proc_macro::warning
        // or Diagnostic::warning or whatever other entry point
    }
}
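
One way to read "no macro-generated warning without an id" is as a constraint on the constructor's signature. The sketch below models that in plain Rust; `WarningId`, `Warning`, and `warning` are hypothetical stand-ins, and the hand-written static only approximates what `#[proc_macro_warning]` might generate.

```rust
/// Hypothetical stand-in for the `WarningId` (or `LintId`) type above.
#[derive(Debug, PartialEq, Eq)]
pub struct WarningId {
    name: &'static str,
}

impl WarningId {
    /// `const` so that ids can be declared as statics.
    pub const fn new(name: &'static str) -> Self {
        WarningId { name }
    }

    pub fn name(&self) -> &'static str {
        self.name
    }
}

// Roughly what `#[proc_macro_warning] static ambiguous_thing: WarningId;`
// could boil down to (an assumption, written out by hand here).
#[allow(non_upper_case_globals)]
pub static ambiguous_thing: WarningId = WarningId::new("ambiguous_thing");

/// A warning always carries the id it was created with.
#[derive(Debug)]
pub struct Warning {
    pub id: &'static WarningId,
    pub message: String,
}

/// The only constructor takes an id, so an id-less warning cannot be built.
pub fn warning(id: &'static WarningId, message: impl Into<String>) -> Warning {
    Warning { id, message: message.into() }
}
```

Because the id is a required argument, the id could be ignored by the compiler at first and still be honoured by allow/warn/deny later without changing call sites.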

@bjorn3
Member

bjorn3 commented Jan 11, 2025

Consider that inert attributes also do not get renamed depending on how you import a crate locally.

That is because inert attributes are not namespaced and the attribute name doesn't have any relation to the proc_macro name. Under the proposed interface, however, proc macro lints would be namespaced. As such, I expect the namespace to follow the name the crate is imported under, while the lint name itself stays fixed. So in a lint called foo::bar, only the foo part would change depending on which name you import the proc macro with.
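
The renaming behaviour described here can be sketched as a lookup from the local crate name to the real package. `resolve_lint` and the `renames` map are hypothetical; they only model the expectation above, not how rustc resolves lint paths.

```rust
use std::collections::HashMap;

/// Splits a lint path such as `my_foo::bar` into (package, lint).
/// `renames` maps a local crate name to the package it refers to;
/// namespaces without an entry are assumed to be taken as-is.
pub fn resolve_lint(path: &str, renames: &HashMap<&str, &str>) -> Option<(String, String)> {
    // Only the first segment is treated as the namespace.
    let (namespace, lint) = path.split_once("::")?;
    let package = renames.get(namespace).copied().unwrap_or(namespace);
    Some((package.to_string(), lint.to_string()))
}
```

With `foo` imported as `my_foo`, the path `my_foo::bar` resolves to the lint `bar` of package `foo`: only the namespace segment depends on the import.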

@weiznich
Contributor

Most of the recent discussion seems to focus on how to declare lint groups for the warning. I did not attend the meeting @joshtriplett wrote about, but I would like to ask whether it is really required to have a mechanism for declaring these groups in such a fine-grained way from the beginning. After all, the easiest possible solution would be to put all proc-macro emitted warnings in the same lint group, which would allow users to suppress these warnings, although not at the finest possible level. Another solution would be to derive these lint groups automatically based on the proc macro that emitted the lint. So a warning coming from serde::Serialize could then be suppressed by something like #[allow(proc_macro_warning::serde::Serialize)] (the name just matching the path that brought this proc macro into scope; the exact syntax is up for discussion, but that is not relevant for the proposal). This would allow suppressing warnings at the proc-macro level. I agree that there might be use cases that want to emit warnings for different purposes, but I would expect that these are not the common case. Most proc macros don't emit warnings at all, or emit only a small number of different warnings (like 1).

Both proposals should be future-proof with respect to later adding an explicit way to declare lint groups, and going with either of them would hopefully unblock the underlying, rather important feature of error handling and emitting warnings from proc macros at all. That would leave the exact syntax for declaring these lint groups to be bikeshedded later on.
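
A rough sketch of the two simpler schemes, assuming a root group named `proc_macro_warning` and a per-macro group derived from the macro's path. The names `AUTO_GROUP_ROOT`, `auto_group`, and `is_suppressed` are hypothetical, and the exact syntax is explicitly left open above.

```rust
/// Assumed name of the catch-all group for all proc-macro warnings.
const AUTO_GROUP_ROOT: &str = "proc_macro_warning";

/// Derives the per-macro group name from the path of the emitting macro.
pub fn auto_group(macro_path: &str) -> String {
    format!("{}::{}", AUTO_GROUP_ROOT, macro_path)
}

/// A warning is suppressed if the user allowed either the catch-all
/// group or the group derived for the emitting macro.
pub fn is_suppressed(macro_path: &str, allowed: &[&str]) -> bool {
    let group = auto_group(macro_path);
    allowed
        .iter()
        .any(|name| *name == AUTO_GROUP_ROOT || *name == group)
}
```

Neither scheme requires the macro author to declare anything, which is why both leave room for explicit groups to be added later.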
