Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Arguments::as_str(). #74056

Merged
merged 5 commits into from
Jul 18, 2020

Conversation

m-ou-se
Copy link
Member

@m-ou-se m-ou-se commented Jul 5, 2020

There exist quite a few macros in the Rust ecosystem which use format_args!() for formatting, but special case the one-argument case for optimization:

#[macro_export]
macro_rules! some_macro {
    ($s:expr) => { /* print &str directly, no formatting, no buffers */ };
    ($s:expr, $($tt:tt)*) => { /* use format_args to write to a buffer first */ }
}

E.g. here, here, and here.

The problem with these is that a forgotten argument such as in some_macro!("{}") will not be diagnosed, but just prints "{}".

With this PR, it is possible to handle the no-arguments case separately after format_args!(), while simplifying the macro. Then these macros can give the proper error about a missing argument, just like print!("{}") does, while still using the same optimized implementation as before.

This is even more important with RFC 2795, to make sure some_macro!("{some_variable}") works as expected.

@rust-highfive
Copy link
Collaborator

r? @Mark-Simulacrum

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 5, 2020
@m-ou-se m-ou-se force-pushed the fmt-arguments-as-str branch 2 times, most recently from 83cb097 to 8ef7248 Compare July 5, 2020 10:12
@Mark-Simulacrum
Copy link
Member

Seems like making use of this in the macros would probably still be slightly worse for performance than the current approach (unless llvm is good enough to const-prop the condition here, but even so that's a compile time regression).

I do think this is a useful API regardless, though.

r? @sfackler for libs approval of unstable API

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 5, 2020

Seems like making use of this in the macros would probably still be slightly worse for performance than the current approach (unless llvm is good enough to const-prop the condition here, but even so that's a compile time regression).

I added #[inline] to the function. With that, the compiler seems to have no problem evaluating the condition at compile time:

#![feature(fmt_as_str)]

pub fn a() -> bool {
    format_args!("asdf").as_str().is_some()
}

pub fn b() -> bool {
    format_args!("asdf {}", 1).as_str().is_some()
}
scratchpad::a:
 mov     al, 1
 ret

scratchpad::b:
 xor     eax, eax
 ret
A more realistic example
#![feature(fmt_as_str)]

extern "C" {
    fn write_str(_: &str);
}

fn write_fmt(args: core::fmt::Arguments) {
    if let Some(s) = args.as_str() {
        // Avoid allocation, write the &str directly.
        unsafe { write_str(s) };
    } else {
        // Format into a new String to get a &str.
        unsafe { write_str(&args.to_string()) };
    }
}

macro_rules! write {
    ($($t:tt)*) => {
        write_fmt(format_args!($($t)*))
    };
}

pub fn a() {
    write!("hello world");
}
scratchpad::a: ; Doesn't allocate, calls write_str directly.
 lea     rdi, [rip, +, .L__unnamed_1]
 mov     esi, 11
 jmp     qword, ptr, [rip, +, write_str@GOTPCREL]

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 5, 2020

It's a bit of a shame that it has to return &'a str instead of &'static str. A &'static str could be useful to format something into a Cow<'static, str>, which would be a 'no op' in the trivial (no arguments) case. It should be possible, as the string pieces should always come from a static/constant string literal processed at compile time.

Changing Arguments::pieces from a &'a [&'a str] to a &'a [&'static str] seems to work just fine, except for one thing: libcore's panic(&str) implementation. It puts the str to be formatted into pieces instead of args to try to avoid potentially linking in formatting code. Maybe there's another way to avoid that.

(Note that this problem wouldn't have existed if panic!() didn't separately handle the no-arguments case, but used format_args!() in all cases. Unfortunately, that would now be a breaking change, because core::panic!("{}") and core::panic!(&String::from("bla")) are both accepted right now. This is the same problem that #67984 runs into.)

@Amanieu
Copy link
Member

Amanieu commented Jul 16, 2020

We should change as_str to return a &'static str to make this function more useful.

I think that we can remove core::panic!'s special case for &str since it's not an API-breaking change, just an optimization.

However there is embedded code that actually relies on this optimization to work at all due to memory limitations so we need to provide some sort of transition to ensure their code keeps working.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 16, 2020

Sure, will work on that. I'll try to make test case to see if I can reproduce the program overhead in the non-optimized case, and see if I can find another way of avoiding that.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 16, 2020

I think that we can remove core::panic!'s special case for &str since it's not an API-breaking change, just an optimization.

The special case cannot be removed, as it'd break this:

let a = "hello";
core::panic!(a);

But the implementation can be changed to generate a core::fmt::Arguments with 'static strs for the pieces, by putting the text in the arguments list instead. It should be possible to avoid pulling in the str formatting code using a closure calling format_str directly.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 16, 2020

I made it 'static.

The core::panic implementation had to change from

    panic_fmt(fmt::Arguments::new_v1(&[expr], &[]));

to

    panic_fmt(fmt::Arguments::new_v1(&[""], &[fmt::ArgumentV1::new(&expr, |&e, f| f.write_str(e))]));

because the first list can now only contain &'static strs.

Two downsides:

  1. The trivial case (core::panic!("something")) compiles to slightly bigger code. (But still way smaller than with format_args!("{}", expr).)
  2. .as_str() on a PanicInfo's message from core::panic!("something") will now return None, because there's no &'static str available.

1 might be fixed by restructuring the way fmt::Arguments works.

2 can't really be fixed as long as core::panic!() works on non-'static strs. (Note that this would need to be changed anyway for #67984, but probably behind an edition gate.) 2 can be fixed by matching on literal in the macro, as @Amanieu suggested below.

@m-ou-se m-ou-se force-pushed the fmt-arguments-as-str branch from 67c86e7 to e8dcce0 Compare July 16, 2020 19:18
@Amanieu
Copy link
Member

Amanieu commented Jul 16, 2020

For 2, could we try matching a lit in the core::panic! macro and then passing that to a panic_static(s: &'static str)? This should help in the common case where a string literal is used.

@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 16, 2020

Ah, that could work. Will try that tomorrow.

@m-ou-se m-ou-se force-pushed the fmt-arguments-as-str branch from e8dcce0 to 9587084 Compare July 17, 2020 13:11
@m-ou-se
Copy link
Member Author

m-ou-se commented Jul 17, 2020

Updated it. The panic() function now takes a &'static str, and the panic!() macro now only uses that in case of a literal. Non-literal single argument invocations get expanded from panic!(a) to panic!("{}", a) (while checking it is a &str).

So, panic!("hello") now compiles to the exact same as before. let a = "hello"; panic!(a); does result in a bigger code size, but is probably very uncommon.

This means that both of the downsides mentioned above are gone for the panic!("something") case.

@m-ou-se m-ou-se force-pushed the fmt-arguments-as-str branch from 9587084 to c59bef9 Compare July 17, 2020 13:34
src/libcore/fmt/mod.rs Outdated Show resolved Hide resolved
@m-ou-se m-ou-se force-pushed the fmt-arguments-as-str branch 2 times, most recently from 88b14ac to 0094a1e Compare July 17, 2020 14:18
src/libcore/macros/mod.rs Outdated Show resolved Hide resolved
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 17, 2020
…arth

Rollup of 18 pull requests

Successful merges:

 - rust-lang#71670 (Enforce even more the code blocks attributes check through rustdoc)
 - rust-lang#73930 (Make some Option methods const)
 - rust-lang#74009 (Fix MinGW `run-make-fulldeps` tests)
 - rust-lang#74056 (Add Arguments::as_str().)
 - rust-lang#74169 (Stop processing unreachable blocks when solving dataflow)
 - rust-lang#74251 (Teach bootstrap about target files vs target triples)
 - rust-lang#74288 (Fix src/test/run-make/static-pie/test-aslr.rs)
 - rust-lang#74300 (Use intra-doc links in core::iter module)
 - rust-lang#74364 (add lazy normalization regression tests)
 - rust-lang#74368 (Add CSS tidy check)
 - rust-lang#74394 (Remove leftover from emscripten fastcomp support)
 - rust-lang#74411 (Don't assign `()` to `!` MIR locals)
 - rust-lang#74416 (Use an UTF-8 locale for the linker.)
 - rust-lang#74424 (Move hir::Place to librustc_middle/hir)
 - rust-lang#74428 (docs: better demonstrate that None values are skipped as many times a…)
 - rust-lang#74438 (warn about uninitialized multi-variant enums)
 - rust-lang#74440 (Fix Arc::as_ptr docs)
 - rust-lang#74452 (intra-doc links: resolve modules in the type namespace)

Failed merges:

r? @ghost
@bors bors merged commit 9c84c6b into rust-lang:master Jul 18, 2020
@m-ou-se m-ou-se deleted the fmt-arguments-as-str branch July 18, 2020 11:56
@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

I think this PR regressed the unstable feature of panicking during const-eval, see #51999 (comment) for further discussion.

Also, the std::panic macro still looks like this:

macro_rules! panic {
() => ({ $crate::panic!("explicit panic") });
($msg:expr) => ({ $crate::rt::begin_panic($msg) });
($msg:expr,) => ({ $crate::panic!($msg) });
($fmt:expr, $($arg:tt)+) => ({
$crate::rt::begin_panic_fmt(&$crate::format_args!($fmt, $($arg)+))
});
}

Couldn't this benefit from as_str as well?

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

This is even more important with RFC 2795, to make sure some_macro!("{some_variable}") works as expected.

Doesn't the fact that core::panic! has a special case for $msg:literal mean that this macro will not work as expected with RFC 2795?

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

This is even more important with RFC 2795, to make sure some_macro!("{some_variable}") works as expected.

Doesn't the fact that core::panic! has a special case for $msg:literal mean that this macro will not work as expected with RFC 2795?

Yes, allowing that to work has to be a breaking change to the core::panic! macro, because panic!("{}") is currently already accepted. This special case was originally added as an optimization. as_str() removes the need for this special case, as the trivial case can still be handled specially after passing the literal through format_args!(). But removing this arm from the macro will probably only happen as part of an edition change. See #67984 (comment)

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

Couldn't this benefit from as_str as well?

std::panic!() doesn't just accept strings. panic!(123); also works right now. That begin_panic function there accepts all types that are Any + Send:

pub fn begin_panic<M: Any + Send>(msg: M) -> ! {

Would be nice if that wasn't the case. But changing both panic!() macros to only accept format strings would be a breaking change. (See also #67984 (comment) and the rest of that thread.)

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

I see, thanks.

as_str() removes the need for this special case, as the trivial case can still be handled specially after passing the literal through format_args!().

This doesn't currently happen though, it seems? The panic machinery does not use as_str.

But removing this arm from the macro will probably only happen as part of an edition change. See #67984 (comment)

Ah, makes sense. Removing that arm from the macro would also break panicking during CTFE with #[no_std] currently -- this should be noted in the appropriate tracking issue.

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

This doesn't currently happen though, it seems? The panic machinery does not use as_str.

as_str() would help implementing an optimal version of the panic handler, that might be able to avoid allocating for panic!("abc"). Core doesn't define any panic handler. Core's panic!() has always thrown a fmt::Arguments, also in the panic!("abc") case, so this doesn't really change anything. std's panic!() however throws a &str in that case. as_str() removes the need to do that, as throwing a fmt::Arguments in that case can be handled just as efficiently. (Although std::panic!("{}", 1) now throws a String. How that might change in the future is unclear.) (It's a bit of a mess. :( )

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

Oh so the plan is to use this in std's panic handler to avoid the fomat machinery? Or is that just not worth it?

Also, core::panic!("str") does have a special optimization to reduce code size, which I think you ran into above:

// Use Arguments::new_v1 instead of format_args!("{}", expr) to potentially
// reduce size overhead. The format_args! macro uses str's Display trait to
// write expr, which calls Formatter::pad, which must accommodate string
// truncation and padding (even though none is used here). Using
// Arguments::new_v1 may allow the compiler to omit Formatter::pad from the
// output binary, saving up to a few kilobytes.
panic_fmt(fmt::Arguments::new_v1(&[expr], &[]));

But I guess to replace that with as_str, as_str would have to also return Some on format_args!("{}", "literal"), which might be hard to do.

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

This is even more important with RFC 2795, to make sure some_macro!("{some_variable}") works as expected.

Note that this comment isn't about either of the panic!() macros, as they have more/other issues. This is about log/error/etc. macros in other crates, that now specialize for optimization. (For example: https://github.com/rust-lang-nursery/failure/blob/20f9a9e223b7cd71aed541d050cc73a747fc00c4/src/macros.rs#L9-L17)

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

I think I finally understood what I am confused about here... what is the point of the core::panic! macro changes? From what I can tell, the only effect is that core::panic!(foo) where foo: &str now uses a less optimal code path, and it stopped working in CTFE. I don't understand what that helps with, given some edition changes are anyway required for core::panic!("{name}") to work, and even more drastic changes for std::panic!("{name}") (like #74622).

Looks like they were needed because as_str returns a &'static str, but why the static lifetime? Is there a design document or so for this? I suppose one could make an argument that core::panic!(foo) should never have worked in the first place, since the same does not work with write! and format! and friends... and then the string would indeed always be static.

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

Oh so the plan is to use this in std's panic handler to avoid the fomat machinery? Or is that just not worth it?

I don't think this is a plan right now. Might be nice?

Also, core::panic!("str") does have a special optimization to reduce code size, which I think you ran into above:

Yes, that trick makes sure that core::panic!("abc") gives the same fmt::Arguments as format_args!("abc"), and not the same as format_args!("{}", "abc"), to make sure as_str() works in that case. Just like if panic!() didn't handle the single-argument case separately and would've put everything through format_args!().

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

what is the point of the core::panic! macro changes?

Ah, the problem is that this code is accepted: (It shouldn't have, probably. But this is stable now.)

core::panic!(&String::from("123"));

This PR changes the lifetime for the strs fmt::Arguments to 'static, as format_args!() only works on string literals anyway. But that trick/hack you just mentioned creates a fmt::Arguments directly, without format_args!(), so doesn't work for core::panicing with non-static strs anymore. (It probably never should have.) So the trick/hack is no longer applied to arbitary $exprs, but only to $literals.

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

(It shouldn't have, probably. But this is stable now.)

Okay, that's what I figured, thanks. This still seems like something that is worth a more thorough discussion; #51999 (comment) shows that people have been using this so even breaking this on an edition change might be problematic. (Actually in that case the string is static, but it is not a literal...)

Is there a tracking issue for the planned core::panic! changes that you mentioned and that this PR prepares?

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

but why the static lifetime?

See discussion starting here: #74056 (comment)

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

Is there a tracking issue for the planned core::panic! changes that you mentioned and that this PR prepares?

No, unfortunately, it's all part of discussions on other PRs and issues so far, like here: #67984 (comment)

The lang team has asked for a concrete proposal for this (here #67984 (comment)), but I haven't had the time to work on that yet. (And I don't think anybody else did either.)

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

(Actually in that case the string is static, but it is not a literal...)

Yeah, exactly. If core::panic!() could've specialized on &'static str instead on $literal, this would've prevented your problem. (Maybe auto(de)ref specialization can do this. But not sure if the core library should make use of hacks like that. (Nope, specializing on lifetimes is never possible.))

@m-ou-se
Copy link
Member Author

m-ou-se commented Sep 5, 2020

Looks like the constant-evaluation code already assumed that the str to the panic_fn is 'static, even though that has only been the case since this PR:

if Some(def_id) == self.tcx.lang_items().panic_fn()
|| Some(def_id) == self.tcx.lang_items().begin_panic_fn()
{
// &'static str
assert!(args.len() == 1);

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2020

I think that's just a bad comment, nothing in that code breaks if the string has a shorter lifetime.

@jyn514 jyn514 added the T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. label Sep 11, 2020
yvt added a commit to r3-os/r3 that referenced this pull request Mar 22, 2022
…ossible

This involves branching, but as shown in [1], the compiler seems to have
no problem resolving the branch at compile time if optimization is
enabled.

[1]: rust-lang/rust#74056 (comment)
@cuviper cuviper added this to the 1.47.0 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants