Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build dependencies without codegen ("cargo check") where possible #1057

Closed
RalfJung opened this issue Nov 17, 2019 · 12 comments · Fixed by #1136
Closed

Build dependencies without codegen ("cargo check") where possible #1057

RalfJung opened this issue Nov 17, 2019 · 12 comments · Fixed by #1136
Labels
A-cargo Area: affects the cargo wrapper (cargo miri) C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps

Comments

@RalfJung
Copy link
Member

It would be great if we could build libstd as well as dependent crates (in cargo miri) without codegen ("check" mode). This would (a) save time, (b) avoid having to generate MIR that works both in Miri and when compiled normally, and (c) hopefully drastically simplify cross-building.

@Aaron1011 looked into this mostly due to (b), as this turned into quite a challenge for unwinding. I am copying from his post:


The long-term solution is to switch to using cargo check. This completely disables codegen in a way that's built into the compiler. We will no longer need to worry about any assumptions made by the codegen backend, since that code will simply never be executed.

However, this turned out to be more complicated than I anticipated. There are several moving parts here:

  1. Making xargo run cargo check

Currently, xargo unconditionally uses cargo build when compiling libstd. I've opened a PR to add a cargo_mode = "check" option to Xargo.toml: japaric/xargo#267

  1. Making cargo-miri run cargo check.

This is complicated by the fact that we currently run cargo rustc in order to pass arguments to the last (and only to the last) invocation of rustc. Sadly, there appears to be no way to combine cargo check and cargo rustc - you can either use cargo rustc and have no way to skip codegen, or use cargo check and have no way to pass final-crate-specific rustc arguments.

To work around this issue, I've added a hack to serialize arguments to an environment variable. When we compile the final crate, we deserialize these arguments from the environment variable, and manually pass them to the rustc invocation.

  1. Making build dependencies use the proper libstd.

This is by far the trickiest part of this entire PR. When Cargo invokes our cargo-miri wrapper, we have three cases to worry about:

  1. Build dependencies (including build scripts): We pass through all arguments completely unmodified to rustc. Miri does not interact in any way with build scripts, so we want to treat them as if we were doing a normal run of cargo.

  2. Normal dependencies: We add our custom sysroot, but still invoke rustc. Since we are in cargo check mode, this will cause rustc to produce metadata for the normal dependenices of our runtime crate, using our custom libstd.

  3. The target itself (e.g. a test or a binary): We invoke miri, and actually begin execution.

Handles these three cases ensures that build dependencies are built using the normal platform libstd (as if cargo-miri did not exist), while normal dependencies and our target crate are built against our custom libstd).

Unfortunately, distinguishing between these three cases is a huge pain. I'm currently relying on the following tricks:

  1. Use CARGO_MANIFEST_DIR to detect when our target crate is being built.

The CARGO_MANIFEST_DIR is set by Cargo to the directory containing the manifest of the package currently being built. During the initial, non-wrapper invocation of cargo miri (e.g. when the user types cargo miri run on the command line), we determine the manifest directly for the crate they are building. We then compare this to CARGO_MANIFEST_DIR when we are being invoked by Cargo as a wrapper.

However, this fails to distinguish between a build.rs and the actual compilation, since both use the same manifest directory. This brings us to the second trick.

  1. Inspect the --emit= flag passed to rustc by Cargo.

This trick relies on the fact that we are using cargo check to build the crate. When Cargo compiles a build dependency, the --emit= flag will always contain link- this is because we always need to produce a runnable binary for build scripts.

When we are building normal dependencies, the --emit= flag will not contain link. This is how cargo check tells rustc not to perform codegen - the --emit= flag will be --emit=dep-info,metadata.

By checking for the presence of link, we can determine whether or not Cargo is trying to compile a build dependency. Note that the same crate could theoretically be build as both - e.g. you could add [dependencies] cc=x.y.z and [build-dependencies] cc=x.y.z to your Cargo.toml.

Adding more information to Cargo

While I believe assumptions behind the above Cargo hacks are fairly sound, this is not really a viable long-term solution. For example, Cargo could choose to stop passing the --emit flag if rustc would have it default to what Cargo already wanted.

Ideally, Cargo would set an environment variable to let us know which of the three cases we are in - target crate, build dependency, or normal dependency. I'm currently working on a PR that does just that.

Conclusion

I think our best path forward is to:

  1. Land cargo check support in xargo
  2. Merge some form of this PR, possible in multiple pieces. Depending on how long review takes, it might make sense to merge the eh_catch_typeinfo hack into rustc so that nightly users can have a (somewhat) working Miri again
  3. Work with the Cargo team to expose the information we need in a cleaner way. I think this information could be of use to other Cargo wrappers. In particular, build dependencies have very different semantics from regular dependencies (e.g. they will be built for a different target during cross-compilation), but wrappers have no clean way of determining which they are building.
@RalfJung RalfJung added C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps A-cargo Area: affects the cargo wrapper (cargo miri) labels Nov 17, 2019
@RalfJung RalfJung changed the title Build dependencies without codegen where possible Build dependencies without codegen ("cargo check") where possible Nov 17, 2019
@RalfJung
Copy link
Member Author

So, as already mentioned I generally agree that we want "cargo check". However, I am not quite sure about the approach you chose to distinguish, inside cargo miri, the three cases. As @bjorn3 mentioned, when --target is set, cargo already does the work for us of distinguishing "build dependencies" host architecture) from "normal" dependencies (target architecture); I think we should use that. That probably involves using RUSTFLAGS for some things where we currently immediately inject arguments in the rustc wrapper.

Right now, we are using the presence of the flags passed via cargo rustc -- flags to distinguish the final binary (that we want to run) from everything else. What is the reason for you to switch to CARGO_MANIFEST_DIR for that?

Also @ehuss is there any chance cargo could tell us a bit more ("inside" cargo rustc) about what the crate is being built for?

@Aaron1011
Copy link
Member

What is the reason for you to switch to CARGO_MANIFEST_DIR for that?

There's no way to combine cargo check and cargo rustc, so if we're doing a cago check build of the target crate, we need some other way to detect when we get invoked for the final crate.

@RalfJung
Copy link
Member Author

But we could do cargo rustc and adjust the flags passed to rustc such that these are check-only builds, right?

@Aaron1011
Copy link
Member

Aaron1011 commented Nov 17, 2019

@RalfJung: I tried that at first - it turned out to require even more hacks that this current PR does.

@RalfJung
Copy link
Member Author

How do check mode and -Zno-codegen interact?

Using cargo check will also require a careful check of things like crates with multiple binaries and cargo miri test to make sure what we do is at least not less sensible than before. We do not have a test suite of such cases.

Also, what about using the crate type (bin vs lib) to distinguish things? Assuming we already recognize build dependencies because we use --target and RUSTFLAGS; binary crates should be the ones we actually want to run.

@Aaron1011
Copy link
Member

How do check mode and -Zno-codegen interact?

I gave up on -Z no-codegen after seeing it produce an enormous number of linker errors on Windows. My guess is that it broke at some point in the past, and no one noticed.

Also, what about using the crate type (bin vs lib) to distinguish things? Assuming we already recognize build dependencies because we use --target

Does that work when you use --target, but specify whatever target would otherwise have been used? If not, is there some special Miri target we should use?

@bjorn3
Copy link
Member

bjorn3 commented Nov 17, 2019

Does that work when you use --target, but specify whatever target would otherwise have been used?

Yes, cargo is dumb :)

@Aaron1011
Copy link
Member

Yes, cargo is dumb :)

Things like that are why I'd (eventually) like to get Cargo to explicitly give us the information we need via an environment variable. I'm worried that Cargo could accidentally or deliberately break any of the weird quirks we might want to rely on, leaving us with no path forward.

@RalfJung
Copy link
Member Author

RalfJung commented Nov 17, 2019

Indeed, cargo build --target $host is not the same as cargo build. In fact, the --target variant is in some sense more sane as it properly distinguishes intermediate artifacts (build scripts) from the final artifact. Also see rust-lang/cargo#4423.

@bjorn3
Copy link
Member

bjorn3 commented Nov 17, 2019

I'm worried that Cargo could accidentally or deliberately break any of the weird quirks we might want to rely on, leaving us with no path forward.

Xargo and rustc's build system currently depend on this to prevent build dependencies (linked to a different libstd) from becoming part of the sysroot.

@Aaron1011
Copy link
Member

Aaron1011 commented Nov 17, 2019

Xargo and rustc's build system currently depend on this to prevent build dependencies (linked to a different libstd) from becoming part of the sysroot.

Ah, okay. If we're already depending on that, I guess it doesn't matter if we add another dependency on it (at least for the time being).

I'll see if I can update check to use that.

@RalfJung
Copy link
Member Author

@Aaron1011's xargo-check is now available in the latest released xargo.

Aaron1011 added a commit to Aaron1011/miri that referenced this issue Jan 1, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Aaron1011 added a commit to Aaron1011/miri that referenced this issue Jan 1, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Aaron1011 added a commit to Aaron1011/miri that referenced this issue Jan 9, 2020
Fixes rust-lang#1057

I'm using my original approach from PR rust-lang#1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
Aaron1011 added a commit to Aaron1011/miri that referenced this issue Jan 9, 2020
Fixes rust-lang#1057

Since we are no longer using "cargo rustc", we now use the rustc
arguments passed by Cargo to determine whether we are building a
build dependency, normal dependency, or "target" (final binary or test)
crate.
RalfJung pushed a commit to Aaron1011/miri that referenced this issue Feb 24, 2020
Fixes rust-lang#1057

Since we are no longer using "cargo rustc", we now use the rustc
arguments passed by Cargo to determine whether we are building a
build dependency, normal dependency, or "target" (final binary or test)
crate.
bors added a commit that referenced this issue Feb 24, 2020
Use 'cargo check' to build the sysroot and target crate

Fixes #1057

I'm using my original approach from PR #1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
bors added a commit that referenced this issue Feb 24, 2020
Use 'cargo check' to build the sysroot and target crate

Fixes #1057

I'm using my original approach from PR #1048. Ideally, we would
distinguish between build-deps/dependencies/'final crate' via a
different approach (e.g. the target directory). However, I
haven't been able to get that to work just yet.

However, everything should be working with the approach I'm using. At a
minimum, we can use this PR to verify that everything works as expected
when we don't actually produce native build outputs.
@bors bors closed this as completed in e530829 Feb 24, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-cargo Area: affects the cargo wrapper (cargo miri) C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants