Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "modes" to compiletest, for running all tests with NLL enabled and comparing with master #48879

Closed
1 task
pnkfelix opened this issue Mar 9, 2018 · 14 comments · Fixed by #49293
Closed
1 task
Labels
A-compiletest Area: The compiletest test runner A-compiletest-compare-modes Area: compiletest compare-modes A-testsuite Area: The testsuite used to check the correctness of rustc C-enhancement Category: An issue proposing an enhancement or a PR with one. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. NLL-diagnostics Working towards the "diagnostic parity" goal T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804

Comments

@pnkfelix
Copy link
Member

pnkfelix commented Mar 9, 2018

The NLL team has been trying to find a good workflow for evaluating the discrepancies between AST borrowck and NLL-based borrowck.

  • Past workflows have developed and used things like -Z borrowck=compare and the https://github.com/pnkfelix/nll-probe tool, but here I'm going to try to avoid a deep dive into how those have failed to satisfy our needs.

Our goal here: We want a way to track the current state of NLL, including the cases where there are known deviations, in a manner that allows immediate analysis of that state to determine things like:

  • "how many test cases does NLL deviate from AST borrowck",
  • "what do the deviations look like", or
  • "does this deviation represent a bug that needs to be fix? Or is it an improvement on the AST borrowck?"

Here's a new proposal for a (hopefully) relatively small change to compiletest that should yield an easier workflow to answer questions like those above, at least for the ui/ subset of our tests.

  1. Context: The ui/ tests are set up so that each test consists of an input $file.rs, and a set of expected outputs in $file.stderr (compilation error messages) and $file.stdout (non erroneous compiler output; rarely used). (For more info, see this chapter in the rustc-guide.)
  2. Add a "mode" argument to compiletest, encoded either as a command line parameter or as an environment variable, or both. (The first mode we'll support will be "nll", which tells compiletest to pass -Z nll in addition to any other flags when invoking rustc, at least for the ui/ tests).
  3. When running under a particular mode, if there is $file.$mode.stderr file, then this file will be used as the source of "acceptable output". If there is no such file, then compiletest will fallback to the regular filename $file.stderr.
  4. One complication: UI tests also include "inline" comments of the form //~ ERROR that indicate what error is expected on which line (these are mildly redundant with the stderr files above). Because messages may differ in the mode M, but we don't want to edit the sources too much, compiletest should just ignore the //~ ERROR annotations when running in a particular mode. We will still see the errors that occur from the stderr output, it's just less convenient.
  5. To ensure that we are tracking discrepancies somewhere, whenever there is a $file.$mode.stderr, then some tool (probably compiletest, but maybe tidy?) will be responsible to checking that $file.rs somewhere contains a comment that explains the source of the discrepancy. This could be a specially formatted FIXME, if the new behavior seems worse than before:
// $mode FIXME(#123) -- summary of discrepancy caused by NLL bug with corresponding issue num

or a "YAYME" comment for when the new behavior is an improvement:

// $mode YAYME(#123) -- summary of a beneficial discrepancy

(presumably the ticket linked by YAYME would just be the tracking issue for NLL or whatever other mode is being tested).

Benefits of the proposed system:

  • To find the cases that have discrepancies for nll, one can use ls *.nll.stderr
  • To find what a given discrepancy looks like, one can use diff onetest.stderr onetest.nll.stderr
  • To see if a discrepancy is a bug or not, grep for FIXME or YAYME in the .rs files.
    • This workflow is perhaps not ideal; @nikomatsakis has pointed out that it might be nicer if these comments somehow lived in the *.$mode.stderr files.

Open Questions (to be resolved by implementor)

  • What should compiletest do about occurrences of //~ ERROR in the source text? In particular, should it check that the error output still has those cases, even when running under a given mode?
    • The current inclination of @pnkfelix and @nikomatsakis is that it is actually okay for compiletest to ignore the //~ ERROR annotations when running under a given mode. The reasoning here is this: the //~ ERROR annotations will already get checked by compiletest runs that don't have a mode. We probably don't want to force an error when there's a discrepancy when running under a given mode; any discrepancies, including any of those errors disappearing, should be accounted for in the linked // $mode FIXME/YAYME issue, and we want to allow them to disappear or differ.
@nikomatsakis nikomatsakis added A-testsuite Area: The testsuite used to check the correctness of rustc T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804 WG-compiler-nll labels Mar 9, 2018
@nikomatsakis
Copy link
Contributor

I'm marking this as belonging to both @rust-lang/wg-compiler-nll and @rust-lang/wg-traits -- this is because we are both going to need this tool! In general, having a mode like this will be useful anytime that we plan to make major alterations to some core component of the system.

@nikomatsakis nikomatsakis changed the title Add rustc NLL mode to compiletest, for driving comparisons between borrowck modes Add "modes" to compiletest, for running all tests with NLL enabled and comparing with master Mar 9, 2018
@pnkfelix pnkfelix added the E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. label Mar 9, 2018
@nikomatsakis
Copy link
Contributor

Mentoring instructions

I'm going to assume we will use a command line option to drive this new mode. Let's call "compare-mode". Then we can run the tests with ./x.py test foo --test-args --compare-mode nll or whatever. The compile test harness as a configuration that is loaded here:

let config = parse_config(env::args().collect());

So we have to extend parse_config with some new option --compare-mode or whatever that takes a string.

pub fn parse_config(args: Vec<String>) -> Config {

That comparison mode will be translated to some arguments to rustc. Those arguments are arranged in this function:

fn make_compile_args(&self, input_file: &Path, output_file: TargetLocation) -> Command {

Probably it's best we allow a mapping between the comparison mode and the arguments (that mapping can just be hardcoded into this function though). For example, NLL at least currently really wants three flags added:

match self.config.comparison_mode {
    Some(ComparisonMode::NLL) => {
        rustc.args(&["-Znll", "-Zborrowck=mir", "-Ztwo-phase-borrows"]);
    }
    None => { }
}

This is where the code currently finds the expected files. We'll want to alter these functions to take the comparison mode into account. That's probably a new option to be added, along with revision.

let expected_stderr_path = self.expected_output_path(UI_STDERR);
let expected_stderr = self.load_expected_output(&expected_stderr_path);
let expected_stdout_path = self.expected_output_path(UI_STDOUT);
let expected_stdout = self.load_expected_output(&expected_stdout_path);

Right now, if the revision is Some, then we look for $test.$rev.stderr, I think we want to look for $test.$rev.$mode.stderr in that case, so that we can always do ls *.nll.stderr to get all things with a given mode. (Or maybe we should make it .mode_$mode.stderr or something more distinctive?)

Anyway, we'll have to alter this to search for the expected output with the given mode and -- if not found -- to then try with a mode of None.

Finally, to skip checking for //~ ERROR comments, if a comparison mode is set, we just don't do any of this stuff:

if !expected_errors.is_empty() || !proc_res.status.success() {
// "// error-pattern" comments
self.check_expected_errors(expected_errors, &proc_res);
} else if !self.props.error_patterns.is_empty() || !proc_res.status.success() {
// "//~ERROR comments"
self.check_error_patterns(&proc_res.stderr, &proc_res);
}

Uuuuh at that point we are basically done with the compiletest changes, right?

Then the last step is to alter tidy to enforce the FIXME convention, but I don't know much about that. We'll figure that out when we get there.

@kennytm
Copy link
Member

kennytm commented Mar 9, 2018

In compiletest there is already the concept of "revisions" which y'all are already using in compile-fail tests. I think it's better we extend this to support UI tests than layering an NLL-specific "modes" on top of it.

Edit: Disregard, that was #48878.

@pnkfelix
Copy link
Member Author

pnkfelix commented Mar 9, 2018

@kennytm this is different interface from revisions. Revisions are encoded with flags on a test by test basis. A mode is given more privileged status within compiletest. We explicitly do not want to add a revision for every test to represent running with and without NLL.

(There's some chance that code might be shared within compiletest between the support for revisions and that for modes, but in terms of UI, the interface is different here.)

@nikomatsakis
Copy link
Contributor

Also, @kennytm, I opened this issue regarding extending revisions to UI tests (#48878). That said, I think they already work, and we're just not using them. I've got to do some local tests.

@nikomatsakis
Copy link
Contributor

I think though we will eventually want to specify the intersection of the "comparison mode" with a revision. There are a lot of tests that are already encoded with lexical and nll modes, for example -- in those cases, if we have a revision named nll and we are in comparison mode nll, it'd be sort of nice to just ignore the test, or do something smart. But I guess we can leave that for later.

@cuviper cuviper added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Mar 10, 2018
@nikomatsakis nikomatsakis added this to the NLL: diagnostic parity milestone Mar 14, 2018
@nikomatsakis nikomatsakis added the NLL-diagnostics Working towards the "diagnostic parity" goal label Mar 14, 2018
@memoryleak47
Copy link
Contributor

I'd like to try this one. :)

@nikomatsakis
Copy link
Contributor

@memoryleak47 that would be super duper. @pnkfelix and I were just saying that this remains pretty high priority. Let me know if I can help in any way. (Of course there are already mentoring instructions.)

bors added a commit that referenced this issue Apr 6, 2018
… r=pnkfelix

Add compiletest `--compare-mode nll` option

Before implementing the tidy stuff, I'd appreciate if someone reviews the changes so far.
This is my first non-trivial pull request, so I could really use some feedback. :)
closes #48879.

r? @nikomatsakis
@pnkfelix
Copy link
Member Author

pnkfelix commented Apr 9, 2018

I'm going to actually reopen this issue as a way to track when we've gotten this compare-mode into a state where we actually have files checked in that we are referencing against.

(I just discovered that tidy complains about a augemented-assignments.nll.stderr file that I had lying around, so we clearly cannot yet reach that point without some more work.)

It might be good to even keep this issue open until we have bors set up to be running compare-more=nll against the ui tests on some dedicated target like linux.

@pnkfelix pnkfelix reopened this Apr 9, 2018
@pnkfelix
Copy link
Member Author

Tidy fix is #49844

bors added a commit that referenced this issue Apr 11, 2018
…tsakis

Blindly checkpoint status of NLL mode ui tests

This takes the next (and potentially final?) step with #48879.

Namely, this PR got things to the point where I can successfully run `compiletest` on `src/test/ui` with `--compile-mode=nll`.

Here are the main pieces of it:

 1. To figure out how to even run `compiletest` normally on the ui directory, I ran `x.py test -vv`, and then looked for the `compiletest` invocation that mentioned `src/test/ui`.
 2. I took the aforementioned `compiletest` invocation and used it, adding `--compile-mode=nll` to the end. It had 170 failing cases.
 3. Due to #49855, I had to edit some of the tests so that they fail even under NLL, via `#[rustc_error]`. That's the first commit. (Then goto 2 to double-check no such tests remain.)
 4. I took the generated `build/target/test/foo.stderr` file for every case that failed, and blindly copied it to `src/test/foo.nll.stderr`. That's the second commit.
 5. Goto 2 until there were no failing cases.
 6. Remove any stamp files, and re-run `x.py test` to make sure that the edits and new `.nll.stderr` files haven't broken the pre-existing test suite.
@pnkfelix
Copy link
Member Author

Once #49900 lands, this issue will be almost completely resolved; I think the only task I could imagine beyond that would be to expand #49900 (and also compare-mode?) to work on all the tests, not just the ones in src/test/ui.

bors added a commit that referenced this issue Apr 19, 2018
…tsakis

Add src/test/ui regression testing for NLL

This PR changes `x.py test` so that when you are running the `ui` test suite, it will also always run `compiletest` in the new `--compare-mode=nll`, which just double-checks that when running under the experimental NLL mode, the output matches the `<source-name>.nll.stderr` file, if present.

In order to reduce the chance of a developer revolt in response to this change, this PR also includes some changes to make the `--compare-mode=nll` more user-friendly:

 1. It now generates nll-specific .stamp files, and uses them (so that repeated runs can reuse previously cached results).
 2. Each line of terminal output distinguishes whether we are running under `--compare-mode=nll` by printing with the prefix `[ui (nll)]` instead of just the prefix `[ui]`.

Subtask of #48879
@nikomatsakis
Copy link
Contributor

I think we can call this done for now until we have a concrete extension in mind.

@pnkfelix
Copy link
Member Author

namely @nikomatsakis and @pnkfelix agreed that any remaining work here is best spent on resolving #44844 (plus porting the run-pass/ tests to ui/, which should probably be filed as another issue).

@nikomatsakis
Copy link
Contributor

In particular closed in favor of #44844

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-compiletest Area: The compiletest test runner A-compiletest-compare-modes Area: compiletest compare-modes A-testsuite Area: The testsuite used to check the correctness of rustc C-enhancement Category: An issue proposing an enhancement or a PR with one. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. NLL-diagnostics Working towards the "diagnostic parity" goal T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-traits Working group: Traits, https://internals.rust-lang.org/t/announcing-traits-working-group/6804
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants