Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new SKIP macro for skipping tests at runtime #2360

Merged
merged 9 commits into from
Jan 12, 2023

Conversation

psalz
Copy link
Contributor

@psalz psalz commented Jan 31, 2022

This adds a new SKIP macro for dynamically skipping tests at runtime. The "skipped" status of a test case is treated as a first-class citizen, like "succeeded" or "failed", and is reported with a new color on the console.

This is useful when the conditions for skipping a test are only known at runtime. For example:

TEST_CASE( "some feature does something on GPUs" ) {
    if ( get_gpu_count() == 0 ) {
        SKIP( "no GPUs are available" );
    }
    CHECK( do_something_on_gpu() );
}

The output will then look like this:
image

Like other non-assertion type macros (such as INFO and WARN), skipping is still implemented as an "assertion" internally. I've added a new result type ExplicitSkip (which is currently not treated as a failure) as well as a new skipped counter in Catch::Totals. While this initial implementation was surprisingly simple, I've marked the PR as a draft as there are still several outstanding design questions.

For example, the result type is called an explicit skip because there already exists the notion of a "skipped" test in the current codebase. In fact, there are at least 3 conditions under which tests could be considered as skipped:

  • Tests that are hidden ([.])
  • Tests that are not matched by the provided filter
  • Remaining tests that are not executed after the --abortx limit has been reached

The last one is currently explicitly referred to as "skipped" within the IStreamingReporter::skipTest hook. As far as I can tell, only the Automake reporter is actually handling this one though. I'm wondering whether there is an opportunity here for unifying (some) of these concepts in both how they are handled internally as well as the way they are being reported.

Open TODOs:

  • Determine whether/how the notion of "skipping" should be unified within the codebase and in reporting (see above)
  • How should skipping some sections / generators affect the overall result of a test case?
  • Does having a skipped test mean a non-zero exit code?
  • Add hook for reporters to handle explicitly skipped tests (figure out what to do with existing IStreamingReporter::skipTest)
  • Properly handle skipped tests in the various built-in reporters
  • Consider adding a CLI option for whether skipped tests should be reported or not (similar to -s)
  • Should "skipped assertions" even be reported? The fact that these even exist is a result of how test results are communicated throughout the system (i.e., a "skipped assertion" only leads to a skipped test in Catch::Totals::delta), but the terminology feels a bit weird.
  • Add documentation

Resolves #395.

@codecov
Copy link

codecov bot commented Jan 31, 2022

Codecov Report

Merging #2360 (6309b3c) into devel (52066db) will increase coverage by 0.01%.
The diff coverage is 90.22%.

@@            Coverage Diff             @@
##            devel    #2360      +/-   ##
==========================================
+ Coverage   91.13%   91.15%   +0.01%     
==========================================
  Files         187      187              
  Lines        7634     7691      +57     
==========================================
+ Hits         6957     7010      +53     
- Misses        677      681       +4     

@horenmar
Copy link
Member

horenmar commented Feb 6, 2022

This is going to take awhile to review, I don't think I will really get to this before next weekend.

TEST_CASE( "sections can be skipped dynamically at runtime", "[skipping]" ) {
SECTION( "not skipped" ) { SUCCEED(); }
SECTION( "skipped" ) { SKIP(); }
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to see a SECTION after the skipped one as well.

@horenmar
Copy link
Member

I think that skipped tests should not count as failures, nor should they
count as successes. Skipping a test case from within the TEST_CASE should
behave similarly to not running the test at all.

This has some consequences, such as if a test run only invokes skipped tests,
then it should fail (because Catch2 fails test run that runs no test cases),
but as long as there is at least one test case that finished, the return
code should be 0.

However, it might be useful to add a warning (-w) option that would cause
the test run to fail if any tests were skipped, as essentially an "audit"
option.

From this also follows what should happen when a section/generator input
is skipped: it should look like that section/generator input does not
exist. At least as much as possible, if the test case looks like this:

TEST_CASE() {
    CHECK(1 == 2);
    SKIP();
    REQUIRE(1 == 2);
}

then that test case should be counted as failed.

Further examples:

TEST_CASE() {
    SECTION("A") { std::cout << "A\n"; }
    SECTION("B") { SKIP(); }
    SECTION("C") { std::cout << "C\n"; }
}

Should print out "A\n" and "C\n", and similarly

TEST_CASE() {
    SECTION("A") { std::cout << "A"; }
    SECTION("B") {
        SECTION("B1") { std::cout << "B1"; }
        SECTION("B2") { SKIP(); }
    }
    std::cout << "\n";
}

should print out "A\n" and "B1\n", and nothing more. (Generally exits
from the last section in test case are tricky, due to how sections are
discovered and tracked.)

As to the other parts, "skipped assertions" should definitely not be
reported as assertion. However, their test cases should be reported as
skipped. I don't believe we need a flag to show them, as they should be
shown by default in most reporters. I am willing to change my mind on this
though.

Also I have no idea what to do about the existing skipTest uses -- I am not
sure anything actually uses it for anything useful (I am not entirely sold
on --abort being useful either).

@horenmar
Copy link
Member

If the section tracking proves trouble some, I am willing to say that test skips can only happen at the top of a TEST_CASE, but I would rather not, or at least allow

TEST_CASE() {
    auto i = GENERATE( ... );
    if (i % 2 == 1) { SKIP(); }
}

@psalz
Copy link
Contributor Author

psalz commented Feb 21, 2022

Thank you, I will need some time to investigate how to best address all your points.

Also I have no idea what to do about the existing skipTest uses -- I am not
sure anything actually uses it for anything useful (I am not entirely sold
on --abort being useful either).

Would you in principle be open to repurposing that hook for the SKIP macro in a breaking change?

@horenmar
Copy link
Member

Would you in principle be open to repurposing that hook for the SKIP macro in a breaking change?

Yes, but it would also likely need to be renamed and further changed, as the SKIP granularity is at section level, not test case level.

@psalz
Copy link
Contributor Author

psalz commented Apr 13, 2022

Sorry for the delay. After looking back into this now, it seems like not too many things need changing after all.

I've rebased onto current devel and removed the printing of "skipped assertions" from the console reporter.

Skipping Behavior

I think the initial version of this PR already mostly did what you expected in your examples. I've added a few more test cases to better cover these scenarios.

and similarly

TEST_CASE() {
  SECTION("A") { std::cout << "A"; }
  SECTION("B") {
      SECTION("B1") { std::cout << "B1"; }
      SECTION("B2") { SKIP(); }
 }
  std::cout << "\n";
}

should print out "A\n" and "B1\n", and nothing more. (Generally exits
from the last section in test case are tricky, due to how sections are
discovered and tracked.)

Not sure I'm following. If there were a FAIL instead of the SKIP in the last section, it would print A\n, B1\n, \n, right? Because as you say, exiting the last section is tricky. Why would the behavior for SKIP have to be different, then? Without having looked into it in detail, it seems like treating this differently would require a lot of work for arguably little gain, IMO.

If the section tracking proves trouble some,

Again I'm not sure I understand why I would need to do section tracking manually; by treating SKIP similarly to REQUIRE or FAIL (i.e., ResultDisposition::Normal), I don't need any custom section tracking and basically get the behavior you show in your examples for free.

Exit Code

I've adjusted things to return a non-zero exit code (4) when tests were run, but all of them ended up being skipped. Do we want something analogous to --allow-running-no-tests here? E.g., --allow-skipping-all-tests.

I've also added a new test case for this behavior, however I'm not sure whether "ExtraTests" is the right location for it. Also I have no idea what the whole X#-<name>.cpp naming scheme is about. I basically just did whatever the EmptySpecWithNoTestsFails test was doing.

Hooks

Also I have no idea what to do about the existing skipTest uses -- I am not
sure anything actually uses it for anything useful (I am not entirely sold
on --abort being useful either).

Turns out I don't actually have a use case for a hook when SKIPing a test, so I'd rather wait for one to come up before I unnecessarily pollute the API. At the same time we can't keep the existing skipTest hook, as it would be extremely confusing. So my current thinking would be to just remove it. Alternatively we could rename it to abortTest or something.

@michael-hartmann
Copy link

I‘d be happy to see this feature in Catch2 at some point in time. I have a similar use case, namely skipping some test cases at runtime if certain ressources (in my case some attached sensors) are not availabe.

@horenmar
Copy link
Member

horenmar commented Oct 1, 2022

I am finally back around for long enough to go through the larger PRs.

I rebased the PR and will pick it up during this week. Sorry for the long wait 😃

@psalz
Copy link
Contributor Author

psalz commented Oct 3, 2022

Thanks for looking into this again, let me know if you need anything!

@horenmar
Copy link
Member

horenmar commented Oct 3, 2022

@psalz I read through it again today, I think the changes are fine, but the other reporters need to have a better support for skipping.

I tried a few, and the output is not very useful, e.g.

$ ./tests/SelfTest  "tests can be skipped dynamically at runtime" -r tap
# rng-seed: 3838586559
# tests can be skipped dynamically at runtime
** internal error ** 1 -
1..1

internal error as skip message is no good.

$ ./tests/SelfTest  "tests can be skipped dynamically at runtime" -r compact
Filters: tests can be skipped dynamically at runtime
RNG seed: 2657692748
Passed all 0 test cases with 0 assertions.

The 0 passed, all green output is confusing, especially when combined with the return code being 4 (because all selected tests were skipped).

$ ./tests/SelfTest  "tests can be skipped dynamically at runtime" -r xml
<?xml version="1.0" encoding="UTF-8"?>
<Catch2TestRun name="SelfTest" rng-seed="371487568" catch2-version="3.1.0" filters="tests can be skipped dynamically at runtime">
  <TestCase name="tests can be skipped dynamically at runtime" tags="[skipping]" filename="/mnt/c/ubuntu/Catch2-psalz/tests/SelfTest/UsageTests/Skip.tests.cpp" line="13">
    <OverallResult success="true"/>
  </TestCase>
  <OverallResults successes="0" failures="0" expectedFailures="0"/>
  <OverallResultsCases successes="0" failures="0" expectedFailures="0"/>
</Catch2TestRun>

XML reporter should provide all available information, because it implements Catch2-specific output format to be used for machine parsing of the output.

$ ./tests/SelfTest  "tests can be skipped dynamically at runtime" -r Automake
:test-result: XFAIL tests can be skipped dynamically at runtime

I am not sure if this is the best we can do, see below.

In total, there are 8 built-in reporters.
3 of them are Catch2-specific

  • console - I think this one is fine as-is, except for better colour choices. I will work on that later, because I also need to change how some of the colours are defined.
  • compact - We can definitely do better here. The compact reporter output has another related issue (Compact reporter does not handle [!shouldfail] properly #878) that could be tackled at the same time (but you don't have to).
  • XML - The overall XML structure cannot change, but we can extend it trivially. The goal of this reporter is to make the test results machine-parseable, so they need to mention skipping as well.

And 5 implement specific format we do not control. Those might be an issue, because the output format might not support the concept of skipping tests. Nope, they all support skipped tests.

@horenmar
Copy link
Member

horenmar commented Oct 3, 2022

I also went to the original post (this is a very high quality PR btw) and

Consider adding a CLI option for whether skipped tests should be reported or not (similar to -s)

I don't think this is useful. While skips are not counted as errors, I think we should report them by default, as they are an indicator of potential issue. E.g. if the dynamic check for whether the test should be skipped is wrong, the user might end up with a test that never actually runs.

OTOH, there might be some use for an option that turns skipped tests into failures, instead of passes, for similar reasons.

@horenmar
Copy link
Member

horenmar commented Oct 25, 2022

I started looking at the reporter formats.

TeamCity - I think the Ignored tests message matches up with the intent of SKIP, see https://www.jetbrains.com/help/teamcity/service-messages.html#Nested+Test+Reporting, Ctrl+F to "Ignored tests".


Went through all the 3rd party format reporters, and I am pleasantly surprised that all of them support concept of reporting skipped tests. See my previous post with list of reporters for notes.

@horenmar
Copy link
Member

re the flag to turn SKIPs into failures, I thought about it and I think a new warning flag is the simplest (and cleanest) solution

@psalz
Copy link
Contributor Author

psalz commented Nov 3, 2022

Rebased onto #2554. Apologies for being a bit slow on this right now, I will try to find some time to look into the other reporters.

@horenmar
Copy link
Member

horenmar commented Nov 3, 2022

Don't worry, you will have to be waaaay slower before the PR has spent more time waiting on you than it has waiting on me.

@horenmar
Copy link
Member

horenmar commented Nov 4, 2022

Do you have an idea about when you are going to have time to finish this? I am planning next release and wondering whether I should wait for this or not.

@psalz
Copy link
Contributor Author

psalz commented Nov 7, 2022

Tough to say, I have a pretty busy month ahead unfortunately. Maybe I can find some time next weekend. I briefly looked into it yesterday and ran into a couple of issues where I need clarification:

  • As mentioned above, the Automake reporter currently uses the SKIP result to report test cases that were not run after reaching the --abortx limit. Other reporters don't use it but there are some comments talking about how this should be implemented. I propose to instead, in a breaking change, get rid of the IEventListener::skipTest hook. If a test run aborts prematurely, any remaining tests are simply not included in the report.
  • The JUnit, SonarQube and TeamCity reporters currently all use their "skipped" result type to report !mayfails. Since there is presumably no better option to express this, we'll have to use it for both !mayfail and explicit skips going forward.
  • While it doesn't seem to be too difficult to implement skipped tests into most of these reporters, how do I know whether the produced files are still valid? Is there some kind of infrastructure in place to validate that the format is actually correct? Or do I have to set up SonarQube and TeamCity installs etc...? (I'd rather not). I find it pretty ironic by the way that none of these platforms with their XML results don't seem to make use of the only redeeming quality of this awful format in that they provide a schema for validation.

@horenmar
Copy link
Member

horenmar commented Nov 7, 2022

No problem, I'll pull in things I want to have in the next release and plan for this to be in a later one.

  • Sure, skipTest can be marked deprecated, and eventually removed. The removal will take a while though, as I don't plan to have another breaking release soon.
  • Yes. Honestly [!mayfail] is a weird attribute, because the test code runs, but the result is a failure only in the case of hard-crash. This then isn't semantically supported by many formats.
  • My approach to testing these is that if I can't find a validator with reasonable amount of effort, I just compare the output with the spec (if I can even find it, JUnit is terrible about this), and if it looks right, merge it and wait to see if someone complains. 🤷

@psalz
Copy link
Contributor Author

psalz commented Dec 21, 2022

Okay, back with some time to work on this! I've rebased onto current devel and added support for skipped tests into all reporters. Here's some notes on each:

  • Compact The compact reporter now also prints a line for each call to SKIP.
  • XML I've added a new boolean attribute skipped to each section's <OverallResults> as well as a skips count to each test cases' <OverallResult>. Note that a section that fails e.g. a CHECK and then SKIPs is counted as both failed and skipped. The top-level overall results now also report a skips count. Other than that, I've introduced a new <Skip> node, analogous to Failure etc.
  • Automake Automake has support for SKIP results, adding that was simple.
  • JUnit I found two different schemas for the JUnit format online. According to this one, there's only one <skipped> / <failure> / <error> allowed per <testcase>. However, since the current reporter produces multiple of them, I assume it instead follows this schema, which allows any number of <failure> and <error>. Unfortunately that one only allows at most one <skipped>, which is a problem if a test case is both skipped and marked as !mayfail. I've decided to not handle this explicitly, instead producing two <skipped> nodes - let's see if someone complains. The same is true for test cases that skip multiple times for different generator values. See for example the test case "dynamic skipping works with generators". The test case "failing in some unskipped sections causes entire test case to fail" behaves differently for the JUnit reporter, since individual sections are reported as separate test cases, so in this case we'll have one skipped and one failing "test". Btw, there seems to be an inconsistency here, because tests with generators still cause the "tests" count on <testsuite> to be incremented by one for each value, but only a single <testcase> is written.
  • SonarQube As mentioned before, SonarQube doesn't provide a proper schema for their format -- only an example -- and from what I can tell, the existing reporter is not conformant to that. No idea if there can be multiple <skipped> nodes per <testCase>, but analogous to JUnit, that's what we have now.
  • TAP TAP simply prints a list of all assertions, so for SKIPs I'm printing an ok result followed by the # SKIP directive.
  • TeamCity I'm using the testIgnored message, as suggested. I'm not sure however what it means when a test both has a testFailed and testIgnored message, i.e., whether the test will be considered as failed or ignored in that case. I'm currently outputting both when a test case has e.g. a failing CHECK followed by a SKIP (for example in "failed assertions before SKIP cause test case to fail").

@horenmar
Copy link
Member

horenmar commented Dec 21, 2022

Nice, just yesterday I was thinking that when I find some time, I should finish this PR.

Some preliminary notes

  • For TeamCity, I don't see any of the testIgnored messages in the approval tests output. I have no idea what happens when a test both fails and is skipped either, but I think this can be solved later.
  • JUnit: There is already an open issue about multiple fails reported if there are multiple failing CHECKs in a test case. My answer for that was "I might fix it, but use REQUIREs in the meantime if you need it", and I think this is the correct answer here as well.
  • XML reporter: the skipped attribute does not get written as bool, due to ScopedElement handling this badly. I'll fix it in the main branch and you can pick/rebase the commit here. Nevermind on this one, I misread the approval tests.

@psalz
Copy link
Contributor Author

psalz commented Dec 22, 2022

For TeamCity, I don't see any of the testIgnored messages in the approval tests output. I have no idea what happens when a test both fails and is skipped either, but I think this can be solved later.

Right, it seems that's due to the approval test script being a bit too overzealous in stripping out file paths though. You'll notice that the existing TeamCity baselines already don't include any testIgnored lines, or testFailed, for that matter.

@horenmar
Copy link
Member

horenmar commented Jan 1, 2023

Hmmm, you are right. Should be fixed on devel now 😃

@horenmar
Copy link
Member

horenmar commented Jan 1, 2023

I looked at the SonarQube reporter. Do you think the current reporter does not obey the example outside the extended result tag?

It was originally contributed by someone working with SonarQube, so I assume that works 🤷


Otherwise I think this is done after a rebase for the new approvals. (nevermind, docs need to be done first)

psalz and others added 8 commits January 9, 2023 16:06
This adds a new `SKIP` macro for dynamically skipping tests at runtime.
The "skipped" status of a test case is treated as a first-class citizen,
like "succeeded" or "failed", and is reported with a new color on the
console.
Also extend skip tests to cover a few more use cases.
This isn't great, but is better than the deep blue that was borderline
invisible on dark backgrounds. The fix is to redo the colouring
a bit, including introducing light-blue that is actually visible.
@psalz psalz force-pushed the runtime-skipping branch from a411826 to c1e522a Compare January 9, 2023 15:07
@psalz
Copy link
Contributor Author

psalz commented Jan 9, 2023

I've rebased onto current devel and added some documentation. I've also deprecated IEventListener::skipTest by adding a comment to the function, as well as adding it to docs/deprecations.md. Should it also be deprecated in code, e.g. through a @deprecated comment, or even a [[deprecated]] attribute?

I've also reformatted Skip.tests.cpp to be in accordance with the new clang-format settings. I did not update the other code changes (e.g. to reporters), as the corresponding files all still seem to better match the old style.

I looked at the SonarQube reporter. Do you think the current reporter does not obey the example outside the extended result tag?

No I think then it's fine, I was referring to the extended result tag.

docs/deprecations.md Outdated Show resolved Hide resolved
docs/skipping.md Outdated Show resolved Hide resolved
@psalz psalz marked this pull request as ready for review January 10, 2023 12:25
@horenmar
Copy link
Member

I think this is done. I wrote some follow ups into my TODO list, e.g.

Should it also be deprecated in code, e.g. through a @deprecated comment, or even a [[deprecated]] attribute?

Yes, but it needs to be configurable, so the deprecation can be disabled via preprocessor, letting the users deal with this when they can. I'd prefer that to be done later, especially since there is more stuff in Catch2 that will get this treatment.


Do you want to clean up the commits, or are you fine with me squashing the PR?

@psalz
Copy link
Contributor Author

psalz commented Jan 12, 2023

Fine with me, squash away! :)

@horenmar horenmar merged commit d548be2 into catchorg:devel Jan 12, 2023
@psalz psalz deleted the runtime-skipping branch January 12, 2023 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow skipping tests at run-time.
3 participants