
Test that runs really fast is indistinguishable from a filtered test in the test.xml. #3201

Closed
chaoren opened this issue Jun 15, 2017 · 26 comments
Labels: good first issue, P2 (We'll consider working on this in future. Assignee optional), type: feature request

Comments

@chaoren
Contributor

chaoren commented Jun 15, 2017

Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.

Description of the problem / feature request / question:

If asking a question or requesting a feature, also tell us about the underlying problem you're trying to solve.

Test that runs really fast is indistinguishable from a filtered test in the test.xml.

If possible, provide a minimal example to reproduce the problem:

$ cat BUILD
java_test(
    name = "FooBarTest",
    srcs = ["foo/bar/FooBarTest.java"],
    test_class = "foo.bar.FooBarTest",
)

$ cat foo/bar/FooBarTest.java
package foo.bar;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;

@RunWith(JUnit4.class)
public class FooBarTest {
  @Test
  public void testFoo() {}

  @Test
  public void testBar() {}
}

$ bazel test :FooBarTest && cat bazel-testlogs/FooBarTest/test.xml
INFO: Found 1 test target...
Target //:FooBarTest up-to-date:
  bazel-bin/FooBarTest.jar
  bazel-bin/FooBarTest
INFO: Elapsed time: 0.693s, Critical Path: 0.38s
//:FooBarTest                                                            PASSED in 0.4s

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
<?xml version='1.0' encoding='UTF-8'?>
<testsuites>
  <testsuite name='foo.bar.FooBarTest' timestamp='1970-04-19T04:54:55.738Z' hostname='localhost' tests='2' failures='0' errors='0' time='0.003' package='' id='0'>
    <properties />
    <testcase name='testBar' classname='foo.bar.FooBarTest' time='0.003' />
    <testcase name='testFoo' classname='foo.bar.FooBarTest' time='0.0' />
    <system-out />
    <system-err />
  </testsuite>
</testsuites>

$ bazel test :FooBarTest --test_filter=foo.bar.FooBarTest#testBar && cat bazel-testlogs/FooBarTest/test.xml
INFO: Found 1 test target...
Target //:FooBarTest up-to-date:
  bazel-bin/FooBarTest.jar
  bazel-bin/FooBarTest
INFO: Elapsed time: 0.758s, Critical Path: 0.39s
//:FooBarTest                                                            PASSED in 0.4s

Executed 1 out of 1 test: 1 test passes.
There were tests whose specified size is too big. Use the --test_verbose_timeout_warnings command line option to see which ones these are.
<?xml version='1.0' encoding='UTF-8'?>
<testsuites>
  <testsuite name='foo.bar.FooBarTest' timestamp='1970-04-19T04:55:02.061Z' hostname='localhost' tests='2' failures='0' errors='0' time='0.003' package='' id='0'>
    <properties />
    <testcase name='testBar' classname='foo.bar.FooBarTest' time='0.003' />
    <testcase name='testFoo' classname='foo.bar.FooBarTest' time='0.0' />
    <system-out />
    <system-err />
  </testsuite>
</testsuites>

In both cases the result of testFoo is just

<testcase name='testFoo' classname='foo.bar.FooBarTest' time='0.0' />

which makes it hard for us to display the result correctly in an IDE.

Environment info

  • Operating System: GNU/Linux

  • Bazel version (output of bazel info release): release 0.5.1

@meteorcloudy added the category: misc > testing, category: rules > java, type: feature request, and P2 (We'll consider working on this in future. Assignee optional) labels on Jun 16, 2017
@meteorcloudy
Member

@damienmg Is there a way to distinguish these two situations?

@damienmg
Contributor

I don't think so. /cc @iirina because that's from the Java test runner.

@chaoren
Contributor Author

chaoren commented Jun 19, 2017

Blaze sets an extra status='run' attribute on each test case; I'm wondering if Bazel could do the same.
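
For illustration, test.xml entries carrying such an attribute might look like this (a sketch only; the status='run' value comes from the Blaze behavior described above, while 'filtered' is my assumption for the complementary case):

<testcase name='testBar' classname='foo.bar.FooBarTest' time='0.003' status='run' />
<testcase name='testFoo' classname='foo.bar.FooBarTest' time='0.0' status='filtered' />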

@iirina
Contributor

iirina commented Jun 19, 2017

Right now the Ant XML writer doesn't indicate whether or not a test was run, but it can be changed to do so.

@kcooney

kcooney commented Aug 10, 2017

The XML schema used by Surefire is here:

http://maven.apache.org/surefire/maven-surefire-plugin/xsd/surefire-test-report.xsd

Possibly the <skipped> element could be used here. Surefire supports filtering tests on the command line (see http://maven.apache.org/surefire/maven-surefire-plugin/examples/single-test.html) so I suggest looking at what test XML they produce for this use case.
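
For reference, a skipped test case in that schema looks roughly like the sketch below (the message text is made up; the schema allows an optional message attribute on <skipped>):

<testcase name='testFoo' classname='foo.bar.FooBarTest' time='0'>
  <skipped message='excluded by test filter' />
</testcase>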

@hmemcpy

hmemcpy commented Jan 29, 2018

Hi @iirina, @chaoren,

I want to implement this using @kcooney's suggestion: emit the <skipped> element for tests that were actually ignored, assuming all others ran. For this to work on the IntelliJ side, a small change would be needed to remove the check for 'time' (see the referenced issues).

Do you think this is an acceptable solution?

@hmemcpy

hmemcpy commented Jan 31, 2018

Ok, after analyzing the situation further, I think I misunderstood the initial intent. It seems, as @chaoren reported, that test.xml contains all test cases, even those filtered out by --test_filter (this happens in both Java and Scala/specs2 tests).
I've reconsidered, and I don't think that skipped is a good solution here, because it changes the semantics: skipped should be reserved for tests that were @Ignored or whose assumptions failed. That doesn't seem to be supported at the moment anyway, because ignored tests don't even appear in the test model.

I think a much better solution would be to check whether a test case was filtered before adding it to the model. Test cases that were filtered out should not be added to the test model at all. WFYT?

@hmemcpy

hmemcpy commented Jan 31, 2018

Ok, even further digging. The issue, it seems to me, is with the TestSuiteModel, or rather with the Request that holds the runner. According to this comment, the Runner is called twice, before and after filtering. The problem is that once the TestSuiteModel is created initially (before filtering), it is memoized, so when JUnit4TestXmlListener later requests it, it gets the unfiltered model. This is why the output XML contains test cases that were filtered out!

Maybe there's a way to memoize the TestSuiteModel only after filtering?
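
One possible shape for that, as a generic sketch (all names here are hypothetical, not the runner's actual code): wrap the supplier so it rebuilds the model until filtering is marked complete, and only caches afterwards.

import java.util.function.Supplier;

// Hypothetical sketch: memoize the delegate's value only once filtering is done.
class DeferredMemoizingSupplier<T> implements Supplier<T> {
  private final Supplier<T> delegate;
  private volatile boolean filteringDone; // flipped once --test_filter has been applied
  private volatile T cached;              // post-filter value, cached at most once

  DeferredMemoizingSupplier(Supplier<T> delegate) {
    this.delegate = delegate;
  }

  void markFilteringDone() {
    filteringDone = true;
  }

  @Override
  public T get() {
    if (!filteringDone) {
      return delegate.get(); // pre-filter: always rebuild, never cache
    }
    T result = cached;
    if (result == null) {
      synchronized (this) {
        if (cached == null) {
          cached = delegate.get(); // first post-filter call: memoize
        }
        result = cached;
      }
    }
    return result;
  }
}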

@hmemcpy

hmemcpy commented Jan 31, 2018

(last update, promise :))
I removed memoization from the creation of JUnit4TestModelBuilderFactory and its supplier, and this fixed the bug: the filtered-out test no longer appears:

bazel test --test_filter=com.example.FooBarTest#testBar //:FooBarTest

<?xml version='1.0' encoding='UTF-8'?>
<testsuites>
  <testsuite name='com.example.FooBarTest' timestamp='1970-01-02T02:23:21.548Z' hostname='localhost' tests='1' failures='0' errors='0' time='0.002' package='' id='0'>
    <properties />
    <testcase name='testBar' classname='com.example.FooBarTest' time='0.002' />
    <system-out />
    <system-err />
  </testsuite>
</testsuites>

The question is: how expensive is test model creation, and can the model be memoized only after the filters are applied?

@kcooney

kcooney commented Feb 1, 2018

@hmemcpy TestSuiteModel intentionally includes all tests, including the ones that were filtered (either via the command line or by the sharding process). We use this internally at Google to produce a different representation of the test results than the one you see in AntXmlResultWriter.

Having details about all of the tests can be very useful. For example, for a sharded test run you can combine the output from all shards into a consolidated view of the results, even though each shard has results for only some of the tests.

As for what to do with AntXmlResultWriter...

You can determine which tests were filtered by looking at the TestResult and calling getStatus(). The status values are defined here.

My guess is the fix will be around here.

The challenge with this bug is that the Ant/Maven XML test format isn't very flexible, so it can't represent all of the possible states that the Bazel JUnit runner supports. I don't know enough about the open source tools that process the XML to know what changes Bazel would need to make to the generated XML so that those tools can identify filtered tests. Maybe we just don't include them in the XML. Maybe we include them but add an attribute indicating that the test was not run. It's hard to know which to do without understanding the tools that process the XML.

This is why I suggested looking at what XML Maven Surefire produces for these kinds of use cases. If we match what Surefire does, we are more likely to work well with the tools that process the XML.

@hmemcpy

hmemcpy commented Feb 1, 2018

Kevin, thanks for the detailed answer!

To be sure, I ran the FooBarTest example with Maven Surefire, using mvn surefire:test -Dtest=com.example.FooBarTest#testBar, and sure enough, there was only one testcase entry in the XML: the Maven report contains only tests that were actually run. This suggests that the test.xml produced by Bazel can do the same and include only what actually ran.

Your suggestion about the possible fix sounds reasonable: fix this in the AntXmlResultWriter. I'll play around with it, thanks!

I was toying with the idea of allowing an external XML producer (#4507) instead of the AntXml one, so that different runners could provide their own format, but as you mentioned, we don't know what other tools rely on the Bazel-flavored XML report...

@kcooney

kcooney commented Feb 1, 2018

@hmemcpy sounds like you can update AntXmlResultWriter to not print out the testcase elements if the status is FILTERED. That will sometimes leave behind empty testsuite elements; not sure if you need to remove them from the output.

@hmemcpy

hmemcpy commented Feb 1, 2018

@kcooney in that case, the fix seems trivial: add a check in writeTestCases to skip those whose status is FILTERED. I wonder if there's anything else I'm not considering here?
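
Roughly like the following sketch (the loop and accessor names are my approximation of the runner's internals and may not match exactly; the FILTERED status check itself is the part discussed above):

for (TestResult child : result.getChildResults()) {
  // Tests excluded by --test_filter or by sharding never ran; leave them
  // out of test.xml entirely, matching what Surefire produces.
  if (child.getStatus() == TestResult.Status.FILTERED) {
    continue;
  }
  writeTestCase(writer, child);
}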

@kcooney

kcooney commented Feb 1, 2018

@hmemcpy if all test cases in a test class are filtered out should we emit a <testsuite> element for the test class?

@hmemcpy

hmemcpy commented Feb 1, 2018

Good question... I think so, since we'd still like to show it as an element in the IDE UI (IntelliJ); it just won't have any test cases.
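
In other words, an all-filtered class would presumably produce an empty suite entry along these lines (a sketch following the test.xml format shown earlier):

<testsuite name='foo.bar.FooBarTest' timestamp='...' hostname='localhost' tests='0' failures='0' errors='0' time='0.0' package='' id='0'>
  <properties />
  <system-out />
  <system-err />
</testsuite>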

@hmemcpy

hmemcpy commented Mar 8, 2018

We found a problem with this patch... it seems that tests that are added dynamically by the test runner (in our case, specs2 'examples' generated with Fragments.foreach) do not appear in the XML!
I'm investigating the reason (I suspect they are added to the model with the wrong initial state and never transition to STARTED).

In the meantime, and after a discussion with @ittaiz, I'd like to ask that this PR be reverted to prevent broken reports.
@philwo, is it possible to revert this before the next version is released?

@kcooney

kcooney commented Mar 8, 2018

As a maintainer of JUnit, I'd like to point out that I am not sure dynamically-created tests are officially supported for JUnit4-style tests (by "dynamically-created" I mean tests that don't exist until the test starts running). It may happen to work, but there is no framework support for it, and any extensions using test listeners (including some built-in rules) may behave in odd ways.

JUnit5 does support this for Jupiter-style tests.

@hmemcpy did you verify that dynamically-generated tests were included in the XML before Bazel had this change (and had the correct status on pass/fail)? Could you perhaps add a test to the XML-writing test suite that catches the regression, so Bazel doesn't break this in the future? Can we fix forward instead of rolling back?

@hmemcpy

hmemcpy commented Mar 8, 2018

@kcooney hey, I may have described it wrong by calling them "dynamically-created", but yes: before the check for FILTERED in the XML writer, all of them were added to the report. They are reported as started in JUnit4TestXmlListener, but they still appear as filtered when the test run completes... Still investigating.

@ittaiz
Member

ittaiz commented Mar 8, 2018

@kcooney @iirina we're dedicated to resolving this correctly, but we'd really appreciate a rollback for now, since we need to use 0.12.0-RC1 (which @philwo wants to cut today) and we won't be able to use it if this commit is in.
cc @helenalt

@kcooney

kcooney commented Mar 8, 2018

FYI, I am not on the Bazel team (I just worked on parts of the JUnit integration). Hopefully someone on the team will do a rollback.

@philwo
Member

philwo commented Mar 8, 2018

@hmemcpy @ittaiz I'll send a rollback for this and make sure that it will be part of the next release.

@ittaiz
Member

ittaiz commented Mar 8, 2018 via email

bazel-io pushed a commit that referenced this issue Mar 9, 2018
*** Reason for rollback ***

Rollback was requested by original authors @hmemcpy and @ittaiz in #3201:

"We found a problem with this patch... seems that tests that are added dynamically by the test runner (in our case, specs2 'examples' that are generated with Fragments.foreach) do not appear in the xml!"

This should be part of 0.12.0-rc1, otherwise that release will have the above mentioned regression.

*** Original change description ***

Skipping writing FILTERED tests to test.xml

This fixes #3201 by preventing tests that haven't actually run from being written to the test.xml. This is consistent with how e.g. Surefire reports work: tests that were filtered out do not appear in the XML.
This allows changing the Bazel plugin so that it does not depend on `time` being 0.0.

Closes #4596.

PiperOrigin-RevId: 188455315
@hmemcpy

hmemcpy commented Mar 19, 2018

After further analysis, I pinpointed the source of the problem: it has to do with the value used for hashCode in the Description objects produced by specs2 for tests generated in a loop (e.g. via the Fragments.foreach API). This causes Bazel to build the testMap incorrectly (because the hashCodes are the same, only one test actually appears in the model).
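
A minimal illustration of the collision (this sketch assumes only JUnit 4's documented Description semantics: descriptions created with the same class and method name compare equal, so a hash map keyed on them keeps a single entry):

import java.util.HashMap;
import java.util.Map;
import org.junit.runner.Description;

public class DescriptionCollision {
  public static void main(String[] args) {
    // Two generated examples that end up with the same display name, as a
    // loop-based generator can produce.
    Description first = Description.createTestDescription("FooSpec", "example");
    Description second = Description.createTestDescription("FooSpec", "example");
    System.out.println(first.equals(second)); // true: equality uses the display name

    Map<Description, String> testMap = new HashMap<>();
    testMap.put(first, "first example's result");
    testMap.put(second, "second example's result");
    System.out.println(testMap.size()); // 1: the second put overwrote the first
  }
}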

I've sent a PR to the author of specs2, so it should be fixed in future versions.

This PR is good to go as-is. Can you please reapply the commit, or should I send a new one? @philwo

@philwo
Member

philwo commented Mar 19, 2018

@hmemcpy No problem, I can just revert the revert automatically.

Just to make sure: in your research you found out that the original code in this PR was fine (the bug was in a third-party library), and thus we don't have to cherry-pick the revert into the 0.12.0 release anymore, right?

bazel-io pushed a commit that referenced this issue Mar 19, 2018
*** Reason for rollback ***

Rollforward as requested by the original author @hmemcpy.

*** Original change description ***

Skipping writing FILTERED tests to test.xml

This fixes #3201 by preventing tests that haven't actually run from being written to the test.xml. This is consistent with how e.g. Surefire reports work: tests that were filtered out do not appear in the XML.
This allows changing the Bazel plugin so that it does not depend on `time` being 0.0.

Closes #4596.

PiperOrigin-RevId: 189610180
@hmemcpy

hmemcpy commented Mar 20, 2018

@philwo yeah, the original PR was fine; we're good to have it in 0.12.0. Thanks again!

@kcooney

kcooney commented Mar 20, 2018

I believe Jukito can also create multiple identical Description values for parameterized methods, so some users will see issues there. I wonder how other build tools handle it; from JUnit's perspective it can't tell which result comes from which test.
