-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Huge performance hit & termination due to high memory usage when using the JUnit reporter #2897
Comments
I'm currently editing as continuing to extend my tests just resulted in CI failures due to killed processes. |
Discounting any other allocations, that would come out to about 4.6kB per assertion, which is large, but not impossibly so, especially since the peak memory usage is going to be when the underlying storage vector is reallocated, at which point the memory usage is up to 3x the quiescent state (1x existing storage, 2x the new storage). The issue with JUnit format is that it cannot be written as the results are processed, but has to be written after all the tests are ran... and to do that, it has to store all required metadata. This means storing the metadata of every assertion that happened -> looking at how the CumulativeReporterBase stores the data, it needs 480B per assertion if no further data are stored there. Thankfully it already skips expanding the assertion message because JUnit reporter does not ask for message in passing assertions. After looking into it further, it might be possible to have JUnit skip storing passing assertion, as it currently does not do anything with them, even when the user specifies If someone feels like making that optimization they are free to. But for your personal use case, you should not be having that many assertions to compare two vectors; use matchers. |
Given `TEST_CASE` that just runs many passing assertions, e.g. ``` TEST_CASE( "PERFORMANCE_TEST_CASE", "[.]" ) { for (size_t i = 0; i < 1000'000'000; ++i) { REQUIRE( i == i ); } } ``` the new JUnit reporter will finish in 5:47, using up ~7.7 MB of RAM. The old JUnit reporter would finish in 0:30, due to bad_alloc after using up 14.5 GB of RAM (the system has 16 GB total). If the total number of assertions is lowered to 10M, the old implementation uses up ~ 7.1 GB of RAM and finishes in 12 minutes. The new implementation still needs ~7.7 MB of RAM, and finishes in 4 minutes. There is a slight downside in that the output is slightly different; the new implementation will include the `TEST_CASE` level section in output, even if it does not have its own assertion. In other words, given a `TEST_CASE` like this ``` TEST_CASE( "JsonWriter", "[JSON][JsonWriter]" ) { std::stringstream stream; SECTION( "Newly constructed JsonWriter does nothing" ) { Catch::JsonValueWriter writer{ stream }; REQUIRE( stream.str() == "" ); } SECTION( "Calling writeObject will create an empty pair of braces" ) { { auto writer = Catch::JsonValueWriter{ stream }.writeObject(); } REQUIRE( stream.str() == "{\n}" ); } } ``` the new implementation will output 3 `testcase` tags, 2 for the explicit `SECTION`s with tests, and 1 for the top level section. However, this can be worked-around if required, and the performance improvement is such that it is worth changing the current output, even if it needs to be fixed in the future. Closes #2897
Hey, thanks for your reply and the general effort with Catch2. The change you merged fixes the problem. Did some more digging and now understood how the matching of vectors works. Hadn't understood that in the first run. So it was faster with the patch but now I corrected the initial mistake too. |
Describe the bug
I recently upgraded my C++ projects unit tests by adding a JUnit reporter to use the resulting XML reports in gitlab for test result reports. While the gitlab part works well, I discovered, that adding the JUnit reporter increases the runtime of some of the tests in a really massive way. What normally takes < 10s including all the overhead of managing my environment in the gitlab executor takes - depending on the platform - up to around 4 minutes. The only change being the additional
--reporter "JUnit::out=/tmp/catch2-xml-report"
.Edit:
Additionally, the memory usage also explodes as I just found out. My CI test process, as I was extending my test cases, just "exploded". Meaning it was killed due to memory usage. I found hints to use GNU time to measure not only time but memory information. From what I got the interesting line is the following, first without the JUnit reporter, second with (this is from my actual project):
For the sample project:
Expected behavior
The test running with only minimal overhead. At least for the success case. In the end the report only contains more or less the same information as the command line (plus the name of each test & its runtime, but that shouldn't be too costly, right?). The file (in the success case) is just 448 characters.
Reproduction steps
I build a small sample project showing the problem. It uses conan to get Catch2 but I guess it could easily be included otherwise, conan is just the way I'm used to now: https://github.com/irieger/cpp-playground/blob/main/catch2-speed-sample/
The relevant file is
catch2-speed-sample/src/catch2_many_assertions.cpp
. It "simulates" a common use case - at least for me - where I have a large vector holding image data comparing an input processed through an algorithm with a reference, often with an accepted difference.This results in quite some millions of assertions. In this case I simulate a 30 Megapixel RGB image. Normally I have much smaller tests but also quite a few of those resulting in some million assertions in sum too.
Running this either directly (or with manually set console output) vs. a JUnit reporter shows the difference. Here is the output of running 10 times for each showing that the timing is overall consistent.
Console only, very consistent ~1.2s overall time for execution per run. Output piped to /dev/null as there intentionally are some assertions failing here:
Run with JUnit reporter, time per execution ~24.4s:
If I change it to succeed, it takes ~1.1s with the console output & ~18s
Platform information:
The measurements and system information above are from my workstation. In my CI I can reproduce the behaviour for Linux, Mac (for arm64 native and x64 cross-compiled run in Rosetta 2) and Windows, although the performance hit isn't exactly the same, in all cases the reporter is several times slower. In the end it is roughly 10-30 times slower.
Additional context
The reporter is a really nice addition and I was happy to find it. But blocking a CI runners for 240s instead of 10s is really a huge hit. Hope there is a way to prevent that.
Maybe an alternative if that is performance relevant could be to have a flag to only report the number of successes/faild in the XML, not report each with a line in the XML? Maybe this is the overhead?
The text was updated successfully, but these errors were encountered: