Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-110722: Make -m test -T -j use sys.monitoring #111710

Merged
merged 7 commits into from
Nov 10, 2023

Conversation

ambv
Copy link
Contributor

@ambv ambv commented Nov 3, 2023

Now all results from worker processes are aggregated and displayed together as a summary at the end of a regrtest run. Below is a non-scientific benchmark of how much faster this is compared to the old way of doing things.

The traditional trace is left in place for use with sequential in-process test runs but now raises a warning that those numbers are not precise.

-T -j requires --with-pydebug as it relies on -Xpresite=.

Without coverage

== Tests result: SUCCESS ==

10 slowest tests:
- test_configparser: 1.4 sec

1 test OK.

With old trace.py-based coverage

== Tests result: SUCCESS ==

10 slowest tests:
- test_configparser: 16.0 sec

1 test OK.

With new -T -j1 coverage based on sys.monitoring

== Tests result: SUCCESS ==

10 slowest tests:
- test_configparser: 2.3 sec

1 test OK.

Final caveat

Not all tests in the suite currently play well with coverage. When I'm running with -Uall,extralargefile on an M1 Mac, I get:

14 tests failed:
    test.test_asyncio.test_tasks test_bz2 test_capi test_curses
    test_descr test_dis test_frame test_gc test_lzma test_monitoring
    test_opcache test_tracemalloc test_types test_zlib

429 tests OK.

This was true also with the old trace-based coverage gathering but it was so slow that probably nobody bothered to run coverage over the entire suite.

I will look into fixing some of those incompatibilities because most of them are about reference counts, like:

test test_lzma failed -- Traceback (most recent call last):
  File "/Volumes/RAMDisk/cpython/Lib/test/support/__init__.py", line 1093, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/RAMDisk/cpython/Lib/test/test_lzma.py", line 381, in test_refleaks_in_decompressor___init__
    self.assertAlmostEqual(gettotalrefcount() - refs_before, 0, delta=10)
AssertionError: 12 != 0 within 10 delta (12 difference)

📚 Documentation preview 📚: https://cpython-previews--111710.org.readthedocs.build/

@ambv ambv changed the title Make -m test -T -j use sys.monitoring gh-110722: Make -m test -T -j use sys.monitoring Nov 3, 2023
@ambv
Copy link
Contributor Author

ambv commented Nov 6, 2023

My hunch that nobody used to run a full suite with coverage seems to be corroborated by it not working, apparently. I was unable to get a full run of python -m test -T to complete on any version of Python between 3.8 and 3.13 on either Linux aarch64 or macOS M1.

So instead, I focused on bringing down the number of tests failing under coverage. With the latest change, tests already marked with @no_tracing are now temporarily disabling sys.monitoring events as well. This leaves 10 failing tests, all of which also failed with coverage before this PR. I will be fixing those in separate PRs because that's not related to the change in this PR.

10 tests failed:
    test_capi test_curses test_descr test_dis test_frame test_gc
    test_monitoring test_opcache test_tracemalloc test_types

In any case, unless anybody's got any strong opinions on this, this is ready to land.

@vstinner
Copy link
Member

I would prefer to also use test.cov when tests are run sequentially. Here is a patch to support it as well. It uses the new libregrtest code to "add options to Python":

diff --git a/Lib/test/libregrtest/cmdline.py b/Lib/test/libregrtest/cmdline.py
index a5f02d6335..cf3943dd5d 100644
--- a/Lib/test/libregrtest/cmdline.py
+++ b/Lib/test/libregrtest/cmdline.py
@@ -426,8 +426,6 @@ def _parse_args(args, **kwargs):
             ns.rerun = True
         ns.print_slow = True
         ns.verbose3 = True
-    else:
-        ns._add_python_opts = False
 
     # When both --slow-ci and --fast-ci options are present,
     # --slow-ci has the priority
@@ -448,16 +446,8 @@ def _parse_args(args, **kwargs):
 
     if ns.single and ns.fromfile:
         parser.error("-s and -f don't go together!")
-    if ns.trace:
-        if ns.use_mp is not None:
-            if not Py_DEBUG:
-                parser.error("need --with-pydebug to use -T and -j together")
-        else:
-            print(
-                "Warning: collecting coverage without -j is imprecise. Configure"
-                " --with-pydebug and run -m test -T -j for best results.",
-                file=sys.stderr
-            )
+    if ns.trace and not Py_DEBUG:
+        parser.error("need --with-pydebug to use -T")
     if ns.python is not None:
         if ns.use_mp is None:
             parser.error("-p requires -j!")
diff --git a/Lib/test/libregrtest/main.py b/Lib/test/libregrtest/main.py
index 296e1b4631..2a16969814 100644
--- a/Lib/test/libregrtest/main.py
+++ b/Lib/test/libregrtest/main.py
@@ -605,6 +605,8 @@ def _add_python_opts(self):
         environ, keep_environ = self._add_cross_compile_opts(regrtest_opts)
         if self.ci_mode:
             self._add_ci_python_opts(python_opts, keep_environ)
+        if self.coverage:
+            python_opts.extend(('-X', 'test.cov'))
 
         if (not python_opts) and (not regrtest_opts) and (environ is None):
             # Nothing changed: nothing to do

Copy link
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe test.coverage name would be more explicit than test.cov?

I would prefer to use the same coverage implementation in all cases.

Would it be possible to make presite feature available to release builds?

@vstinner
Copy link
Member

This leaves 10 failing tests, all of which also failed with coverage before this PR. I will be fixing those in separate PRs because that's not related to the change in this PR.

I suppose that a function can be added to test.support to check if code coverage is enabled, so we can skip lines of these tests which are impacted by coverage. Extract of test_descr:

            new_objects = len(gc.get_objects())
            self.assertEqual(orig_objects, new_objects)

Well, if a tracing tool runs in the background, such test can fail. So yeah, we should skip these lines, or just the whole test method.

@ambv
Copy link
Contributor Author

ambv commented Nov 10, 2023

Thanks for the review!

I'd prefer to change sequential -T runs to sys.monitoring in a future update because it's not as simple as your quick-and-dirty patch. By the time you add opts in _add_python_opts, libregrtest already imported a ton of libraries, the data is already ruined. Running -T sequentially with sys.monitoring will require -Xpresite= to be passed manually by the user, which I would prefer people don't have to do. So we would have to change sequential tests to execve or something with -Xpresite=.

Then there's the complication that -Xpresite is only available for Py_DEBUG builds because there were concerns this new hook could be misused by users. We can revisit adding it to production builds in the future but I don't want to block this PR on that discussion.

@vstinner
Copy link
Member

By the time you add opts in _add_python_opts, libregrtest already imported a ton of libraries, the data is already ruined.

This function replaces the current process with a new process which has the expected Python options: see _execute_python() method. On Unix, it uses the magic os.execv() function.

@vstinner
Copy link
Member

Then there's the complication that -Xpresite is only available for Py_DEBUG builds because there were concerns this new hook could be misused by users. We can revisit adding it to production builds in the future but I don't want to block this PR on that discussion.

It makes sense, ok. I don't want to block the PR, I just mean that "it would be nice" to use test.cov in all cases.

@ambv ambv merged commit 3932b0f into python:main Nov 10, 2023
@ambv ambv deleted the gh-110722-regrtest branch November 10, 2023 17:17
@ambv
Copy link
Contributor Author

ambv commented Nov 10, 2023

Thanks for reviews, Victor and Serhiy! ✨ 🍰 ✨

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…1710)

Now all results from worker processes are aggregated and
displayed together as a summary at the end of a regrtest run.

The traditional trace is left in place for use with sequential
in-process test runs but now raises a warning that those
numbers are not precise.

`-T -j` requires `--with-pydebug` as it relies on `-Xpresite=`.
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…1710)

Now all results from worker processes are aggregated and
displayed together as a summary at the end of a regrtest run.

The traditional trace is left in place for use with sequential
in-process test runs but now raises a warning that those
numbers are not precise.

`-T -j` requires `--with-pydebug` as it relies on `-Xpresite=`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants