Timeout functional tests #449

kgaughan · 2024-10-28T22:30:10Z

If the functional tests don't exit after five seconds of waiting on them, they should be killed. This also uses pytest-timeout to impose a maximum runtime of 30 seconds on any test in the test suite.

Hopefully this should prevent tests from hanging.
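
For illustration, the "wait five seconds, then kill" behaviour described above could look roughly like the following. This is a minimal sketch, not the code from this PR: it assumes the fixture is a subprocess.Popen child, and the helper name stop_child is hypothetical.

```python
import subprocess
import sys


def stop_child(proc, timeout=5.0):
    """Wait up to `timeout` seconds for `proc` to exit, killing it if it does not.

    Returns True if the child exited on its own, False if it had to be killed.
    `proc` is assumed to be a subprocess.Popen instance.
    """
    try:
        proc.wait(timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        proc.kill()
        proc.wait()  # reap the killed child so it does not linger as a zombie
        return False


if __name__ == "__main__":
    # A child that would otherwise run for a minute is killed after one second.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print("exited cleanly:", stop_child(child, timeout=1.0))
```

The return value makes it possible for a test teardown to report that a child had to be killed rather than hiding it.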


kgaughan · 2024-10-28T23:00:23Z

Closer. The functional tests look to be completing, but pytest (or tox) looks to get stuck, so the final step in the workflow doesn't exit. My best guess is that there's a wait or join being executed in the atexit handler that's preventing the process from exiting.

kgaughan · 2024-10-28T23:12:19Z

I just reran the failed workflows, and they succeeded. I guess it's at least better that the tests themselves aren't hanging, but I'm not at all sure why the process sometimes doesn't exit like it should.

digitalresistor · 2024-10-28T23:59:24Z

Not a fan of this change. I feel like it papers over waitress not exiting cleanly. I want tests to hang.

The GitHub Actions runners have been slow and having issues for the last couple of days; locally these tests always succeed.

kgaughan · 2024-10-29T01:20:39Z

In that case, I can drop the changes to tests/test_functional.py and keep the timeout on the test: it should fail on the offending test while it's waiting on the join() and wait() calls to return. Sound good?

digitalresistor · 2024-10-29T01:38:04Z

What do we gain? I am not sure I fully understand what the point of it is today, with the current test suite. The extra dependency is one more package to download and install across all CI runs. While it may be small, it adds up, and I try to be cognizant of the overhead of adding yet another thing that can cause issues because it is not maintained or not updated as quickly as other parts of the ecosystem.

If the functional tests hanging were a bigger issue, and one that could be triggered regularly, I wouldn't be opposed to adding extra security against it.

kgaughan · 2024-10-29T01:45:56Z

What I was originally trying to do was see whether it would reveal which of the functional tests was the most problematic when it came to hanging, in the hope of making the test suite more reliable. With the timeout in place, they seemed to fail most consistently on EchoTests and TooLargeTests.

If you don't see it as worthwhile, I can close the PR though.

digitalresistor · 2024-10-29T02:04:11Z

Where do you see them hanging? I am trying to understand better, because I too want the test suite not to fail, and I want to understand the issues and fix them.

While working on fixes for some of the security issues lately, I have rerun the test suite tens of times and haven't seen it hang at all. My test suite runs have been ~14 seconds (with coverage enabled; a couple of seconds shorter without coverage).

digitalresistor · 2024-10-29T02:09:52Z

I will note that there seems to be an issue with GitHub Actions: the test suite there has historically run in about 60 seconds at most, and now I am seeing runs that take a couple of minutes apiece, or even longer.

I don't know how to account for those changes in runtime: it's the same code base and the same tests.

kgaughan · 2024-10-29T10:10:50Z

Yep, it confused me too. It hangs for me randomly from time to time in the middle of the functional tests. I couldn't detect a pattern to it, hence why I tried cleaning up the fixture subprocesses more aggressively.

simonk52 · 2024-11-15T10:44:39Z

> It hangs for me randomly from time to time in the middle of the functional tests

Just to add another data point, I see the same thing. I'm on Ubuntu 24.04, running tox with Python 3.12 and 3.13, with waitress at commit 23ac524.

I just ran tox in a loop and it completed successfully the first 2 times but hung on the third attempt, during test_functional.py. I hit Ctrl-C after a few minutes:

platform linux -- Python 3.12.3, pytest-8.3.3, pluggy-1.5.0
cachedir: .tox/py312/.pytest_cache
rootdir: /home/nfhd78/Downloads/waitress
configfile: setup.cfg
testpaths: tests
plugins: cov-6.0.0
collected 797 items                                                                                                                                                                                                                                                      

tests/test_adjustments.py .................................................                                                                                                                                                                                        [  6%]
tests/test_buffers.py ....................................................                                                                                                                                                                                         [ 12%]
tests/test_channel.py ..........................................................................................................................                                                                                                                   [ 27%]
tests/test_functional.py .....................................................................^CROOT: [1997110] KeyboardInterrupt - teardown started
ROOT: interrupt tox environment: py312
ROOT: requested interrupt of 1997666 from 1997110, activate in 0.00
ROOT: send signal SIGINT(2) to 1997666 from 1997110 with timeout 0.30
ROOT: send signal SIGTERM(15) to 1997666 from 1997110 with timeout 0.20
ROOT: interrupt finished with success
.pkg: interrupt tox environment: .pkg
py312: exit -15 (726.06 seconds) /home/nfhd78/Downloads/waitress> python -mpytest pid=1997666
  lint: OK (13.31=setup[0.04]+cmd[0.28,1.88,5.35,5.27,0.49] seconds)
  py39: SKIP (0.06 seconds)
  py310: SKIP (0.05 seconds)
  py311: SKIP (0.05 seconds)
  py312: FAIL code -15 (731.47=setup[5.41]+cmd[0.00,726.06] seconds)
  py313: FAIL code -3 (0.01 seconds)
  pypy39: FAIL code -3 (0.01 seconds)
  pypy310: FAIL code -3 (0.01 seconds)
  coverage: FAIL code -3 (0.01 seconds)
  docs: FAIL code -3 (0.01 seconds)
  evaluation failed :( (745.03 seconds)

The fourth run hung in a similar way, although after a different number of tests within test_functional.py, so I don't think it's the same test hanging each time:

platform linux -- Python 3.12.3, pytest-8.3.3, pluggy-1.5.0
cachedir: .tox/py312/.pytest_cache
rootdir: /home/nfhd78/Downloads/waitress
configfile: setup.cfg
testpaths: tests
plugins: cov-6.0.0
collected 797 items                                                                                                                                                                                                                                                      

tests/test_adjustments.py .................................................                                                                                                                                                                                        [  6%]
tests/test_buffers.py ....................................................                                                                                                                                                                                         [ 12%]
tests/test_channel.py ..........................................................................................................................                                                                                                                   [ 27%]
tests/test_functional.py ...................................................................................^CROOT: [1999983] KeyboardInterrupt - teardown started
ROOT: interrupt tox environment: py312
ROOT: requested interrupt of 2000537 from 1999983, activate in 0.00
ROOT: send signal SIGINT(2) to 2000537 from 1999983 with timeout 0.30
/home/nfhd78/Downloads/waitress/.tox/py312/lib/python3.12/site-packages/_pytest/main.py:337: PluggyTeardownRaisedWarning: A plugin raised an exception during an old-style hookwrapper teardown.
Plugin: _cov, Hook: pytest_runtestloop
KeyboardInterrupt: 
For more information see https://pluggy.readthedocs.io/en/stable/api_reference.html#pluggy.PluggyTeardownRaisedWarning
  config.hook.pytest_runtestloop(session=session)
Process ForkProcess-84:
Traceback (most recent call last):
ROOT: send signal SIGTERM(15) to 2000537 from 1999983 with timeout 0.20
ROOT: interrupt finished with success
.pkg: interrupt tox environment: .pkg
py312: exit -15 (338.49 seconds) /home/nfhd78/Downloads/waitress> python -mpytest pid=2000537
  lint: OK (13.41=setup[0.04]+cmd[0.25,1.82,5.31,5.49,0.49] seconds)
  py39: SKIP (0.05 seconds)
  py310: SKIP (0.05 seconds)
  py311: SKIP (0.07 seconds)
  py312: FAIL code -15 (343.95=setup[5.45]+cmd[0.00,338.49] seconds)
  py313: FAIL code -3 (0.01 seconds)
  pypy39: FAIL code -3 (0.01 seconds)
  pypy310: FAIL code -3 (0.01 seconds)
  coverage: FAIL code -3 (0.01 seconds)
  docs: FAIL code -3 (0.01 seconds)
  evaluation failed :( (357.65 seconds)

In total I ran tox 7 times, and it hung on 5 of those runs.
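
A loop like the one described above can be scripted so that a hung run is cut off and counted automatically instead of needing Ctrl-C. This is only a sketch under assumptions: count_hangs is a hypothetical helper, and the command and timeout in the usage note are placeholders to adjust.

```python
import subprocess
import sys


def count_hangs(cmd, runs, timeout):
    """Run `cmd` `runs` times and return how many runs exceeded `timeout` seconds.

    A run that finishes within the timeout is not counted, whatever its exit code.
    """
    hangs = 0
    for _ in range(runs):
        try:
            # subprocess.run kills the direct child when the timeout expires,
            # then re-raises TimeoutExpired.
            subprocess.run(cmd, timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            hangs += 1
    return hangs


if __name__ == "__main__":
    # Stand-in command that always "hangs" for longer than the timeout.
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(count_hangs(sleeper, runs=2, timeout=0.5), "of 2 runs hung")
```

Something like count_hangs(["tox", "-e", "py312"], runs=7, timeout=300) would approximate the experiment above. Note that only the direct child is killed on timeout, so processes started by tox itself may be left running and need cleaning up separately.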

kgaughan · 2024-11-15T11:13:05Z

@simonk52 Delta's doing some testing elsewhere that 🤞 might render this PR mostly moot.

digitalresistor · 2024-11-15T20:15:09Z

It's due to a hack that we have had in test_functional.py which adds a signal handler for SIGTERM, which breaks on newer versions of coverage. See #455

#454 contains some test runs I did with various coverage versions. The change was introduced in 7.5.4 which also enabled running on Py3.13 without the GIL.
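
For readers unfamiliar with the pattern: by default, SIGTERM ends a Python process without running atexit hooks, so a handler is sometimes installed to turn the signal into a normal interpreter exit. The following is a generic sketch of that pattern, not the code that was removed from test_functional.py, and the function names are hypothetical.

```python
import signal
import sys


def exit_on_sigterm(signum, frame):
    """Turn SIGTERM into SystemExit so normal interpreter shutdown (atexit) runs."""
    sys.exit(0)


def install_sigterm_exit():
    """Install the handler and return the previous one so it can be restored.

    signal.signal() may only be called from the main thread.
    """
    return signal.signal(signal.SIGTERM, exit_on_sigterm)


if __name__ == "__main__":
    previous = install_sigterm_exit()
    print("handler installed:", signal.getsignal(signal.SIGTERM) is exit_on_sigterm)
    # Restore whatever was there before (fall back to the default action).
    signal.signal(signal.SIGTERM, previous if previous is not None else signal.SIG_DFL)
```

A handler like this is process-wide, so it can interact with any other library that also relies on signals or on exit-time behaviour.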

digitalresistor · 2024-11-16T19:39:26Z

I've merged the removal of the hack, so I am closing this PR.

kgaughan · 2024-11-16T19:44:36Z

Thanks! Nice that the mystery is solved!

Timeout functional tests

79ce1df

If the functional tests don't exit after five seconds of waiting on them, they should be killed. This also uses pytest-timeout to impose a maximum runtime of 30 seconds on any test in the test suite. Hopefully this should prevent tests from hanging.

kgaughan mentioned this pull request Oct 28, 2024

Lean on pkgutil.resolve_name for importing apps #446

Merged

Only close the process if it's no longer alive: terminate if joining …

ba83320

…timed out

Let the functional tests hang rather than killing off the subprocess

1b6bbcd

Merge branch 'Pylons:main' into timeout-functional-tests

672fb19

digitalresistor closed this Nov 16, 2024

kgaughan deleted the timeout-functional-tests branch November 16, 2024 19:44
