Unit tests to use a random port for the dashboard #5060
Conversation
crusaderky commented Jul 14, 2021 (edited)
- Drastically reduce the number of warning messages at the end of the test logs complaining that port 8787 is occupied (see the sketch below)
- Use gen_test / gen_cluster whenever possible
- Clean up most of the other warnings captured by pytest
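For illustration only, here is a minimal sketch of the pattern the first bullet refers to; the test name and the `inc` helper are made up for this example, but the idea is that `dashboard_address=":0"` asks the operating system for any free port instead of the default 8787:

```python
# Sketch, not part of this diff: a test that requests an ephemeral dashboard
# port so that parallel test runs don't compete for port 8787.
from distributed import Client, Scheduler, Worker
from distributed.utils_test import gen_test


def inc(x):
    return x + 1


@gen_test()
async def test_uses_random_dashboard_port():
    # ":0" lets the operating system pick any free port for the dashboard
    async with Scheduler(port=0, dashboard_address=":0") as s:
        async with Worker(s.address):
            async with Client(s.address, asynchronous=True) as c:
                assert await c.submit(inc, 1) == 2
```

Whether the address is set on the Scheduler directly or through the test harness defaults is an implementation detail of the actual PR.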
c.sync(f)

response = requests.get("http://127.0.0.1:8787/status/")
assert response.status_code == 404
response.raise_for_status()
404 is tested by test_no_dashboard below
@@ -4,12 +4,12 @@
This whole module is broken at least against the latest version of ucp. It needs bugfixing and CI integration.
cc @quasiben
cc also @pentschev
> This whole module is broken at least against the latest version of ucp. It needs bugfixing and CI integration.
Could you provide more details? How are you installing UCX-Py and UCX, and how are you launching the tests? What errors do you see?
We run those tests in https://github.com/rapidsai/ucx-py/blob/branch-0.21/ci/gpu/build.sh#L111 and they're currently passing here, where we test both UCX 1.9 and the master branch.
As for CI integration, this is in the scope of dask/community#138.
Apologies, I botched the package installation. Should've RTFM.
No worries, thanks for the update.
From the UCX side I can confirm these changes still work, so I'm approving on behalf of the UCX changes only and leaving the other changes for those with a better understanding than myself to judge.
assert cluster.scheduler_address in text
assert "cores=4" in text or "threads=4" in text
assert "4.00 GB" in text or "3.73 GiB" in text
assert "workers=2, threads=2, memory=2.00 GiB" in text
Drop backwards compatibility with dask < 2021.04.0
with Client(s["address"], loop=loop) as c:
    for func in funcs:
        text = func(c)
        assert c.scheduler.address in text
        assert "threads=3" in text or "Total threads: </strong>" in text
        assert "6.00 GB" in text or "5.59 GiB" in text
        assert "6.00 GiB" in text
Drop backwards compatibility with dask < 2021.04.0
0b2ced7 to a8ae24f (Compare)
CI output of a successful Ubuntu 3.7 run:
You are removing a lot of pytest-asyncio markers. What's the reasoning here? Should we drop pytest-asyncio generally? If it is about the cleanup fixture, we could probably define it as an autouse fixture.
I'm replacing them with gen_test, which features both cleanup and timeout. Unlike the timeout of pytest-asyncio, gen_test cleanly aborts a single test and can produce a stack trace. Auto-adding a cleanup fixture is a problem because (1) I suspect it could interact poorly with gen_cluster (to be tested) and (2) cleanup fails for some tests. While it is ideally desirable to have all tests decorated by cleanup, being forced to fix the underlying issue every time may prove impractical.
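As a rough before/after illustration of that swap (the test names and bodies are invented, not taken from this PR):

```python
# Hypothetical example of replacing a pytest-asyncio marker with gen_test.
import asyncio

import pytest

from distributed.utils_test import gen_test


# Before: plain pytest-asyncio, with no per-test timeout that aborts cleanly.
@pytest.mark.asyncio
async def test_example_before():
    await asyncio.sleep(0.1)


# After: gen_test applies a timeout that aborts just this test with a
# traceback, plus the cleanup checks mentioned above.
@gen_test(timeout=30)
async def test_example_after():
    await asyncio.sleep(0.1)
```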
My question mostly points to "should we abandon pytest-asyncio or do we see a future for it?". That's an interesting question for future code reviews, even if we still have remnants in the code base. Regarding timeouts in pytest-asyncio, see also pytest-dev/pytest-asyncio#216, which implements our gen_test timeout logic there directly.
We should not use pytest-asyncio for as long as it doesn't implement a timeout, and when it does, we could consider cleaning up a bunch of in-house stuff from distributed.utils_test. There are challenges there; namely:
The notable exception is parametrized tests and pytest fixtures. gen_cluster supports them, but gen_test doesn't (yet), so pytest-asyncio remains the only alternative there for the time being.
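To make that concrete, a small sketch (names and values invented) of stacking pytest.mark.parametrize on top of gen_cluster, which gen_test could not do at the time:

```python
# Sketch: gen_cluster composes with pytest.mark.parametrize; the parametrized
# argument arrives after the client, scheduler, and the two default workers.
import pytest

from distributed.utils_test import gen_cluster


@pytest.mark.parametrize("n", [1, 2])
@gen_cluster(client=True)
async def test_parametrized_increment(c, s, a, b, n):
    result = await c.submit(lambda x: x + 1, n)
    assert result == n + 1
```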
Ready for review and merge.
Thanks for the updates here @crusaderky. Generally they look good. I noticed there are seemingly more timeout-related test failures here than in other recent PRs. Is this something we should be concerned about?
@jrbourbeau Of the 6 failures you're talking about, 3 were potentially caused by my PR, while the other 3 seem to be unrelated.
That test output looks great! So much less scrolling.
I think that we should consider reverting this after #4928 gets in.

If I recall correctly we used to do random ports, but then we had lots of hard-to-explain random failures that came about because a scheduler/worker from a previous test managed to connect back to the scheduler/worker of a more recent test. This was hard to understand because of the randomness. We then started to use the same ports everywhere so that this problem was more obvious, and this then led us to the various cleanup testing infrastructure.

In general, my experience is that avoiding/working around problems in the distributed CI works, but sometimes leads to hard-to-track-down situations. Randomness is generally something that I think we want to avoid.
@mrocklin I assume that with "revert" you're just referring to the instances of dashboard_address=":0" throughout the test modules. This PR does a wealth of cleanup on top of that.
Yes, sorry, that's all I meant. I don't think that we should rely on random ports to hide any weirdness that our tests may generate. We should jump into and swim in that weirdness until it is fixed. I'm hopeful that the linked PR fixes that particular issue.
@jrbourbeau Is there anything outstanding for this?
FWIW I had a brief discussion with @jrbourbeau yesterday. My outstanding PR #4928 is unfortunately still not ready to fix all the port warnings. It should not block this PR, especially since there are other valuable cleanups.
I'll merge once the CI jobs are through.