CI: Python 3.10 on ubuntu-latest issue #851
Comments
* use Ubuntu 22.04 LTS for Python 3.10 testing to avoid mysterious issues described in gh-851
Ah, I can reproduce locally now when I use
To clarify, I could only reproduce on the current version of the feature branch in gh-839. Because this is quite annoying (hang with no feedback), I decided to
--- a/darshan-util/pydarshan/darshan/backend/cffi_backend.py
+++ b/darshan-util/pydarshan/darshan/backend/cffi_backend.py
@@ -732,7 +732,16 @@ def log_get_bytes_bandwidth(log_path: str, mod_name: str) -> str:
# in the old perl-based summary reports
darshan_derived_metrics = log_get_derived_metrics(log_path=log_path,
mod_name=mod_name)
- total_mib = darshan_derived_metrics.total_bytes / 2 ** 20
+ if mod_name == "MPI-IO":
+ # for whatever reason, this seems to require
+ # total_bytes reported from POSIX to match the
+ # old perl summary reports
+ darshan_derived_metrics_posix = log_get_derived_metrics(log_path=log_path,
+ mod_name="POSIX")
+ total_mib = darshan_derived_metrics_posix.total_bytes / 2 ** 20
+ else:
+ total_mib = darshan_derived_metrics.total_bytes / 2 ** 20
+
total_bw = darshan_derived_metrics.agg_perf_by_slowest
ret_str = f"I/O performance estimate (at the {mod_name} layer): transferred {total_mib:.1f} MiB at {total_bw:.2f} MiB/s"
return ret_str
diff --git a/darshan-util/pydarshan/darshan/tests/test_cffi_misc.py b/darshan-util/pydarshan/darshan/tests/test_cffi_misc.py
index 86bcbf8c..92060da4 100644
--- a/darshan-util/pydarshan/darshan/tests/test_cffi_misc.py
+++ b/darshan-util/pydarshan/darshan/tests/test_cffi_misc.py
@@ -167,6 +167,9 @@ def test_log_get_generic_record(dtype):
("imbalanced-io.darshan",
"STDIO",
"I/O performance estimate (at the STDIO layer): transferred 1.1 MiB at 0.01 MiB/s"),
+ ("imbalanced-io.darshan",
+ "MPI-IO",
+ "I/O performance estimate (at the MPI-IO layer): transferred 101785.8 MiB at 101.58 MiB/s"),
("laytonjb_test1_id28730_6-7-43012-2131301613401632697_1.darshan",
"STDIO",
"I/O performance estimate (at the STDIO layer): transferred 0.0 MiB at 4.22 MiB/s"),
@@ -176,6 +179,18 @@ def test_log_get_generic_record(dtype):
("treddy_mpi-io-test_id4373053_6-2-60198-9815401321915095332_1.darshan",
"STDIO",
"I/O performance estimate (at the STDIO layer): transferred 0.0 MiB at 16.47 MiB/s"),
+ ("e3sm_io_heatmap_only.darshan",
+ "STDIO",
+ "I/O performance estimate (at the STDIO layer): transferred 0.0 MiB at 3.26 MiB/s"),
+ ("e3sm_io_heatmap_only.darshan",
+ "MPI-IO",
+ "I/O performance estimate (at the MPI-IO layer): transferred 290574.1 MiB at 105.69 MiB/s"),
+ ("partial_data_stdio.darshan",
+ "MPI-IO",
+ "I/O performance estimate (at the MPI-IO layer): transferred 32.0 MiB at 2317.98 MiB/s"),
+ ("partial_data_stdio.darshan",
+ "STDIO",
+ "I/O performance estimate (at the STDIO layer): transferred 16336.0 MiB at 2999.14 MiB/s"),
])
def test_derived_metrics_bytes_and_bandwidth(log_path, mod_name, expected_str):
# test the basic scenario of retrieving
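For reference, a minimal sketch of how the patched helper could be exercised against one of the test logs named above (the bare filename is a placeholder; in practice you would resolve the full path to the pydarshan test data first):

```python
from darshan.backend.cffi_backend import log_get_bytes_bandwidth

# With the patch above, the MPI-IO case draws total_bytes from the POSIX
# module so that the reported MiB total matches the old perl summary reports.
print(log_get_bytes_bandwidth(log_path="imbalanced-io.darshan",
                              mod_name="MPI-IO"))
# expected, per the parametrized test case added above:
# I/O performance estimate (at the MPI-IO layer): transferred 101785.8 MiB at 101.58 MiB/s
```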
Maybe related to two file handles and/or two calls to
Fixes darshan-hpc#851

* the testsuite now always uses `DarshanReport` with a context manager to avoid shenanigans with `__del__` and garbage collection/`pytest`/multiple threads (a usage sketch follows this list)
* this appears to fix the problem with testsuite hangs described in darshan-hpcgh-839 and darshan-hpcgh-851; I pushed this commit into darshan-hpcgh-839 recently, so if the CI there stops hanging with `3.10`, on top of my local confirmation, hopefully we're good to go on this annoyance
* if the fix is confirmed by the CI over there, I do suggest we encourage the use of `DarshanReport` with a context manager in our documentation; perhaps we could open an issue for that, and maybe look for cases in our source (beyond the tests) where we may also consider the switchover
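For illustration, a minimal sketch of the context-manager pattern referenced above (the log file name is a placeholder, and the printed attribute is just an example):

```python
import darshan

# Opening the report in a `with` block closes the underlying log handle
# deterministically on __exit__, rather than relying on __del__ being run
# by the garbage collector (the apparent trigger of the 3.10 hangs).
with darshan.DarshanReport("example.darshan", read_all=True) as report:
    print(list(report.modules.keys()))
# the log handle has already been released at this point
```

The same pattern would apply anywhere in the source (beyond the tests) that currently leans on implicit finalization of a `DarshanReport`.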
* the GitHub Actions infrastructure is progressively phasing in a newer base image of Ubuntu: https://github.blog/changelog/2022-11-09-github-actions-ubuntu-latest-workflows-will-use-ubuntu-22-04/
* that means we are somewhat randomly going to see images that lack Python `3.6` in the build cache, per: actions/setup-python#355 (comment)
* instead of pinning to an old version of Ubuntu in GHA for `3.6` support, let's just drop `3.6` from the testing matrix per: darshan-hpc#510 (comment) (it has been EOL for 1 year)
* there's also no reason to retain the special Python `3.10` handling where we use Ubuntu `22.04` for that case, for two reasons: 1) `22.04` is being rolled out as the new default anyway; 2) darshan-hpcgh-851, with the Python garbage collection issues in `3.10`, was resolved by using contexts, so the special treatment is no longer justified
As discussed in gh-830, there is a mysterious "hang" of the `pytest` suite with Python `3.10` and the `ubuntu-latest` GitHub Actions runner on that branch. Logging in to the runner via `ssh` allows execution via `pytest` as expected, so it is pretty unusual/weird. Furthermore, in tylerjereddy#23 I checked that `pytest` `6.2.5` and Python `3.10.0` had no effect on the hang (vs. the newer versions of those libs in use for the original hang).

I'll keep this issue open as a reference, but since I couldn't reproduce even in the runner itself when working interactively, nor in `act` with Python `3.10` and `ubuntu-latest`, there seems to be no sane way to debug this. It looks like switching to the latest Ubuntu LTS release, using the `ubuntu-22.04` tag, alleviates the issue, so I'll probably try applying that to Shane's PR and referencing this issue in a comment. Unless we get related user reports, I wouldn't suggest putting any more time into this one though...