ci: Re-enable CI on GH200 #1653
Tödi seems to be back, so we could try to resume this PR without any specific partition/account or time limit.
@FlorianDeconinck We got a failure in a test case. I see that there is a check on the CUDA version in gt4py/tests/cartesian_tests/integration_tests/multi_feature_tests/test_code_generation.py (line 588 in 8040178).
Maybe this check is not enough?
Hey Enrique, I have a fix for that. The extent interval check is indeed broken for K, and the test itself is bad. I can PR a quick fix to the test, or you can change it as part of your PR:

 def column_physics_conditional(A: Field[np.float64], B: Field[np.float64], scalar: np.float64):
-    with computation(BACKWARD), interval(1, None):
+    with computation(BACKWARD), interval(1, -1):
         if A > 0 and B > 0:
             A[0, 0, -1] = scalar
             B[0, 0, 1] = A
         lev = 1
         while A >= 0 and B >= 0:
             A[0, 0, lev] = -1
             B = -1
             lev = lev + 1

This should fix the test. The CUDA version test was a previous misunderstanding of where the race condition could come from.
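To illustrate why narrowing the interval matters, here is a minimal plain-Python sketch (it does not use gt4py). It assumes that `interval(start, end)` selects vertical levels the way a Python slice would, and it only models the fixed-offset writes `A[0, 0, -1]` and `B[0, 0, 1]`, not the `while` loop with its growing `lev`. The helper names `k_levels` and `out_of_bounds_writes` are hypothetical and exist only for this example.

```python
def k_levels(start, end, nz):
    """Vertical indices covered by interval(start, end) on a column of nz levels.

    Assumption: the bounds behave like a Python slice, so None means
    "up to the top" and -1 means "stop one level below the top".
    """
    return list(range(nz))[start:end]


def out_of_bounds_writes(start, end, write_offsets, nz):
    """Return (k, offset) pairs whose write target k + offset leaves [0, nz)."""
    bad = []
    for k in k_levels(start, end, nz):
        for off in write_offsets:
            if not 0 <= k + off < nz:
                bad.append((k, off))
    return bad


nz = 5              # hypothetical column height
offsets = (-1, 1)   # K offsets written by the stencil: A[0, 0, -1] and B[0, 0, 1]

# interval(1, None) includes the top level, so the +1 write lands at index nz.
print(out_of_bounds_writes(1, None, offsets, nz))  # [(4, 1)]

# interval(1, -1) stops one level earlier, so both writes stay inside the column.
print(out_of_bounds_writes(1, -1, offsets, nz))    # []
```

Under these assumptions, the original interval lets the `+1` write reach past the last level, while the narrowed interval keeps every fixed-offset write inside the column.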
There is no hurry from our side; you can open a fix PR when you have time. I suspect that this test failure is flaky, since it only happens sometimes.
PR open here: #1791
Looks like the dace problem is still there, even on CUDA 12: https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4455690602105886/4525297225819146/-/jobs/8815984483
cscs-ci run
Alright, there have been enough failed attempts at fixing this - I'll PR a complete deactivation of the feature today and we will go back to the drawing board to figure out what we are clearly not understanding.
@@ -583,6 +583,11 @@ def test_K_offset_write(backend):
    if backend == "cuda":
        pytest.skip("cuda K-offset write generates bad code")

    if backend == "dace:gpu":
        pytest.skip(
@FlorianDeconinck I have deactivated the test case this way. I was not sure whether to refer to issue #1684 or #1754.
Thanks, January has been idiotically overbooked and I keep falling off. I'll take it from there, sorry for the delay again
Thanks @edopao
No description provided.