[cartesian] K-offset writes bound check is failing #1754

Open
2 tasks
FlorianDeconinck opened this issue Nov 27, 2024 · 4 comments

Comments

@FlorianDeconinck
Contributor

FlorianDeconinck commented Nov 27, 2024

Consider the test column_physics_conditional in tests/cartesian_tests/integration_tests/multi_feature_tests/test_code_generation.py. A fix in https://github.com/GridTools/gt4py/pull/1791 moved the interval to interval(1,-1) in line with the access and the domain defined.

But a previous version had the interval at interval(1, None), which leads to a race condition and an out-of-bounds (OOB) access.

The stencil mixes variable offsets in K, where OOB can't be detected (see #1684), with scalar offsets, where it should be detected.

  • Reproduce with a smaller, simpler example
  • Enforce bounds or error out
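
For the scalar-offset case, the check that should have fired can be sketched in plain Python. This is only an illustration, not gt4py code: the helper names `interval_levels` and `check_scalar_k_write`, the column height `nk = 10`, and the write offset `+1` are all assumptions made for the example. It treats the interval bounds like a Python slice (None means "up to the top", negative values count from the top).

```python
from typing import Optional


def interval_levels(start: int, end: Optional[int], nk: int) -> range:
    """K levels covered by interval(start, end) on a column of nk levels.

    Hypothetical helper: bounds follow slice conventions, so None means
    "up to the top" and negative values count from the top.
    """
    lo = start if start >= 0 else nk + start
    if end is None:
        hi = nk
    else:
        hi = end if end >= 0 else nk + end
    return range(lo, hi)


def check_scalar_k_write(start: int, end: Optional[int], offset: int, nk: int) -> None:
    """Fail hard if a write at [0, 0, offset] leaves the column.

    Hypothetical validation: every level k in the interval must satisfy
    0 <= k + offset < nk, otherwise a ValueError is raised.
    """
    bad = [k for k in interval_levels(start, end, nk) if not 0 <= k + offset < nk]
    if bad:
        raise ValueError(
            f"write at K offset {offset:+d} is out of bounds for levels {bad} "
            f"(column has {nk} levels)"
        )


nk = 10  # assumed column height, for illustration only

# interval(1, -1) with a +1 write: the highest target level is nk - 1, so this passes.
check_scalar_k_write(1, -1, offset=1, nk=nk)

# interval(1, None) with a +1 write: level nk - 1 would write to level nk.
try:
    check_scalar_k_write(1, None, offset=1, nk=nk)
except ValueError as err:
    print(err)
```

A variable offset cannot be checked this way because its value is only known at runtime, which is the part tracked in #1684.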

Original ticket:

See the failure on gt:gpu here: https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4455690602105886/4525297225819146/-/jobs/8381258008

The translate test is test_K_offset_write_conditional; it should be modified once a true fix is in place.

Temporary fix to deactivate the feature: #1755

@FlorianDeconinck
Contributor Author

PR for temp fix: #1755

FlorianDeconinck added a commit that referenced this issue Nov 28, 2024
Following the issue logged as
#1754 we are deactivating the
K-offset write feature until we can figure out why it's failing.

I will monitor any activity on the ticket if users are hit by this.

---------

Co-authored-by: Hannes Vogt <hannes@havogt.de>
@edopao
Contributor

edopao commented Nov 28, 2024

@FlorianDeconinck
Contributor Author

Sorry @edopao, that's a mistake on my end. Fixing that ASAP.

@FlorianDeconinck
Contributor Author

Fixing the utest in #1791

The underlying problem is that interval(1, None) should have failed hard on argument validation.

FlorianDeconinck changed the title from "[cartesian] K-offset writes fail on GPU" to "[cartesian] K-offset writes bound check is failing" on Jan 9, 2025
havogt pushed a commit that referenced this issue Jan 10, 2025
The `interval` analysis in the unit test
`test_K_offset_write_conditional` fails to catch a mistake in the code
that leads to a race condition.

Work:
- Fix the bad interval
- Remove the unneeded restriction on CUDA version

Further work to fix the underlying problem and the larger issue of bound
check on variable indexing is covered
[here](#1684) and
[there](#1754)
2 participants