Implement graceful error handling for allocation failures #88

bonachea · 2024-03-27T20:04:49Z

In the near term, Caffeine will likely treat most errors as immediately fatal (ideally printing a high-quality message as part of the crash).

However, one particularly important error that IMO should not get this treatment is memory allocation failure. Unlike hardware failure, out-of-memory is a common condition when scaling problems in real production science, and it needs to be handled robustly by a production-quality runtime. It's even plausible that some applications might perform non-trivial recovery from allocation failure.

`prif_allocate` and `prif_allocate_non_symmetric` currently ignore the possibility of errors, and I suspect they crash in obscure ways upon memory exhaustion. IMO these two calls should be fixed to strictly adhere to Fortran error-handling semantics, specifically wrt returning a meaningful `stat` and `errmsg` (when provided) or crashing with a useful console message (when not provided). Ideally the error message in either case should include status information about the initial and current state of the shared heaps, and recommendations to the end user about how to resolve the problem.
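To make the requested behavior concrete, here is a minimal C sketch of the pattern, not Caffeine's or PRIF's actual code. Everything named `sketch_*` or `SKETCH_*` is hypothetical, the fixed-size bump heap merely stands in for a shared heap, and the message is a NUL-terminated C string rather than a blank-padded Fortran `errmsg`. It assumes the usual Fortran convention: with `stat` provided, failure is reported through `stat` (and `errmsg` if also provided) and `errmsg` is left untouched on success; with `stat` absent, failure terminates with a console message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes; not PRIF or Caffeine constants. */
enum { SKETCH_STAT_OK = 0, SKETCH_STAT_ALLOC_FAILED = 1 };

/* Hypothetical fixed-size heap standing in for a shared heap. */
typedef struct {
    unsigned char *base;  /* start of the backing storage */
    size_t capacity;      /* initial size in bytes */
    size_t used;          /* bytes handed out so far */
} sketch_heap;

/*
 * Bump-allocate `size` bytes from `heap` (alignment is ignored for brevity).
 * stat != NULL: report the outcome through *stat and return NULL on failure.
 * stat == NULL: print a diagnostic and terminate on failure.
 */
void *sketch_allocate(sketch_heap *heap, size_t size,
                      int *stat, char *errmsg, size_t errmsg_len)
{
    assert(heap != NULL && heap->used <= heap->capacity);
    size_t avail = heap->capacity - heap->used;

    if (size <= avail) {
        void *p = heap->base + heap->used;
        heap->used += size;
        if (stat != NULL)
            *stat = SKETCH_STAT_OK;  /* errmsg is left untouched on success */
        return p;
    }

    /* Build one message that serves both the errmsg and the console path. */
    char msg[320];
    snprintf(msg, sizeof msg,
             "allocation of %zu bytes failed: heap capacity %zu bytes, "
             "%zu in use, %zu free; consider a larger heap or a smaller "
             "per-image problem size",
             size, heap->capacity, heap->used, avail);

    if (stat == NULL) {
        /* No stat provided: fatal, but with a useful console message. */
        fprintf(stderr, "ERROR: %s\n", msg);
        exit(EXIT_FAILURE);
    }

    *stat = SKETCH_STAT_ALLOC_FAILED;
    if (errmsg != NULL && errmsg_len > 0)
        snprintf(errmsg, errmsg_len, "%s", msg);  /* truncates if too short */
    return NULL;
}
```

A caller that passes `stat` can test it and attempt recovery (for example, freeing caches and retrying), while a caller that omits it still gets a diagnostic that reports the heap's initial and current state.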


bonachea · 2024-08-09T19:37:28Z

duplicate of #128

bonachea added the priority=low label Apr 4, 2024

bonachea closed this as completed Aug 9, 2024
