Make complex math functions host-device #2479

Merged: 1 commit into alpaka-group:develop on Feb 17, 2025

chillenzer
Contributor

This is the immediate fix requested here for the following problem: compiling with alpaka_ACC_GPU_CUDA_ONLY_MODE=ON resolves ALPAKA_FN_ACC to __device__ (and not __host__ __device__). However, in some situations the compiler fails to deduce the correct return type (for functions declared with return type auto) if it only ever sees the device code and does not generate host code. This results in confusing compilation errors along the lines of error: invalid use of void expression for functions that clearly return a non-void value. In our case, the complex math functions are the culprit, so this PR makes them host-device functions.
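To illustrate the pattern, here is a minimal sketch, not the actual alpaka code. The macro SKETCH_FN_HOST_ACC, the type SketchComplex and the function sketchNormSquared are hypothetical names made up for this example. The sketch assumes that annotating an auto-returning function as __host__ __device__ (rather than __device__ only) lets the host compilation pass see the function body and deduce the return type.

```cpp
#include <type_traits>

// Hypothetical stand-in for a host-device annotation macro. Under nvcc it
// expands to both execution-space specifiers; with a plain host compiler it
// expands to nothing, so the sketch also builds without CUDA.
#if defined(__CUDACC__)
#    define SKETCH_FN_HOST_ACC __host__ __device__
#else
#    define SKETCH_FN_HOST_ACC
#endif

// Hypothetical minimal complex type, used only to keep the sketch
// self-contained.
template<typename T>
struct SketchComplex
{
    T re;
    T im;
};

// The return type is deduced from the return statement. Because the function
// is host-device, the host pass also compiles the body and can deduce it.
template<typename T>
SKETCH_FN_HOST_ACC auto sketchNormSquared(SketchComplex<T> const& z)
{
    return z.re * z.re + z.im * z.im;
}

// Host-side check that the deduced return type is the expected non-void type.
static_assert(
    std::is_same<decltype(sketchNormSquared(SketchComplex<double>{})), double>::value,
    "return type should be deduced as double");
```

Calling sketchNormSquared(SketchComplex<double>{3.0, 4.0}) on the host evaluates 3.0 * 3.0 + 4.0 * 4.0. Whether a device-only annotation triggers the error described above depends on the compiler and on how the function is used, so treat this as an illustration of the annotation change only.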

CUDA-only mode is currently not run in CI (presumably because some examples explicitly need AccCpuSerial). @SimeonEhrig, could we run CUDA-only mode in CI?

CAUTION: Running test/unit/math/mathTest in CUDA-only mode still fails after this fix with the following error message:

include/alpaka/kernel/TaskKernelGpuUniformCudaHipRt.hpp(334) 'TApi::funcGetAttributes(&funcAttrs, reinterpret_cast<void const*>(kernelName))' returned error  : 'cudaErrorInvalidDeviceFunction': 'invalid device function'!

[...]

/test/unit/math/src/mathLambda.cpp:60: FAILED:
due to a fatal error condition:
  testing acc:alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::
  integral_constant<unsigned long, 1ul>, unsigned long> data type:double
  functor:__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void
  (LambdaMathTestTemplate<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt,
  std::integral_constant<unsigned long, 1ul>, unsigned long>, double>::*)()
  const, &(void LambdaMathTestTemplate<alpaka::AccGpuUniformCudaHipRt<alpaka::
  ApiCudaRt, std::integral_constant<unsigned long, 1ul>, unsigned long>,
  double>::operator()<std::tuple<mathtest::OpAbs, mathtest::OpAcos, mathtest::
  OpAcosh, mathtest::OpArg, mathtest::OpAsin, mathtest::OpAsinh, mathtest::
  OpAtan, mathtest::OpAtanh, mathtest::OpCbrt, mathtest::OpCeil, mathtest::
  OpCos, mathtest::OpCosh, mathtest::OpErf, mathtest::OpExp, mathtest::OpFloor,
  mathtest::OpLog, mathtest::OpLog2, mathtest::OpLog10, mathtest::OpRound,
  mathtest::OpRsqrt, mathtest::OpSin, mathtest::OpSinh, mathtest::OpSqrt,
  mathtest::OpTan, mathtest::OpTanh, mathtest::OpTrunc, mathtest::OpIsnan,
  mathtest::OpIsinf, mathtest::OpIsfinite> >() const), 1u>, void (mathtest::
  ArgsItem<double, (mathtest::Arity)1> const&, alpaka::AccGpuUniformCudaHipRt
  <alpaka::ApiCudaRt, std::integral_constant<unsigned long, 1ul>, unsigned
  long> const&)> seed:2721678720
  SIGILL - Illegal instruction signal

(Running in debug mode generates the SIGILL, which otherwise seems to be hidden.) The error disappears when CUDA-only mode is disabled (with still only a single accelerator activated). So apparently there's something more going on here. cuda-gdb reveals that the failure occurs while getValidWorkDiv is trying to get the function attributes. This is likely related, potentially because I haven't found all instances of this problem. We need to dig into this further, but I would still welcome merging this partial fix already because it's needed for PIConGPU.

As @psychocoderHPC is on vacation, would you, @AuroraPerego or someone else, do the honours please?

@fwyzard fwyzard requested a review from AuroraPerego February 17, 2025 12:25
@SimeonEhrig
Member

> CUDA-only mode is currently not run in CI (presumably because some examples explicitly need AccCpuSerial). @SimeonEhrig, could we run CUDA-only mode in CI?

Do we need to test different CUDA versions or is one fine?

@chillenzer
Contributor Author

> Do we need to test different CUDA versions or is one fine?

I think a single one would be enough. There might come a time when nvcc learns to handle this correctly, at which point we'd need to track the version more precisely. But a single check would be a good start to build upon. We can still add more later as required.

@AuroraPerego AuroraPerego merged commit 5371ac5 into alpaka-group:develop Feb 17, 2025
25 checks passed