arch and gencode flags for CUDA builds on NVIDIA
When trying to build the software Gpufit on Linux with CUDA 11, I received an error that compute_30 was an "unsupported architecture".
nvcc fatal : Unsupported gpu architecture 'compute_30'
CMake Error at Gpufit_generated_cuda_gaussjordan.cu.o.RELEASE.cmake:222 (message):
Error generating
/home/louis/dev/gpufit_dev/gpufit-build/Gpufit/CMakeFiles/Gpufit.dir//./Gpufit_generated_cuda_gaussjordan.cu.o
make[2]: *** [Gpufit/CMakeFiles/Gpufit.dir/build.make:93:
Gpufit/CMakeFiles/Gpufit.dir/Gpufit_generated_cuda_gaussjordan.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:1114: Gpufit/CMakeFiles/Gpufit.dir/all] Error 2
make: *** [Makefile:95: all] Error 2
This had not been reported in the bug tracker. Inspecting the CMake output showed that the build was taking a "try everything" approach:
-- CUDA_ARCHITECTURES=3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX
-- CUDA_NVCC_FLAGS=-gencode;arch=compute_30,code=sm_30;-gencode;arch=compute_35,...
- That second line went on all the way to compute_70 and sm_70.
Essentially, each entry in CUDA_ARCHITECTURES, e.g. "3.0", gets translated into CUDA_NVCC_FLAGS entries such as compute_30: simple enough.
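The translation described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not the CMake code that Gpufit actually runs: the function name `arch_to_gencode` is hypothetical, and the handling of the "+PTX" suffix (emitting an extra `code=compute_XX` entry) is an assumption about what that suffix requests.

```python
def arch_to_gencode(architectures):
    """Translate entries such as "3.0" or "7.0+PTX" into nvcc -gencode flags."""
    flags = []
    for entry in architectures:
        # Assumption: "+PTX" additionally requests PTX for that architecture.
        want_ptx = entry.endswith("+PTX")
        version = entry[:-len("+PTX")] if want_ptx else entry
        digits = version.replace(".", "")  # "3.0" -> "30"
        flags += ["-gencode", f"arch=compute_{digits},code=sm_{digits}"]
        if want_ptx:
            flags += ["-gencode", f"arch=compute_{digits},code=compute_{digits}"]
    return flags


# A shortened version of the list from the CMake output above.
architectures = "3.0;3.5;7.0+PTX".split(";")
print(";".join(arch_to_gencode(architectures)))
```

Every entry in the list is turned into a flag unconditionally, which is why an architecture the installed toolkit rejects ends up on the nvcc command line.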
The problem was that, as of CUDA 11, compute_30 is deprecated and nvcc no longer accepts it.
- This is documented here
| Fermi† | Kepler† | Maxwell‡ | Pascal | Volta | Turing | Ampere | Lovelace* | Hopper** |
|--------|---------|----------|--------|-------|--------|--------|-----------|----------|
| sm_20  | sm_30   | sm_50    | sm_60  | sm_70 | sm_75  | sm_80  | sm_90?    | sm_100c? |
|        | sm_35   | sm_52    | sm_61  | sm_72 |        | sm_86  |           |          |
|        | sm_37   | sm_53    | sm_62  |       |        |        |           |          |

† Fermi and Kepler are deprecated from CUDA 9 and 11 onwards
‡ Maxwell is deprecated from CUDA 12 onwards
\* Lovelace is the microarchitecture replacing Ampere (AD102)
\*\* Hopper is NVIDIA’s rumored “tesla-next” series, with a 5nm process.
- The latest series is RTX 30, which uses the Ampere architecture and requires CUDA 11 or later.
- As noted, Fermi and Kepler are deprecated from CUDA 9 and 11 upwards (respectively, I presume), i.e. in CUDA 11, Kepler is deprecated, and with it the sm_30, sm_35, and sm_37 compute capabilities.
This explains why compute_30 was throwing an error and suggests how to fix it: simply test whether the CUDA version is greater than or equal to 11 and, if so, skip the architectures 3.7 (sm_37) and below.
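As a minimal sketch of that fix, the filter below drops every compute capability up to and including 3.7 when the CUDA major version is 11 or higher. It is written in Python purely to show the logic. The real change would live in the project's CMake scripts, and the function name `supported_architectures` and its `cuda_major` parameter are hypothetical.

```python
def supported_architectures(architectures, cuda_major):
    """Drop compute capabilities 3.7 and below when building with CUDA 11+."""
    if cuda_major < 11:
        return list(architectures)
    kept = []
    for entry in architectures:
        version = entry.split("+")[0]  # strip an optional "+PTX" suffix
        major, minor = (int(part) for part in version.split("."))
        if (major, minor) > (3, 7):
            kept.append(entry)
    return kept


# The architecture list reported by CMake in the build output above.
requested = "3.0;3.5;5.0;5.2;3.2;3.7;5.3;6.0;6.1;6.2;7.0+PTX".split(";")
print(";".join(supported_architectures(requested, cuda_major=11)))
# prints 5.0;5.2;5.3;6.0;6.1;6.2;7.0+PTX
```

With an older toolkit (for example `cuda_major=10`) the list is returned unchanged, so builds that still target Kepler cards are not affected.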
- More details on which cards have which architectures at the aforementioned link