[JOSS REVIEW] CUDA Errors in Arch Linux #598
Comments
@upsj has helpfully pointed out that this overlaps with another (closed) issue (#579) and could be a consequence of the latest-stable nature of Arch's package repositories. I will try the given workaround and report back. If a documented workaround exists, that is acceptable for the purposes of the functionality review. |
Indeed, as pointed out in the linked discussions, gcc8 is the GCC version that CUDA generally supports. Arch does install gcc8 as a dependency for CUDA, but to select it explicitly I tried using the documented host compiler option:
However, this proves insufficient: I get linking errors again. However, by forcing gcc-8 globally:
I am able to build successfully. I will now verify that the executables behave as documented and exercise the GPU, but the successful build is promising for the completion of the functionality review. Thanks for promptly pointing out what is already known about this issue! |
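For readers hitting the same mismatch, a pre-flight check along these lines may help before configuring. It is only a sketch, not the command used above: the function names are made up for illustration, and the limit of 8 is an assumption taken from this thread rather than queried from the CUDA toolkit.

```shell
#!/bin/sh
# Sketch: decide whether a host GCC is old enough for the installed CUDA toolkit.
# MAX_GCC_MAJOR=8 is an assumption taken from this thread, not read from CUDA.
MAX_GCC_MAJOR=8

# Print the major version from strings such as "8.4.0" or "10".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Succeed (exit status 0) when the given version string is at most MAX_GCC_MAJOR.
host_compiler_ok() {
    [ "$(gcc_major "$1")" -le "$MAX_GCC_MAJOR" ]
}
```

It could be called as `host_compiler_ok "$(gcc-8 -dumpversion)"`, assuming the GCC 8 binary is named `gcc-8` on the system in question.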
The build goes well, however the […] I confirm through […]
To verify my CUDA install, I built, executed, and profiled a simple test program from NVIDIA using nvcc (obviously this is much simpler than a ginkgo build!). So my CUDA install is not obviously broken, at least. Is there a way the developers recommend for producing some debug logs to help diagnose why the CUDA solver is hanging? For reference, here's smi:
|
This may be unrelated, but when you were having build errors earlier, you mentioned warnings about CUDA architectures being empty. What is your value for the CMake variable |
@nbeams That's a good observation. Unfortunately, the warnings are probably unrelated to the problems, since they don't come from Ginkgo's CUDA architecture selection (CudaArchitectureSelector), but from the most recent release of CMake, where the same capabilities have been added natively. Before we try to debug this at runtime, can you post the contents of your CMakeCache.txt file for us to see the whole build configuration? |
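If posting the whole CMakeCache.txt is impractical, a few entries can be pulled out with a small helper like the one below. This is a sketch: `cache_value` is a hypothetical name, and the variables listed in the usage comment are only a guess at which entries matter for this issue.

```shell
#!/bin/sh
# Sketch: print the value of one entry from a CMakeCache.txt file.
# Cache entries have the form NAME:TYPE=VALUE,
# for example CMAKE_CXX_COMPILER:FILEPATH=/usr/bin/g++
cache_value() {
    cache_file=$1
    name=$2
    # With -n and the p flag, sed prints only lines where the substitution matched.
    sed -n "s/^${name}:[A-Z]*=//p" "$cache_file"
}

# Possible usage from inside the build directory (variable choice is an assumption):
#   for v in CMAKE_CXX_COMPILER CMAKE_CUDA_HOST_COMPILER CMAKE_CUDA_ARCHITECTURES; do
#       printf '%s = %s\n' "$v" "$(cache_value CMakeCache.txt "$v")"
#   done
```

An entry that is absent from the cache simply yields an empty value.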
Ah wait, I realized what the issue is:
|
@adam-m-jcbs The |
@upsj ah, yes, thank you! I was not running the solver with input, so it makes sense that it would hang! That's on me for not reading the docs carefully enough. I'll read the docs more carefully for the other tests (notably the 27 pt stencil). I would recommend going through the […]
After providing the proper input, I can confirm the build as given works and yields a well-behaving […]
@thoasm Thanks for the extra info. As I understand it, Arch's package manager installs CUDA such that it natively uses the gcc-8 dependency (for C and C++) without intervention. Thus, I found I did not need to set the CUDA host compiler, but it's possibly good to do just for safety, as you say. I agree it's better practice and a bit safer to use the […]
Thus, I rewrote my build command:
But the issue proved to be the simpler one: programs don't behave well when you don't provide the expected input!
@nbeams thanks for the info! I believe @upsj is correct, though, that this is not the issue. I was wondering about the CUDA architecture variable, but figured it was more relevant for devs. And though things are working, for reference here is the […]
Thanks so much for all the timely input from the devs! This issue is resolved and closed. |
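The hang described above can be avoided with a small wrapper like the following sketch. It assumes the program reads its problem description from standard input. `run_with_input` and both of its arguments are placeholders, since the actual executable and input file are not named in this thread.

```shell
#!/bin/sh
# Sketch: run a program that reads its problem description from standard input.
# Both arguments are placeholders; the real executable and input are not named here.
run_with_input() {
    program=$1
    input_file=$2
    if [ ! -r "$input_file" ]; then
        echo "input file not readable: $input_file" >&2
        return 2
    fi
    # Redirecting a file guarantees the program reaches end-of-file, so it
    # cannot sit waiting on an interactive terminal as a bare invocation would.
    "$program" < "$input_file"
}
```

Failing early on a missing input file turns a silent hang into an immediate, readable error.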
This issue is part of the functionality aspect of a JOSS review (see #597)
I am attempting to build ginkgo on my local machine with CUDA and OMP enabled, but I run into some issues. I will report them below to see if anyone has an idea for a solution, and will continue to debug on my machine to see if I discover anything.
After a fresh clone of ginkgo, in a build directory I execute a script containing a build command similar to the debug build (and I install locally in userspace, not to the system):
You can find the full log of this command's output here. Mainly, I get a CUDA-related error while linking like:
Early on, I also get warnings like this:
If I do a build without OMP and CUDA, I can successfully build and tests run with reasonable output.
My system: