Intermittent compilation failures due to std::_Select utility #792
Comments
To summarize the discussion on the linked issue -- it looks like there is an issue with MSVC's STL in some usage of its internal Some details from my attempts to track this down: The line that the issue refers to is an expansion of this macro:
With these inputs:
(Let's please just ignore the reserved This is a standard and common MPL pattern to test for a member type alias, and does not invoke any STL traits, etc. The
Maybe it's being used in a _Select call but the implicit conversion isn't firing? Otherwise, I don't see where this bad The macros used above expand to nothing surprising:
The error messages don't provide any other breadcrumbs to track down, and contain only unexpanded local template typename parameters like
I'll keep following this issue in case Thrust needs to follow up on this, though this seems like a toolchain / environment issue, since it is intermittent and is also reported to affect Eigen headers in PyTorch. We have not been able to reproduce this issue in Thrust directly. |
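For readers unfamiliar with the member-type-alias test mentioned in this comment, here is a minimal sketch of the classic `sizeof`-based overload trick, which needs no standard-library traits. This is an illustration only, not Thrust's actual macro expansion: the names `detect_sketch`, `has_value_type`, `with_alias`, and `without_alias` are hypothetical, and `value_type` is just an example alias to look for.

```cpp
// Hypothetical illustration of a member-type-alias detector.
// No STL headers are required; everything here is plain overload resolution.
namespace detect_sketch {

template <typename T>
struct has_value_type {
  typedef char yes_type;               // sizeof == 1
  struct no_type { char pad[2]; };     // sizeof is guaranteed to differ

  // Viable only when U::value_type names a type (SFINAE removes it otherwise).
  template <typename U> static yes_type test(typename U::value_type*);
  // Fallback: the ellipsis is the worst match, so it is chosen only when
  // the overload above has been discarded.
  template <typename U> static no_type test(...);

  // The call is never evaluated; only the selected return type's size matters.
  static const bool value = sizeof(test<T>(0)) == sizeof(yes_type);
};

struct with_alias { typedef int value_type; };
struct without_alias {};

static_assert(has_value_type<with_alias>::value, "alias should be detected");
static_assert(!has_value_type<without_alias>::value, "no alias to detect");

}  // namespace detect_sketch
```

A detector of this shape never names `std::` entities, which is why a diagnostic pointing into the STL is surprising for such code.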
This is definitely an unusual compiler bug - 99.9% of the time, when the STL triggers compiler bugs, they are totally deterministic.
It looks like the error might be emitted by the MSVC compiler, after the CUDA compiler has split up the source code for the host vs. device compilers? (My understanding of the CUDA compilation flow is still very vague.) If so, is it possible to capture the source code sent to MSVC when the error happens? It would be useful to know if the intermittent behavior is happening before or after MSVC processes the code. Separately, I observe that the As for Lines 1481 to 1494 in 65d98ff
Lines 736 to 739 in 65d98ff
Lines 966 to 971 in 65d98ff
Lines 1011 to 1017 in 65d98ff
Lines 1204 to 1209 in 65d98ff
We're using |
Good idea. @leezu @mikoro Could one of you try this? According to this, it looks like the intermediate version can be obtained by:
It's good to know that
The allocator usage might be relevant, since the error points at Thrust's
That'd be great -- until we figure this out, it'll be helpful to use this issue to discuss any MSVC-related issues that come up. |
In light of NVIDIA/thrust#1090 (comment), this can be closed. Thanks for the help! |
Glad you finally ran the bug down - this was a nasty one. Let us know if there's anything else we can do. |
Describe the bug
Various NVIDIA software (Thrust, CUB) causes intermittent compilation failures with MSVC 2019.
All take the form:
::_Select<__formal>::_Apply': 'X' is not a valid template type argument for parameter '<unnamed-symbol>'
Would this be a bug in MSVC / STL, or do you have any recommendations about how NVIDIA should adapt their software for MSVC support? This affects many projects using MSVC and CUDA, such as MXNet and PyTorch.
There are more examples of the intermittent failures at NVIDIA/thrust#1090
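To make the shape of the diagnostic easier to read, here is a rough sketch of a bool-keyed selector with a nested alias template, which is what the `_Select<...>::_Apply` spelling in the error suggests. It is an assumption-laden illustration, not MSVC's source: `select_sketch`, `select`, `apply`, and `int_or_long` are hypothetical names, and the real `std::_Select` may be defined differently.

```cpp
#include <type_traits>

// Hypothetical stand-in for a bool-keyed type selector; not MSVC's actual code.
namespace select_sketch {

// Primary template (condition true): the nested alias yields the first type.
template <bool Condition>
struct select {
  template <typename First, typename>
  using apply = First;
};

// Specialization for false: the nested alias yields the second type.
template <>
struct select<false> {
  template <typename, typename Second>
  using apply = Second;
};

// Both arguments to apply must be types; passing something that is not a type
// is the kind of misuse a "not a valid template type argument" error describes.
template <bool Condition>
using int_or_long = typename select<Condition>::template apply<int, long>;

static_assert(std::is_same<int_or_long<true>, int>::value,
              "true selects the first type");
static_assert(std::is_same<int_or_long<false>, long>::value,
              "false selects the second type");

}  // namespace select_sketch
```

Under that assumption, the reported error would mean the compiler saw something other than a type in one of the `apply`-style argument positions, which correct code of this shape never supplies.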