RuntimeError: _int_mm_out_cuda not compiled for this platform. #130928
Comments
Tentatively grabbing this for myself to get a repro, as there are no platform-specific guards in this code, just one hiding the code behind a CUDA version check.
People are clearly using this, and I'm confused because nothing special is reported anywhere. Could it maybe depend on the C++ compiler?
@mattiadg it seems
New output of collect_env.py, still the same result:
PyTorch version: 2.3.1+cu121
OS: Microsoft Windows 11 Home
Python version: 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] (64-bit runtime)
The discussion continued a bit here huggingface/optimum-quanto#245 and @dacorvo suggested that the operation may not be compiled on Windows. |
Given the ifdef in the code (pytorch/aten/src/ATen/native/cuda/Blas.cpp, line 775 in d8a35d5), the issue is most likely that this was compiled against an old version of CUDA on Windows. cc @eqy — this still looks suspicious; maybe this condition doesn't work?
Is this with a wheel or a source build? Since it's showing the second message, it looks like
from pip
Hmm, I cannot reproduce it using the 2.4 release candidate.
And in 2.3 it was indeed disabled for the Windows platform (pytorch/aten/src/ATen/native/cuda/Blas.cpp, line 739 in 63d5e92).
But this constraint was lifted by #125792
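To summarize the diagnosis in the comments above, here is a hedged Python sketch (not PyTorch source code) of how the op's compile-time availability could be modeled: the old Windows exclusion that shipped in the 2.3 wheels, plus a minimum build-time CUDA version. The function name and the 11.7 threshold are illustrative assumptions, not taken from Blas.cpp.

```python
# Hedged model of the guard discussed above (illustrative, not PyTorch source).
def int_mm_available(os_name, build_cuda_version, windows_allowed=True):
    """os_name: e.g. "Windows" or "Linux".
    build_cuda_version: e.g. "12.1" (as in torch.version.cuda),
    or None for CPU-only / ROCm builds.
    windows_allowed: False models the pre-#125792 exclusion in 2.3."""
    if build_cuda_version is None:
        return False  # no CUDA toolkit at build time, kernel never compiled
    if os_name == "Windows" and not windows_allowed:
        return False  # the 2.3.1+cu121 behavior the reporter hit
    major, minor = (int(p) for p in build_cuda_version.split(".")[:2])
    return (major, minor) >= (11, 7)  # illustrative version threshold

# The reporter's 2.3.1+cu121 Windows wheel excluded the op:
print(int_mm_available("Windows", "12.1", windows_allowed=False))  # False
# With the constraint lifted (as in the 2.4 release candidate):
print(int_mm_available("Windows", "12.1", windows_allowed=True))   # True
```

This matches the thread's conclusion that the same CUDA 12.1 build succeeds on 2.4 but fails on 2.3 Windows wheels.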
🐛 Describe the bug
Hi all, I have encountered this issue while trying to work with models quantized to 8 bits. For instance, I want to add an example to optimum-quanto, and when running the quantized model I get the error in the subject:
RuntimeError: _int_mm_out_cuda not compiled for this platform.
This happens whenever torch._int_mm is called. There are multiple tests in the project using this function, and all of them fail with the same error.
I guess it should just work, but I probably have something wrong in my setup.
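For context, here is a hedged NumPy sketch of the semantics torch._int_mm is expected to provide (int8 inputs, int32 accumulation); the shapes below are arbitrary examples, and this is a reference computation, not the missing CUDA kernel:

```python
import numpy as np

# Reference semantics sketch: an int8 x int8 matrix multiply that
# accumulates into int32, which is what the unavailable CUDA kernel
# would compute on the GPU.
rng = np.random.default_rng(0)
a = rng.integers(-128, 128, size=(16, 32), dtype=np.int8)
b = rng.integers(-128, 128, size=(32, 8), dtype=np.int8)

# Widen to int32 before the matmul so the products and sums don't
# overflow the int8 range.
c = a.astype(np.int32) @ b.astype(np.int32)
print(c.dtype, c.shape)
```

When the op is compiled in, torch._int_mm on int8 CUDA tensors should return an int32 result of the same shape as this reference.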
Versions
PyTorch version: 2.3.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 11 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A
Python version: 3.8.10 (tags/v3.8.10:3d8993a, May 3 2021, 11:48:03) [MSC v.1928 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: 12.1.66
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060
Nvidia driver version: 551.83
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture=9
CurrentClockSpeed=2100
DeviceID=CPU0
Family=198
L2CacheSize=12288
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2100
Name=12th Gen Intel(R) Core(TM) i7-12700F
ProcessorType=3
Revision=
Versions of relevant libraries:
[pip3] mypy-extensions==1.0.0
[pip3] numpy==1.24.4
[pip3] torch==2.3.1+cu121
[pip3] torchaudio==2.3.1+cu121
[pip3] torchvision==0.18.1+cu121
[conda] Could not collect
cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @malfet @seemethere @peterjc123 @mszhanyi @skyline75489 @nbcsm @vladimir-aubrecht @iremyux @Blackhex @cristianPanaite @ptrblck