'func.func' op uses 873872 bytes of shared memory; exceeded the limit of 65536 bytes using LLVMGPUSIMT #18905
I am realizing that the matmul-like op shared in the gist in the issue is quite an edge case, so we might want to look at the whole model and ask why we end up with this matmul-like op, and whether we should have done something differently in pre-processing to avoid reaching it. Here is the front-end program causing this shape:
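(The original front-end snippet is not preserved here. As a hypothetical reconstruction, a minimal PyTorch program with matching shapes, a 1x1, stride-1 convolution taking 1x64x56x56 to 1x128x56x56, would look like the sketch below; only the shapes are taken from the IR, the module structure is an assumption.)

```python
import torch

# Hypothetical reconstruction: a pointwise (1x1) convolution whose shapes
# match the linalg.conv_2d_nchw_fchw in the dump below.
conv = torch.nn.Conv2d(in_channels=64, out_channels=128,
                       kernel_size=1, stride=1, bias=True)

x = torch.randn(1, 64, 56, 56)   # NCHW input, as in tensor<1x64x56x56xf32>
y = conv(x)                      # weight is 128x64x1x1, as in tensor<128x64x1x1xf32>
assert y.shape == (1, 128, 56, 56)
```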
The matmul is coming from:

```mlir
%13 = linalg.conv_2d_nchw_fchw {dilations = dense<1> : vector<2xi64>, strides = dense<1> : vector<2xi64>}
    ins(%12, %4 : tensor<1x64x56x56xf32>, tensor<128x64x1x1xf32>)
    outs(%broadcasted_3 : tensor<1x128x56x56xf32>) -> tensor<1x128x56x56xf32>
```

which then gets generalized into the matmul-like form discussed above.
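For reference on why this op is "matmul-like": a 1x1, stride-1 convolution is exactly a matmul over the flattened spatial dimensions, with the 128x64x1x1 filter acting as a 128x64 matrix applied to a 64x(56*56) input. A minimal NumPy sketch of that equivalence (illustrative only, not IREE code):

```python
import numpy as np

x = np.random.rand(1, 64, 56, 56).astype(np.float32)   # NCHW input
w = np.random.rand(128, 64, 1, 1).astype(np.float32)   # FCHW 1x1 filter

# 1x1 stride-1 conv == matmul: (128x64) @ (64x3136) -> (128x3136)
y_matmul = (w.reshape(128, 64) @ x.reshape(64, 56 * 56)).reshape(1, 128, 56, 56)

# Same result computed as an explicit per-pixel contraction over channels.
y_conv = np.einsum('nchw,fc->nfhw', x, w.reshape(128, 64))
assert np.allclose(y_matmul, y_conv, atol=1e-4)
```

This is also why the dimension collapsing discussed below matters: once 56x56 is collapsed to a single 3136 dimension, the generalized op is a plain matmul that existing pipelines already know how to configure.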
Thanks for taking a look. I don't think there is any reason it can't be supported; it probably just needs new configuration logic, and we'd need to see if anything breaks. We can take a look at adding that.
So far my progress is:
Next steps:
Yeah, this is something that should be handled by compiler/src/iree/compiler/DispatchCreation/CollapseDimensions.cpp, but it's currently blocked by some codegen issues (which should be resolved shortly by #18822); see CollapseDimensions.cpp, lines 174 to 189 at 9650bfe.
I quickly commented this line out and was able to get it to compile successfully.
@IanWood1 Thanks for putting in the extra effort to get it to compile. It is not immediately clear to me which line you commented out to make it work, but I assume you made it so that 56x56 gets collapsed into a single dimension. I chatted with @nirvedhmeshram a moment ago; with that single dimension, a few things still don't fully explain themselves and are worth following through on:
Summary and conclusions:
For next steps, this ticket is on hold; resolving either of the items below will make the problem in this ticket go away:
For this matmul-like + elementwise IR, we go down the LLVMGPUSIMT pipeline (see the dump here). Today, TileAndFuse Vectorize can handle this case correctly, but ideally we want it to be handled by the TileAndFuse Matmul pipeline.
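As a back-of-the-envelope check on the error message, shared-memory use for a tiled matmul is roughly the footprint of the operand tiles staged in shared memory. The sketch below (tile sizes are made up for illustration, not the ones IREE actually picked) shows how an oversized tile choice blows past the 64 KiB limit while a conventional one fits:

```python
# Rough shared-memory estimate for a tiled f32 matmul: one tile of each
# operand staged in shared memory. Tile sizes here are illustrative only.
LIMIT = 65536  # 64 KiB, the limit from the error message

def smem_bytes(tile_m, tile_n, tile_k, elem_bytes=4):
    return (tile_m * tile_k + tile_k * tile_n) * elem_bytes

# A conventional tile easily fits ...
print(smem_bytes(64, 64, 32))     # 16384  <= 65536

# ... while tiling that keeps a whole problem dimension resident does not
# (e.g. leaving the collapsed 56*56 = 3136 spatial dimension untiled).
print(smem_bytes(128, 3136, 64))  # 835584 >  65536
```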
Compile command for SIMT
Compile command for TileAndFuse Vectorize