HIP: force max threads per block to be 1024
Some older compilers still default to a maximum of 256 threads per block.
Explicitly set the limit to 1024 so that ops like ARGMAX and GROUP_NORM
produce correct results.

Related: #10610, #11619
Signed-off-by: fxzjshm <fxzjshm@163.com>
fxzjshm committed Feb 3, 2025
1 parent d92cb67 commit 7e596d4
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions ggml/src/ggml-hip/CMakeLists.txt
@@ -40,6 +40,9 @@ find_package(hip REQUIRED)
find_package(hipblas REQUIRED)
find_package(rocblas REQUIRED)

# Workaround old compilers
set(CMAKE_HIP_FLAGS "${CMAKE_HIP_FLAGS} --gpu-max-threads-per-block=1024")

if (${hip_VERSION} VERSION_LESS 5.5)
    message(FATAL_ERROR "At least ROCM/HIP V5.5 is required")
endif()
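
The change above appends the flag unconditionally. As a minimal sketch of one possible variant, not what this commit does, the append could be guarded so that a value already present in CMAKE_HIP_FLAGS (for example one passed on the command line with -DCMAKE_HIP_FLAGS=...) is left alone. The variable name GGML_HIP_MAX_THREADS_PER_BLOCK is hypothetical and used only for illustration; the sketch assumes the HIP compiler accepts --gpu-max-threads-per-block, as the commit itself does.

```cmake
# Hypothetical helper variable; it does not exist in the repository.
set(GGML_HIP_MAX_THREADS_PER_BLOCK 1024)

# Append the flag only when CMAKE_HIP_FLAGS does not already mention it,
# so an explicit value supplied by the user is not overridden.
if (NOT "${CMAKE_HIP_FLAGS}" MATCHES "gpu-max-threads-per-block")
    string(APPEND CMAKE_HIP_FLAGS
           " --gpu-max-threads-per-block=${GGML_HIP_MAX_THREADS_PER_BLOCK}")
endif()
```

Whether such a guard is desirable is a judgment call: the commit's unconditional form guarantees the 1024 limit that the affected ops rely on, while the guarded form lets a user-supplied smaller limit stand.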
