
Commit f24378f
update
Bodhi Hu committed Feb 19, 2025
1 parent 8a68656 commit f24378f
Showing 3 changed files with 4 additions and 8 deletions.
8 changes: 2 additions & 6 deletions docs/build.md
@@ -202,17 +202,14 @@ This provides GPU acceleration using the MUSA cores of your Moore Threads MTT GPU.
 - Using `CMake`:
 
   ```bash
-  # build with MUSA and using the compilers from MUSA SDK:
-  cmake -B build -DGGML_MUSA=ON \
-    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
+  cmake -B build -DGGML_MUSA=ON
   cmake --build build --config Release
   ```
 - For static build:
 
   ```bash
   cmake -B build -DGGML_MUSA=ON \
-    -DBUILD_SHARED_LIBS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON \
-    -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++
+    -DBUILD_SHARED_LIBS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON
   cmake --build build --config Release
   ```
 
@@ -222,7 +219,6 @@ The environment variable `GGML_CUDA_ENABLE_UNIFIED_MEMORY=1` can be used to enable unified memory.

 Most of the compilation options available for CUDA should also be available for MUSA, though they haven't been thoroughly tested yet.
 
-
 ## HIP
 
 This provides GPU acceleration on HIP-supported AMD GPUs.
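Note on the docs/build.md change above: the MUSA build no longer passes the MUSA SDK's clang compilers to CMake explicitly. A minimal usage sketch of the documented flow follows; the model path and runtime flags are illustrative and not part of the commit:

```bash
# Build with MUSA support, then run with unified memory enabled so the
# runtime can fall back to system RAM when GPU VRAM is exhausted (see the
# GGML_CUDA_ENABLE_UNIFIED_MEMORY note in docs/build.md).
cmake -B build -DGGML_MUSA=ON
cmake --build build --config Release
GGML_CUDA_ENABLE_UNIFIED_MEMORY=1 ./build/bin/llama-cli -m models/model.gguf -ngl 99 -p "Hello"
```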
2 changes: 1 addition & 1 deletion ggml/src/ggml-cuda/common.cuh
@@ -404,7 +404,7 @@ static __device__ __forceinline__ int ggml_cuda_dp4a(const int a, const int b, int c) {

 #if __CUDA_ARCH__ >= GGML_CUDA_CC_DP4A || defined(GGML_USE_MUSA)
     return __dp4a(a, b, c);
-#else
+#else // __CUDA_ARCH__ >= GGML_CUDA_CC_DP4A || defined(GGML_USE_MUSA)
     const int8_t * a8 = (const int8_t *) &a;
     const int8_t * b8 = (const int8_t *) &b;
     return c + a8[0]*b8[0] + a8[1]*b8[1] + a8[2]*b8[2] + a8[3]*b8[3];
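For context on the common.cuh hunk above: the `#else` branch is the software fallback for `__dp4a`, a packed four-lane int8 dot product with accumulate. A standalone host-side sketch of the same arithmetic (plain C++, not part of the commit):

```cpp
#include <cstdint>
#include <cstdio>

// Reference for the fallback branch of ggml_cuda_dp4a: treat each 32-bit int
// as four int8 lanes, multiply lane-wise, and add the sum to the accumulator c.
static int dp4a_ref(int a, int b, int c) {
    const int8_t * a8 = (const int8_t *) &a;
    const int8_t * b8 = (const int8_t *) &b;
    return c + a8[0]*b8[0] + a8[1]*b8[1] + a8[2]*b8[2] + a8[3]*b8[3];
}

int main() {
    // On a little-endian machine 0x01020304 packs the lanes {4, 3, 2, 1};
    // the dot product with {1, 1, 1, 1} is 10, plus the accumulator 10 gives 20.
    printf("%d\n", dp4a_ref(0x01020304, 0x01010101, 10));
    return 0;
}
```

On architectures where the `#if` condition holds, the hardware `__dp4a` intrinsic performs the same operation in a single instruction.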
2 changes: 1 addition & 1 deletion src/llama-model.cpp
@@ -3703,7 +3703,7 @@ void llama_model::print_info() const {
     }
 
     if (arch == LLM_ARCH_LLAMA) {
-        LLAMA_LOG_INFO("%s: expert_weights_scale = %.1f\n", __func__, hparams.expert_weights_scale);
+        LLAMA_LOG_INFO("%s: expert_weights_scale = %.1f\n", __func__, hparams.expert_weights_scale);
     }
 
     vocab.print_info();
