Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration #10133

Merged: 33 commits, Nov 7, 2024


uniartisan
Contributor

@uniartisan uniartisan commented Nov 2, 2024

Overview

This update focuses on two major optimizations for RWKV6 operators:

  1. Standardize operator naming for better code readability
  2. Implement CPU multi-core parallel acceleration to improve inference performance
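
As a rough illustration of the multi-core idea in item 2, the sketch below splits a number of heads across worker threads by thread index. It is a minimal sketch under assumptions, not the code merged in this PR: `head_range`, `split_heads`, and `scale_heads` are hypothetical names, and the per-head work is a toy scaling loop standing in for the real WKV6 computation.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper type (not a ggml API): a half-open range of heads [start, end).
typedef struct {
    int64_t start;
    int64_t end;
} head_range;

// Returns the heads that thread `ith` out of `nth` threads should process.
static head_range split_heads(int64_t n_heads, int ith, int nth) {
    assert(nth > 0 && ith >= 0 && ith < nth);
    head_range r;
    // Integer division spreads the remainder, so range sizes differ by at most one.
    r.start = (n_heads * ith) / nth;
    r.end   = (n_heads * (ith + 1)) / nth;
    return r;
}

// Toy per-thread kernel: scales every element of the heads owned by this thread.
// This stands in for the real per-head work and is an assumption for illustration.
static void scale_heads(float * data, int64_t n_heads, int64_t head_size,
                        float factor, int ith, int nth) {
    const head_range r = split_heads(n_heads, ith, nth);
    for (int64_t h = r.start; h < r.end; ++h) {
        for (int64_t i = 0; i < head_size; ++i) {
            data[h * head_size + i] *= factor;
        }
    }
}
```

Each worker would call `scale_heads` with its own `ith`. Because the ranges are disjoint and together cover every head, the loop body needs no locking.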

@github-actions github-actions bot added the testing, Nvidia GPU, and ggml labels Nov 2, 2024
@github-actions github-actions bot added the SYCL label Nov 2, 2024
@uniartisan
Contributor Author

The SYCL backend for WKV6 is still being tested and may be pushed in the near future.

@github-actions github-actions bot added the documentation Improvements or additions to documentation label Nov 2, 2024
@uniartisan uniartisan changed the title from "Optimize RWKV6 Operator Naming and Implement Multi-core CPU Acceleration" to "Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration" Nov 2, 2024
Owner

@ggerganov ggerganov left a comment


@airMeng Can someone on your team review the SYCL changes?

ggml/src/ggml-cpu.c (outdated, resolved)
ggml/src/ggml-sycl/outprod.cpp (outdated, resolved)
ggml/src/ggml-sycl/wkv6.cpp (outdated, resolved)
uniartisan and others added 3 commits November 5, 2024 00:42
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
@uniartisan uniartisan requested a review from ggerganov November 4, 2024 13:58
ggml/src/ggml-cpu.c (outdated, resolved)
ggml/src/ggml-cpu.c (resolved)
ggml/src/ggml-cpu.c (outdated, resolved)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Collaborator

@airMeng airMeng left a comment


Overall LGTM, except for some minor comments.

ggml/src/ggml-sycl/concat.cpp (outdated, resolved)
ggml/src/ggml-sycl.cpp (outdated, resolved)
ggml/src/ggml-sycl/outprod.cpp (outdated, resolved)
ggml/src/ggml-sycl/wkv6.cpp (outdated, resolved)
@uniartisan uniartisan requested a review from ggerganov November 4, 2024 15:57
Collaborator

@NeoZhangJianyu NeoZhangJianyu left a comment


@uniartisan
Great work, including the refactoring of the SYCL backend.
I tested the code with the basic cases, and they all pass.

Thank you!

@airMeng airMeng merged commit 3bcd40b into ggerganov:master Nov 7, 2024
53 checks passed
Alcpz added a commit that referenced this pull request Nov 13, 2024
* Fixes broken build for the SYCL CUDA backend caused by non-explicit gemm call in outprod (merged in with RWKV6 in
Optimize RWKV6 Operator Naming and Implement Multi-core CPU/ SYCL Acceleration #10133)

* Marks permuted MUL_MAT as unsupported to be able to run test-backend-ops

* Fixes asserts in norm to fix debug builds.
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
…eleration (ggerganov#10133)

* rwkv6: rename to wkv6

* rwkv6: support avx2 avx512 armv8 armv9

* rwkv6: update cuda file name

* rwkv6: rename params

* wkv on sycl

* sycl: add some ops

* sycl: Enhance OP support judgment

* wkv6: drop armv9 and transfer to GGML style

ggml-ci

* sync : ggml

* update the function to use appropriate types

* fix define error

* Update ggml/src/ggml-cpu.c

* add appropriate asserts

* move element-wise functions outside

* put the declaration outside the loop

* rewrite to be more in line with the common pattern for distributing threads

* use recommended way GGML_TENSOR_LOCALS

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Meng, Hengyu <airdldl@163.com>
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 15, 2024
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Nov 18, 2024
…eleration (ggerganov#10133)

* rwkv6: rename to wkv6

* rwkv6: support avx2 avx512 armv8 armv9

* rwkv6: update cuda file name

* rwkv6: rename params

* wkv on sycl

* sycl: add some ops

* sycl: Enhance OP support judgment

* wkv6: drop armv9 and tranfer to GGML style

ggml-ci

* sync : ggml

* update the function to use appropriate types

* fix define error

* Update ggml/src/ggml-cpu.c

* add appropriate asserts

* move element-wise functions outside

* put the declaration outside the loop

* rewrite to be more inline with the common pattern for distributing threads

* use recommended way GGML_TENSOR_LOCALS

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Co-authored-by: Diego Devesa <slarengh@gmail.com>
Co-authored-by: Plamen Minev <pacominev@gmail.com>
Co-authored-by: Yuri Khrustalev <ykhrustalev@users.noreply.github.com>
Co-authored-by: Meng, Hengyu <airdldl@163.com>
Labels: documentation, ggml, Nvidia GPU, SYCL, testing
7 participants