ggml : reintegrate the AMX backend into the CPU backend #10359

Closed
ggerganov opened this issue Nov 17, 2024 · 1 comment

@ggerganov
Owner

As explained in #10343 (comment), we would like to keep the CPU implementations inside the CPU backend. The AMX backend was created mainly because, at the time, we didn't support runtime weight repacking. Now that this functionality is supported, we should merge the AMX backend into the CPU backend.

The rough plan to achieve that is outlined here: #10350 (reply in thread)

The plan to reintegrate the AMX backend would be to create a new buffer type that converts the weights to the layout that the AMX code needs, and then check the buffer type in the matrix multiplication to determine whether the AMX matrix multiplication code should be used. Basically, this extends what is done in #9921 for the aarch64 types.
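
To make the idea concrete, here is a minimal, self-contained sketch of the pattern: weights placed in a special buffer type are repacked once, and the matrix multiplication picks its code path by inspecting that buffer type. Everything in it is an assumption for illustration only: `BufferType`, `Tensor`, `repack_weights`, `mul_mat_vec` and `TILE` are hypothetical names, not ggml's API, and the column-tiled layout is a stand-in rather than the actual AMX tile format.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these names come from ggml.
enum class BufferType { CpuPlain, CpuRepacked };

struct Tensor {
    BufferType buft;          // which buffer type the tensor data lives in
    int rows;
    int cols;
    std::vector<float> data;  // row-major if CpuPlain, column-tiled if CpuRepacked
};

constexpr int TILE = 4;       // illustrative tile width, not the real AMX tile size

// One-time conversion done when weights are placed in the repacking buffer type:
// row-major -> column tiles of width TILE, each tile storing all rows contiguously.
Tensor repack_weights(const Tensor & w) {
    assert(w.buft == BufferType::CpuPlain);
    assert(w.cols % TILE == 0);
    Tensor out{BufferType::CpuRepacked, w.rows, w.cols,
               std::vector<float>(w.data.size())};
    std::size_t k = 0;
    for (int c0 = 0; c0 < w.cols; c0 += TILE) {
        for (int r = 0; r < w.rows; ++r) {
            for (int c = c0; c < c0 + TILE; ++c) {
                out.data[k++] = w.data[static_cast<std::size_t>(r) * w.cols + c];
            }
        }
    }
    return out;
}

// y = W * x. The buffer type of W decides which kernel runs.
std::vector<float> mul_mat_vec(const Tensor & w, const std::vector<float> & x) {
    assert(static_cast<int>(x.size()) == w.cols);
    std::vector<float> y(w.rows, 0.0f);
    if (w.buft == BufferType::CpuRepacked) {
        // Specialized path: walks the tiled layout linearly.
        std::size_t k = 0;
        for (int c0 = 0; c0 < w.cols; c0 += TILE) {
            for (int r = 0; r < w.rows; ++r) {
                for (int c = c0; c < c0 + TILE; ++c) {
                    y[r] += w.data[k++] * x[c];
                }
            }
        }
    } else {
        // Generic path: plain row-major weights.
        for (int r = 0; r < w.rows; ++r) {
            for (int c = 0; c < w.cols; ++c) {
                y[r] += w.data[static_cast<std::size_t>(r) * w.cols + c] * x[c];
            }
        }
    }
    return y;
}
```

The point of the design is that the caller never selects a kernel explicitly: allocating the weights in the repacking buffer type is the only opt-in, and tensors in ordinary buffers keep using the generic path.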

Contributor

github-actions bot commented Jan 1, 2025

This issue was closed because it has been inactive for 14 days since being marked as stale.

@github-actions github-actions bot closed this as completed Jan 1, 2025