Make a PR updating ggml and support CUDA backend #72
Comments
@Green-Sky I forked this project and want to use a more recent version of ggml, specifically my own ggml fork, where I've implemented CUDA acceleration for the Conv2D operation. I also want to replace the way tensor memory is managed with ggml-alloc. The problem is that I don't know how to replace the submodule pointing to leejet/ggml with FSSRepo/ggml. Can you help me or offer any suggestions?
AFAIK leejet has some other custom changes, or at least had the last time I checked.
I will try it. Thank you!
I'm glad to hear this news. Although I have plans to add GPU support, I haven't started working on it yet. I've been quite busy with work-related matters recently. Looking forward to your progress.
And just now, I've synced the latest changes from GGML's upstream to leejet/ggml. You can make your modifications based on this branch: https://github.com/leejet/ggml/tree/sd.cpp. The only difference from the latest GGML is the addition of the dynamic mode.
I have made the changes that move away from your dynamic mode. Memory usage for a 512 x 512 image:
Diffusion compute:
- Dynamic mode: 560 MB
- ggml-alloc: 1147 MB
VAE decoding compute:
- Dynamic mode: ~1700 MB
- ggml-alloc: 2496 MB
The fixed allocation of ggml-alloc is needed to implement different backends.
This is because the latest ggml increased the size of 'struct ggml_cplan', which caused a stack overflow on MSVC. I've just submitted a commit to fix this problem. You can give it a try.
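For readers unfamiliar with this class of bug: MSVC gives each thread a 1 MB stack by default, so a large struct declared as a local variable can overflow it. The sketch below shows one common remedy, moving the object to the heap. It is only an illustration, not the actual commit referenced above: `BigPlan`, `kMaxNodes`, `make_plan`, and `total_tasks` are hypothetical names, and the struct is not the real `ggml_cplan` layout.

```cpp
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a large plan struct; NOT the real ggml_cplan layout.
struct BigPlan {
    static constexpr std::size_t kMaxNodes = 1 << 18; // assumed size, for illustration only
    int n_tasks[kMaxNodes];                            // 1 MiB of per-node data with 4-byte int
};

// Allocates the plan on the heap so the caller's stack frame only holds a pointer.
// std::make_unique value-initializes the struct, so the array starts zeroed.
std::unique_ptr<BigPlan> make_plan() {
    return std::make_unique<BigPlan>();
}

// Example use: fill the plan and return the sum of all task counts.
long long total_tasks(int tasks_per_node) {
    auto plan = make_plan(); // heap allocation, no large stack object
    long long total = 0;
    for (std::size_t i = 0; i < BigPlan::kMaxNodes; ++i) {
        plan->n_tasks[i] = tasks_per_node;
        total += plan->n_tasks[i];
    }
    return total;
}
```

Declaring `BigPlan plan;` inside a function instead would place the whole array in that function's stack frame, which is the pattern that tends to fail on MSVC while passing on platforms with larger default stacks. Raising the stack size at link time is another option, but heap allocation keeps the fix inside the code.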
Closed by #75
@leejet I intend to create a pull request that requires the latest version of ggml, so that I can use ggml-alloc and ggml-backend to add GPU acceleration to this project. I need some feedback to make progress. I'm not sure whether you're already working on something similar, and I'd like to avoid redoing work that is already done.