Make a PR updating ggml and support CUDA backend #72

FSSRepo · 2023-10-15T23:28:17Z

@leejet I intend to create a pull request that requires me to use the latest version of ggml to utilize ggml-alloc and ggml-backend for adding GPU acceleration to this project. The issue is that I need some feedback to make progress. I'm not sure if you're already working on something to avoid redoing tasks that are already done.

FSSRepo · 2023-10-17T01:00:59Z

@Green-Sky I forked this project, but I want to use a more recent version of ggml, precisely the fork of ggml that I already have, where I've implemented CUDA acceleration for Conv2D operation, and I also want to replace the way tensor memory is managed using ggml-alloc. The problem I'm facing is that I don't know how to replace the submodule pointing to leejet/ggml with FSSRepo/ggml. Can you help me or provide any suggestions?

Green-Sky · 2023-10-17T01:08:02Z

afaik leejet has some other custom changes, or at least had last time i checked.
the easiest way would be to navigate to the submodule in the filesystem,
type git remote add fssrepo <your_fork_url.git>
then git fetch --all and then check out a branch of yours. if the branch name is duplicated it's a bit ugly, it will be something like fssrepo/branchname.

FSSRepo · 2023-10-17T01:22:01Z

I will try it. Thank you!

leejet · 2023-10-22T05:31:47Z

> @leejet I intend to create a pull request that requires me to use the latest version of ggml to utilize ggml-alloc and ggml-backend for adding GPU acceleration to this project. The issue is that I need some feedback to make progress. I'm not sure if you're already working on something to avoid redoing tasks that are already done.

I'm glad to hear this news. Although I have plans to add GPU support, I haven't started working on it yet. I've been quite busy with work-related matters recently. Looking forward to your progress.

leejet · 2023-10-22T06:14:38Z

> @Green-Sky I forked this project, but I want to use a more recent version of ggml, precisely the fork of ggml that I already have, where I've implemented CUDA acceleration for Conv2D operation, and I also want to replace the way tensor memory is managed using ggml-alloc. The problem I'm facing is that I don't know how to replace the submodule pointing to leejet/ggml with FSSRepo/ggml. Can you help me or provide any suggestions?

And just now, I've synced the latest changes from GGML's upstream to leejet/ggml. You can make your modifications based on this branch https://github.com/leejet/ggml/tree/sd.cpp. The only difference from the latest GGML is the addition of the dynamic mode.

FSSRepo · 2023-10-22T12:49:07Z

> The only difference from the latest GGML is the addition of the dynamic mode.

I had already made the changes to move from your dynamic mode to ggml-alloc, but the compute graph wastes too much memory.

For a 512 x 512 image (diffusion compute):

Dynamic mode: 560 MB

ggml-alloc: 1147 MB

VAE decoding compute:

Dynamic mode: ~1700 MB

ggml-alloc: 2496 MB fixed

ggml-alloc is needed to implement different backends.

FSSRepo · 2023-10-22T20:41:11Z

The latest commit crashes on my computer (after syncing ggml):

Before syncing ggml (works):

For some reason, ggml crashes when ggml_build_forward is called in any program. With ggml_build_forward_expand this does not occur.

Debugging in Visual Studio:

Your ggml version is broken.

leejet · 2023-10-23T13:14:47Z

> The latest commit crashes on my computer (after syncing ggml):

This is because the latest ggml increased the size of struct ggml_cplan, which caused a stack overflow on MSVC. I've just submitted a commit to fix this problem. You can give it a try.
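
For readers wondering why a bigger struct can crash only on MSVC: the default stack reserve for an MSVC-built executable is 1 MB, so a local variable (or a by-value return) of a multi-megabyte struct can overflow it, while platforms with a larger default stack may run fine. The sketch below illustrates the general pattern of moving such an object to the heap. It is only an illustration under stated assumptions: `demo_plan`, `DEMO_MAX_NODES`, and the helper functions are hypothetical names, the layout is not ggml's real `ggml_cplan`, and this is not necessarily how the actual fix was done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a large plan/graph struct.
   This is NOT ggml's real layout; the sizes are chosen for illustration. */
#define DEMO_MAX_NODES 4096

struct demo_plan {
    size_t work_size;
    int    n_tasks[DEMO_MAX_NODES];
    void  *nodes[DEMO_MAX_NODES * 64];  /* about 2 MB with 8-byte pointers */
};

/* Risky pattern (left as a comment on purpose):
 *
 *     struct demo_plan plan;   // lives on the stack, larger than 1 MB
 *
 * With a 1 MB stack this can overflow as soon as the function is entered. */

/* Safer pattern: allocate the large object on the heap.
   Returns NULL if the allocation fails. The memory is zero-initialized. */
struct demo_plan *demo_plan_new(void) {
    struct demo_plan *plan = calloc(1, sizeof *plan);
    return plan;
}

void demo_plan_free(struct demo_plan *plan) {
    free(plan);
}

/* Returns 1 if the struct is smaller than the given stack size, else 0.
   It ignores every other stack user, so it is only a rough check. */
int demo_plan_fits_stack(size_t stack_bytes) {
    return sizeof(struct demo_plan) < stack_bytes;
}
```

A caller would use `demo_plan_new()` in place of a local `struct demo_plan`, check the result for NULL, and release it with `demo_plan_free()` when done. Raising the stack size through the linker is another possible workaround, but heap allocation keeps the behavior the same across toolchains.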

FSSRepo · 2023-11-26T11:41:10Z

Closed by #75

FSSRepo changed the title ~~Why don't use ggml-alloc instead dynamic mode that you implemented?~~ Make a PR updating ggml and support CUDA backend Oct 15, 2023

FSSRepo mentioned this issue Oct 22, 2023

stable-diffusion : ggml-alloc integration and gpu acceleration #75

Merged

5 tasks

FSSRepo closed this as completed Nov 26, 2023
