vulkan : reuse parent extra for views #7806
This does fix the issue for me!
I can confirm it fixed my issue.
This is what changed (from #7730 (comment)):
@0cc4m this is probably my bad. I made some changes to the way views are initialized in ggml-backend that may have created this issue. Views are now initialized in the buffer of their parent tensor, instead of on the compute buffer. I made this change because I came to the conclusion that allocating views on the compute buffer cannot work reliably: the compute buffer is not always of the same type as the buffer originally used to allocate the tensor, and backends should be able to use the same extra as their parent anyway. I thought it was safe to make this change because the CUDA backend no longer needs extras for normal buffers, but I didn't realize that the Vulkan backend still does. Looking at the
Sounds right, but the offset should be added either when the tensor is used or when it is initialized, not both. Can the part that I linked be removed? I can't test it today, but I should be able to tomorrow. Thanks for looking into it.
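To illustrate the "once, not both" point, here is a minimal sketch. The `ParentExtra` and `ViewDesc` types and the `device_offset_*` helpers are hypothetical stand-ins and not the actual ggml-vulkan structures; the sketch only assumes that a view's device address is the parent's offset plus the view's own offset.

```cpp
#include <cstddef>

// Hypothetical stand-in for a backend "extra": where the parent tensor
// starts inside the device buffer.
struct ParentExtra {
    std::size_t offset;
};

// Hypothetical view descriptor: shares the parent's extra and records
// only its own offset relative to the parent.
struct ViewDesc {
    const ParentExtra * parent;
    std::size_t view_offs;
};

// Correct: the view offset is applied exactly once, at use time.
std::size_t device_offset_at_use(const ViewDesc & v) {
    return v.parent->offset + v.view_offs;
}

// Buggy variant for comparison: the offset was already baked in at
// initialization and is then added a second time at use.
std::size_t device_offset_double_counted(const ViewDesc & v) {
    std::size_t baked_in = v.parent->offset + v.view_offs; // done at init
    return baked_in + v.view_offs;                         // added again at use
}
```

With a parent at offset 64 and a view at offset 16, the first helper resolves to 80, while the double-counted variant lands 16 bytes past the intended location.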
I already changed the part that you linked to reuse the extra of the parent tensor instead of allocating a new one. Without this change, the extras for views are allocated in the buffer of the KV cache, since these are views of the KV tensors, and they eventually overwrite the extras of the original tensors. To prevent this, it is necessary either to allocate the views in the compute buffer (what happened before) or simply to avoid allocating extras for views, which is what this change does.
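A rough sketch of the failure mode and the fix described above. Everything here is a simplified assumption for illustration: `Extra`, `Tensor`, `Buffer`, and `init_tensor` are hypothetical names, and the fixed-size ring pool is just one way extras could end up being overwritten; it is not a copy of the real ggml-vulkan allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-tensor backend data.
struct Extra {
    std::size_t offset = 0;
};

// Hypothetical tensor: a view points at its parent through view_src.
struct Tensor {
    Tensor * view_src = nullptr;
    Extra *  extra    = nullptr;
};

// Hypothetical buffer that hands out extras from a fixed-size ring,
// so allocating too many of them recycles slots that are still in use.
struct Buffer {
    std::vector<Extra> pool;
    std::size_t next = 0;

    explicit Buffer(std::size_t n) : pool(n) {}

    Extra * alloc_extra() {
        Extra * e = &pool[next];
        next = (next + 1) % pool.size();
        *e = Extra{}; // recycling a slot wipes whatever was stored there
        return e;
    }
};

// With reuse_parent_extra set, a view shares its parent's extra and
// consumes no slot; otherwise every tensor, view or not, takes a slot.
void init_tensor(Buffer & buf, Tensor & t, bool reuse_parent_extra) {
    if (t.view_src != nullptr && reuse_parent_extra) {
        t.extra = t.view_src->extra;
        return;
    }
    t.extra = buf.alloc_extra();
}
```

In this model, initializing a KV tensor and then two views of it in a two-slot buffer without reuse recycles the KV tensor's slot and resets its stored offset. With reuse, the views consume no slots and the parent's extra stays intact.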
Seems to be OK for me. Thanks!
Right, sorry. I just glanced over the code and missed that.
This causes a validation error when run with
Found the issue and fixed it.
Should fix #7730 #7769
@rhjdvsgsgks @metal3d @stduhpf Can you check if this fixes the issue?