AMD GPU Misbehavior w/ some drivers (post GGUF update) #1507
Comments
Will be trying to replicate
Without any changes, GPT4All now crashes with graphics driver issues - RTX gfx
Always thought Vulkan only works with Nvidia...
GPU: Radeon RX 6800XT
Client 2.4.19: Edit: Installed the latest consumer driver 23.10.2, giving a repeating "######" (pound/hash sign) on both models mentioned above.
Client 2.5.0-pre1:
Client 2.5.0-pre2:
We received another report of this issue from Kongming on Discord with GPT4All v2.5.1:
Radeon RX Vega 56 / Windows 10 / Driver Version 23.19.02-230831a-396094C-AMD-Software-Adrenalin-Edition (currently the latest available for this model of GPU). It seems like the GPU is not being used at all. Is it supported at all?
It should be supported. Is it available to select in the UI, and does it report use of the device in the bottom-right while generating output? #1425 is for unsupported GPUs. If you see the hashes, your GPU is being used, but running into a GPT4All bug.
I can confirm in my case the GPU is definitely being used. By watching Task Manager I see notable VRAM and GPU usage when testing a compatible model. Also, even though the output is gibberish, it is generating much faster on the GPU, like 10x faster.
I can reproduce this with my AMD Radeon (TM) Vega 8 Graphics iGPU running Mini Orca (small)!!!
Turning on validation produces a whole bunch of this: VUID-vkCmdDispatch-groupCountX-00386(ERROR / SPEC): msgNum: -1903005642 - Validation Error: [ VUID-vkCmdDispatch-groupCountX-00386 ] Object 0: handle = 0x1864b7360c0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x8e927036 | vkCmdDispatch(): groupCountX (155520) exceeds device limit maxComputeWorkGroupCount[0] (65535). The Vulkan spec states: groupCountX must be less than or equal to VkPhysicalDeviceLimits::maxComputeWorkGroupCount[0] (https://vulkan.lunarg.com/doc/view/1.3.261.1/windows/1.3-extensions/vkspec.html#VUID-vkCmdDispatch-groupCountX-00386)
We have several kernels that exceed the device limit for workgroup count on my AMD card above ^^^ Specifically, one of (silu, relu, gelu), where we request a workgroup count of 112320 whereas the device limit is 65535 in any one dimension. The mul kernels also exceed the limit, again requesting a workgroup count of 112320.
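For reference, one common workaround for this class of error (not necessarily what GPT4All ended up doing) is to fold an oversized 1D dispatch into a second dimension so that no single axis exceeds maxComputeWorkGroupCount; the shader then rebuilds a flat index from gl_WorkGroupID.x/.y and ignores any overshoot groups. A minimal C++ sketch, where `splitGroupCount` is a hypothetical helper rather than GPT4All/kompute API:

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>

// Split a 1D group count into an {x, y} pair where neither axis exceeds the
// per-dimension device limit (e.g. 65535). Dispatch {x, y, 1} and have the
// shader compute flatIndex = gl_WorkGroupID.y * x + gl_WorkGroupID.x, skipping
// any groups past the original total.
std::pair<uint32_t, uint32_t> splitGroupCount(uint32_t totalGroups, uint32_t maxPerDim)
{
    if (totalGroups == 0)
        return {0, 0};
    uint32_t x = std::min(totalGroups, maxPerDim);
    uint32_t y = (totalGroups + x - 1) / x;   // ceil-divide the remainder into Y
    return {x, y};
}

// Example: splitGroupCount(112320, 65535) -> {65535, 2}; the shader must mask
// off the (65535 * 2 - 112320) extra workgroups.
```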
Also running into this validation error:
The final validation error I'm getting is, I'm afraid, the real culprit. The problem on AMD seems to be that the driver will, for some reason, allow us to allocate more memory than the heap can actually supply. It doesn't give any error or indication that the allocation failed.
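One defensive option here (an illustration on my part, not necessarily what the eventual fix does) is to ask the driver for its heap budget via the VK_EXT_memory_budget extension and refuse allocations that would oversubscribe the heap, since the allocation call itself reports success. A rough Vulkan-Hpp sketch, assuming Vulkan 1.1+ and VK_EXT_memory_budget are available and enabled:

```cpp
#include <vulkan/vulkan.hpp>

// Query how much of memory heap `heapIndex` the driver believes is still usable.
// `heapIndex` would be the device-local heap the model weights are allocated from.
vk::DeviceSize remainingHeapBudget(vk::PhysicalDevice gpu, uint32_t heapIndex)
{
    vk::PhysicalDeviceMemoryBudgetPropertiesEXT budget{};
    vk::PhysicalDeviceMemoryProperties2 props{};
    props.pNext = &budget;                      // chain the budget query
    gpu.getMemoryProperties2(&props);
    vk::DeviceSize used  = budget.heapUsage[heapIndex];
    vk::DeviceSize total = budget.heapBudget[heapIndex];
    return used < total ? total - used : 0;     // 0: heap is already oversubscribed
}
```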
All three validation errors are now fixed. However, I'm leaving this open until I see someone successfully run it on an AMD Radeon on Windows.
This is an offline installer for Windows that has all three validation bugs fixed... Need intrepid testers to see if they can successfully get inference on an AMD GPU with this build: https://output.circle-artifacts.com/output/job/18f8093e-9e34-4293-b551-478c9163eee4/artifacts/0/build/upload/gpt4all-installer-win64.exe
Radeon RX 6700 + 23.10.2 doesn't help.
Single GPU shown in "vulkaninfo --summary" output as well as in the device drop-down menu.
Hi - thank you to the dev(s) looking into this issue for AMD GPU owners. We appreciate your time and efforts. I notice the issue title mentions "with some drivers" and wonder if there is a specific driver version that is known to work?
RADV on Linux is the configuration that we have been able to get working so far.
FYI, I'm now able to successfully reproduce this issue on an AMD Radeon 6800 XT on LINUX with the amdvlk driver and am looking to fix it.
FINALLY! It is a synchronization issue. When I define 'record' as 'eval' at the top of ggml-vulkan.cpp I get correct generation, but of course it is too slow. Now we finally have the right clue!!!
```diff
index cad334f..dc39cdc 100644
--- a/kompute/src/OpAlgoDispatch.cpp
+++ b/kompute/src/OpAlgoDispatch.cpp
@@ -32,9 +32,9 @@ OpAlgoDispatch::record(const vk::CommandBuffer& commandBuffer)
          this->mAlgorithm->getTensors()) {
         tensor->recordPrimaryBufferMemoryBarrier(
           commandBuffer,
-          vk::AccessFlagBits::eTransferWrite,
+          vk::AccessFlagBits::eShaderWrite,
           vk::AccessFlagBits::eShaderRead,
-          vk::PipelineStageFlagBits::eTransfer,
+          vk::PipelineStageFlagBits::eComputeShader,
           vk::PipelineStageFlagBits::eComputeShader);
     }
```
This fixes it, but it might affect generation speed...
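For context, the patch above changes the per-tensor barrier recorded between kernels from a transfer→compute dependency into a compute→compute one, so a dispatch that reads a buffer actually waits for the previous dispatch that wrote it. A minimal Vulkan-Hpp sketch of the equivalent standalone barrier (`recordComputeToComputeBarrier` is a hypothetical helper, not kompute API):

```cpp
#include <vulkan/vulkan.hpp>

// Record a compute->compute buffer barrier on `buf`: the next dispatch's shader
// reads wait for the previous dispatch's shader writes to complete and become visible.
void recordComputeToComputeBarrier(vk::CommandBuffer cmd, vk::Buffer buf, vk::DeviceSize size)
{
    vk::BufferMemoryBarrier barrier(
        vk::AccessFlagBits::eShaderWrite,        // srcAccessMask: prior dispatch wrote the buffer
        vk::AccessFlagBits::eShaderRead,         // dstAccessMask: next dispatch reads it
        VK_QUEUE_FAMILY_IGNORED, VK_QUEUE_FAMILY_IGNORED,
        buf, 0, size);
    cmd.pipelineBarrier(
        vk::PipelineStageFlagBits::eComputeShader,   // wait for the writing compute stage...
        vk::PipelineStageFlagBits::eComputeShader,   // ...before the reading compute stage starts
        {}, nullptr, barrier, nullptr);
}
```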
Here is a new offline installer that people can test to see if the recent bugfix resolves the issue. Please let me know!
Can confirm this works. I purged the newer drivers that I had installed and reinstalled the older drivers. I'm on 23.8.2 Adrenalin drivers now, using the build that manyoso posted above on Mistral OpenOrca 7B Q4_0. This is a 6800M laptop, which matches the ISA gfx1031. I get about 25-30 tokens/s. Edit: This doesn't mean I can load Q5_K_M models or larger (7B/13B) with this, though. Those still work on the CPU.
Windows 10 + Radeon 6700 + Adrenalin 23.10.2 - works for me now.
6800XT, Windows 10, 23.10.2 driver, Orca Mini Small. The good news is I am seeing a solid 3x speedup over using the CPU. Here are some results when running the same query twice: Edit: I don't see the "square" or "random word" when I switch to CPU-only generation.
I am able to select the GPU in the list, but it's not being used, and it reports not enough VRAM when VRAM is actually not being used at all.
It's all or nothing. You need to choose a smaller model that will fit within your 8 GB of VRAM.
This is definitely weird, because I wasn't able to load a Q4 model of airoboros-l2-13B; shouldn't a quantized model be possible to load in 12 GB of VRAM? Does the Vulkan backend use more VRAM compared with, say, CLBlast or ROCm?
Here's what's generally recommended:
However, keep in mind these are general recommendations. If layers are offloaded to the GPU, it will reduce RAM requirements and use VRAM instead. Please check the specific documentation for the model of your choice to ensure smooth operation.
Yes, but given gpt4all is using 4-bit quantization, that comfortably fits in 12 GB of VRAM when I use llama.cpp, and uses ~9 GB. I've been able to run Q5_K_M quants (uses ~11.2 GB) comfortably as well, so it is surprising that gpt4all is probably looking at the number of parameters and mapping those to the memory requirement whilst ignoring the quantization. I guess this is the reason I asked whether the Vulkan backend imparts a significant overhead compared to the other backends? (I'm a noob at this, so just trying to understand the differences that the backends make.)
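As a rough sanity check on those numbers (back-of-the-envelope only; the bits-per-weight figures are approximations and this is not how GPT4All actually budgets VRAM):

```cpp
#include <cstdio>

// Weights alone take roughly parameter_count * bits_per_weight / 8 bytes; the KV
// cache and scratch buffers (typically another 1-2 GB at default context) come on top.
int main()
{
    const double params13B = 13.0e9;
    const double q4_0Bits  = 4.5;   // Q4_0: 4-bit weights + per-block scales, ~4.5 bits/weight
    const double q5kmBits  = 5.7;   // Q5_K_M: assumed ballpark of ~5.5-6 bits/weight

    std::printf("13B Q4_0   weights: ~%.1f GB\n", params13B * q4_0Bits / 8 / 1e9);  // ~7.3 GB
    std::printf("13B Q5_K_M weights: ~%.1f GB\n", params13B * q5kmBits / 8 / 1e9);  // ~9.3 GB
    return 0;
}
```

Those figures line up with the ~9 GB and ~11.2 GB observed in llama.cpp once the cache and scratch buffers are added, and are well under what a parameter-count-only heuristic would assume.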
I have a 16 GB VRAM 6900 XT. It appears Mini Orca (small) works perfectly. I can in theory load all available models into memory. Falcon, which is pretty small, will give me the ;;;;;;;;;;;;;;;;;;;;;;;;;;;;; spam. It appears Falcon is 4.1 GB and Orca Mini is 1.9 GB in file size. Task Manager reports that my dedicated GPU memory doesn't go over 7.1 GB with Falcon loaded and used as it spams ;;;;;;;;;;;;;;. This issue may need to be reopened?
You are using GPT4All Falcon on Windows? What version of GPT4All?
v2.5.2 Windows, the latest available. Since Qt is hell to build.
drivers. Does not have any performance or fidelity effect on other GPU/driver combos I've tested. FIXES: nomic-ai/gpt4all#1507
System Info
This is specifically tracking issues that still happen after 2.5.0-pre1, which fixes at least some AMD device/driver combos that were reported broken in #1422 - re-add them here if they persist after the GGUF update.
######
reported with 2.5.0-pre1 on