...
main: number of training tokens: 634519
main: number of unique tokens: 10054
main: train data seems to have changed. restarting shuffled epoch.
main: begin training
main: work_size = 1281928 bytes (1.2 MB)
train_opt_callback: iter= 0 sample=1/4836 sched=0.000000 loss=0.000000 |->
src0->type: 14 dst->type: 0
GGML_ASSERT: ggml-cuda.cu:6193: false
finetune doesn't work with CUDA at the moment. It's supposed to dequantize the model weights in the optimizer, but for some reason I didn't quite get to the bottom of, that just doesn't happen. #4724
I suspect the missing dequantization is the issue, but I had to table it and was hoping someone else could pick it up. I may try again next week if I get time.
Expected Behavior
I built a Docker image (adding #4211) and wanted to run a finetune inside it. llama.cpp otherwise works in Docker for me.
Current Behavior
I ended up with: CUDA error 700 at ggml-cuda.cu:6963: an illegal memory access was encountered
Environment and Context
I use multiple GPUs (seven RTX 3090s with 24 GB VRAM each). The model does not fit on one, so I could not check whether the problem persists with a single device.
I built it like this:
Then run the finetune:
(I tried different CUDA_VISIBLE_DEVICES setups, such as 0,1,2.) This setup works for inference with main.
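For reference, CUDA_VISIBLE_DEVICES works the same way for finetune as for main: it masks which GPUs the process can enumerate, and the visible devices are renumbered from 0 inside the process. A minimal illustration (the commented-out finetune flags are assumptions for illustration, not taken from this report):

```shell
# Restrict the process to physical GPUs 0, 1 and 2; inside the process
# they appear as devices 0, 1 and 2 regardless of their physical IDs.
export CUDA_VISIBLE_DEVICES=0,1,2
echo "$CUDA_VISIBLE_DEVICES"    # prints: 0,1,2

# Then launch finetune as usual, e.g. (flags assumed, paths omitted):
# ./finetune --model-base <model.gguf> --train-data <train.txt> ...
```

Testing with `CUDA_VISIBLE_DEVICES=0` (a single device) is the usual way to rule out multi-GPU-specific bugs, which the reporter could not do here because the model does not fit on one card.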
The run looks like this: