TOT_CUDA="0,1,2,3" # Upgrade bitsandbytes to the latest version to enable balanced loading across multiple GPUs, e.g.: pip install bitsandbytes==0.39.0
BASE_MODEL="../models/llama-7B" # "decapoda-research/llama-13b-hf"
LORA_PATH="../output/lora-Vicuna-output-instruction" # "./lora-Vicuna/checkpoint-final"
USE_LOCAL=0 # 1: use local model, 0: use huggingface model
TYPE_WRITER=1 # whether to stream the output typewriter-style

if [[ $USE_LOCAL -eq 1 ]]
then
    cp sample/instruct/adapter_config.json $LORA_PATH
fi

CUDA_VISIBLE_DEVICES=0 python ../generate.py \
    --model_path $BASE_MODEL \
    --lora_path $LORA_PATH \
    --use_local $USE_LOCAL \
    --use_typewriter $TYPE_WRITER
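For context, the "balanced loading of multiple GPUs" that the bitsandbytes comment refers to happens on the Python side when the model is loaded. A minimal sketch of what such loading typically looks like (an assumption on my part: it uses transformers with accelerate and bitsandbytes installed, and is not the exact code in generate.py):

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

# Paths taken from generate.sh above.
BASE_MODEL = "../models/llama-7B"
LORA_PATH = "../output/lora-Vicuna-output-instruction"

tokenizer = LlamaTokenizer.from_pretrained(BASE_MODEL)
# device_map="auto" lets accelerate spread layers across every GPU listed
# in CUDA_VISIBLE_DEVICES; load_in_8bit requires a recent bitsandbytes.
model = LlamaForCausalLM.from_pretrained(
    BASE_MODEL,
    load_in_8bit=True,
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, LORA_PATH, torch_dtype=torch.float16)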
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/site-packages/gradio/routes.py", line 384, in run_predict
output = await app.get_blocks().process_api(
File "/root/miniconda3/lib/python3.8/site-packages/gradio/blocks.py", line 1032, in process_api
result = await self.call_function(
File "/root/miniconda3/lib/python3.8/site-packages/gradio/blocks.py", line 858, in call_function
prediction = await anyio.to_thread.run_sync(
File "/root/miniconda3/lib/python3.8/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/root/miniconda3/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/root/miniconda3/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/root/miniconda3/lib/python3.8/site-packages/gradio/utils.py", line 448, in async_iteration
return next(iterator)
File "/root/miniconda3/lib/python3.8/site-packages/gradio/interface.py", line 647, in fn
for output in self.fn(*args):
File "../generate.py", line 152, in evaluate
for generation_output in model.stream_generate(
File "/root/Chinese-Vicuna/utils.py", line 657, in stream_beam_search
outputs = self(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 156, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/peft/peft_model.py", line 529, in forward
return self.base_model(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 688, in forward
outputs = self.model(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 578, in forward
layer_outputs = decoder_layer(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 156, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 156, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 194, in forward
query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 156, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/peft/tuners/lora.py", line 358, in forward
result += self.lora_B(self.lora_A(self.lora_dropout(x))) * self.scaling
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
return forward_call(*input, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/accelerate/hooks.py", line 156, in new_forward
output = old_forward(*args, **kwargs)
File "/root/miniconda3/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: expected scalar type Half but found Float
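The RuntimeError above is a dtype mismatch: the base model runs in fp16 (Half), while peft keeps the LoRA A/B matrices in fp32 (Float) by default, so F.linear receives mixed dtypes inside the LoRA forward. Two common workarounds, shown as a hedged sketch only (not an official fix from this repo; input_ids stands in for an already-tokenized prompt tensor on the GPU):

# Option 1: cast the whole model, including the fp32 LoRA weights, to fp16
# so every F.linear sees a single dtype (skip this if the base model was
# loaded in 8-bit).
model.half()

# Option 2: keep the mixed dtypes and let autocast insert the conversions
# at runtime during generation.
with torch.autocast("cuda"):
    generation_output = model.generate(input_ids=input_ids, max_new_tokens=256)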
Before asking a question, please make sure you have read these notes first!!!
I have read them and did not find the same issue reported.
If you run into a problem and need our help, you can describe your situation from the following angles so that we can understand or reproduce your error (learning how to ask a good question not only helps us understand you, it is also a self-check process):
1. Which script did you use, and with what command?
bash generate.sh
2. What were your parameters (script parameters, command-line parameters)?
3. Did you modify our code?
I only changed the parameters in generate.sh; I did not modify the code.
4. Which dataset did you use?
About 5,700 code snippets that I scraped from the web myself, used to train code generation.
Next, you can describe your problem from the environment angle; some of these issues may already be covered in the "related issues and solutions" section of the readme:
1. Which operating system?
Ubuntu 20.04.4 LTS
2. Which GPU(s), and how many?
A single RTX 3090
3. Python version?
3.8.10
4. Versions of the various Python libraries?
You can also describe your problem from the runtime angle:
1. What is the error message, and which code raised it (you can send us the complete error output)?
On the Gradio web page, clicking the Submit button triggers the error above when the predict endpoint is called.
2. Are the GPU and CPU working normally?
Yes, normal.
Here is my training information:
Training script: bash finetune.sh
I also changed the following configuration in finetune.py: