After adding outlines to the environment, starting the OpenAI API server fails with this error:
root@dualamd:/app# python -m vllm.entrypoints.openai.api_server \
> --gpu-memory-utilization 0.98 --trust-remote-code \
> --served-model-name gpt-3.5-turbo-1106 \
> --max-model-len 32768 --model Qwen1.5-14B
Traceback (most recent call last):
File "/opt/conda/envs/py_3.9/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/conda/envs/py_3.9/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/vllm-0.3.3+rocm603-py3.9-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line 23, in <module>
from vllm.entrypoints.openai.serving_chat import OpenAIServingChat
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/vllm-0.3.3+rocm603-py3.9-linux-x86_64.egg/vllm/entrypoints/openai/serving_chat.py", line 15, in <module>
from vllm.model_executor.guided_decoding import get_guided_decoding_logits_processor
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/vllm-0.3.3+rocm603-py3.9-linux-x86_64.egg/vllm/model_executor/guided_decoding.py", line 12, in <module>
from vllm.model_executor.guided_logits_processors import JSONLogitsProcessor, RegexLogitsProcessor
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/vllm-0.3.3+rocm603-py3.9-linux-x86_64.egg/vllm/model_executor/guided_logits_processors.py", line 23, in <module>
from outlines.fsm.fsm import RegexFSM
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/outlines/__init__.py", line 2, in <module>
import outlines.generate
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/outlines/generate/__init__.py", line 1, in <module>
from .api import SequenceGenerator
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/outlines/generate/api.py", line 5, in <module>
from outlines.fsm.fsm import FSMState
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/outlines/fsm/fsm.py", line 9, in <module>
from outlines.fsm.regex import create_fsm_index_tokenizer, make_deterministic_fsm
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/outlines/fsm/regex.py", line 5, in <module>
import numba
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numba/__init__.py", line 43, in <module>
from numba.np.ufunc import (vectorize, guvectorize, threading_layer,
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numba/np/ufunc/__init__.py", line 3, in <module>
from numba.np.ufunc.decorators import Vectorize, GUVectorize, vectorize, guvectorize
File "/opt/conda/envs/py_3.9/lib/python3.9/site-packages/numba/np/ufunc/decorators.py", line 3, in <module>
from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception
It seems numba only supports CUDA, not ROCm.
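One way to narrow this down is to import numba on its own, outside vLLM, and record the installed numpy and numba versions next to the result. The sketch below does that. The helper name `probe_import` is hypothetical, not part of vLLM, outlines, or numba. This particular SystemError is often reported when numba and NumPy versions are mismatched, but whether that applies here is an assumption the traceback does not confirm.

```python
import importlib
import importlib.metadata as md


def probe_import(module_name, dist_names=("numpy", "numba")):
    """Import one module in isolation and report installed package versions.

    Hypothetical diagnostic helper: it only collects information and does
    not change the environment.
    """
    report = {"module": module_name, "ok": False, "error": None, "versions": {}}

    # Record installed versions without importing the packages themselves.
    for dist in dist_names:
        try:
            report["versions"][dist] = md.version(dist)
        except md.PackageNotFoundError:
            report["versions"][dist] = None

    # Attempt the import and capture whatever exception it raises.
    try:
        importlib.import_module(module_name)
        report["ok"] = True
    except Exception as exc:  # e.g. SystemError or ImportError
        report["error"] = f"{type(exc).__name__}: {exc}"

    return report


if __name__ == "__main__":
    print(probe_import("numba"))
```

If the bare `import numba` fails the same way, the problem is in the numba install itself rather than in vLLM's guided decoding code, and the reported versions are the first thing to compare.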
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!
requirements-rocm.txt missing outlines