AsyncLLM engine
#1720
I believe you need to return the model variable from your function?
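A minimal sketch of that point, assuming nothing beyond plain Python: a function with no `return` statement evaluates to `None`, so assigning its result to the global name replaces the engine that was just created. `FakeEngine`, `init_without_return`, and `init_with_return` are hypothetical stand-ins (not part of vLLM) so the example runs without a GPU or the library installed.

```python
class FakeEngine:
    """Hypothetical stand-in for AsyncLLMEngine (not a vLLM class)."""

    def __init__(self, model: str) -> None:
        self.model = model


engine = None


def init_without_return():
    # Sets the global, but returns nothing, so the call evaluates to None.
    global engine
    engine = FakeEngine("demo-model")


def init_with_return() -> FakeEngine:
    # Builds the object and hands it back to the caller.
    return FakeEngine("demo-model")


# The global is set inside the function, then immediately overwritten with None.
engine = init_without_return()
broken = engine

# Returning the object keeps the assignment meaningful.
engine = init_with_return()
fixed = engine

print(type(broken).__name__)
print(type(fixed).__name__)
```

Either adding a `return` to the initializer or calling it without reassigning the global should avoid the `NoneType` result.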
I am trying to use the AsyncLLMEngine to stream the output. However, I get
<method-wrapper '__getattribute__' of NoneType object at 0x9076e0>
when I run this code:

    def initialize_model():
        global zephyr_model
        print("Initializing model and tokenizer...")
        login(token="")
        engine_args = AsyncEngineArgs(
            model=model_name,
            tokenizer=model_name,
            engine_use_ray=False,
            trust_remote_code=True,
            max_num_seqs=1024,
            gpu_memory_utilization=0.95,
            seed=42)
        zephyr_model = AsyncLLMEngine.from_engine_args(engine_args)

    zephyr_model = initialize_model()
    print(zephyr_model.__getattribute__)

I would like to test the streaming functionality locally before using it in an endpoint. Has anyone solved this issue?