AsyncLLM engine #1720

aswain571 · 2023-11-19T17:59:05Z

I am trying to use AsyncLLMEngine to stream the output. However, I get <method-wrapper '__getattribute__' of NoneType object at 0x9076e0> when I run this code:

def initialize_model():
    global zephyr_model
    print("Initializing model and tokenizer...")
    login(token="")
    engine_args = AsyncEngineArgs(
        model=model_name,
        tokenizer=model_name,
        engine_use_ray=False,
        trust_remote_code=True,
        max_num_seqs=1024,
        gpu_memory_utilization=0.95,
        seed=42)
    zephyr_model = AsyncLLMEngine.from_engine_args(engine_args)

zephyr_model = initialize_model()
print(zephyr_model.__getattribute__)

I would like to test the streaming functionality locally before using it in an endpoint. Has anyone solved this issue?

simon-mo (Maintainer) · 2023-11-19T21:43:26Z

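I believe you need to return the model from your function. As written, initialize_model() has no return statement, so it returns None, and the assignment zephyr_model = initialize_model() rebinds the global to None right after the function set it.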
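To make that concrete, here is a minimal sketch of the pitfall and the fix that runs without a GPU. StubEngine is a hypothetical stand-in, not part of vLLM: its generate method only imitates the general shape of an async generator that yields progressively longer text. The names initialize_model_buggy, stream, and the "req-0" request ID are likewise made up for illustration. In the real code the fixed function would return the result of AsyncLLMEngine.from_engine_args(engine_args) instead.

```python
import asyncio


class StubEngine:
    """Hypothetical stand-in for an async engine (NOT the real vLLM API)."""

    async def generate(self, prompt, request_id):
        # Yield the cumulative text after each fake "token".
        text = ""
        for token in ["Hello", ",", " world"]:
            text += token
            await asyncio.sleep(0)  # hand control back to the event loop
            yield text


def initialize_model_buggy():
    # Mirrors the original: sets a global but has no return statement,
    # so calling it evaluates to None.
    global zephyr_model
    zephyr_model = StubEngine()


def initialize_model():
    # Fix: return the engine so the caller's assignment receives it.
    # With vLLM this line would return the engine built from engine_args.
    return StubEngine()


async def stream(engine, prompt):
    # Collect each partial result as it arrives from the async generator.
    chunks = []
    async for partial in engine.generate(prompt, request_id="req-0"):
        chunks.append(partial)
    return chunks


# Buggy pattern: the assignment overwrites the global with None.
zephyr_model = initialize_model_buggy()
broken = zephyr_model

# Fixed pattern: the returned engine is usable for streaming.
zephyr_model = initialize_model()
chunks = asyncio.run(stream(zephyr_model, "Say hello"))
for chunk in chunks:
    print(chunk)
```

Dropping the global statement and returning the engine keeps the function's result and the variable in sync. Keeping global and calling initialize_model() without assigning its result would also work, but returning the value is easier to test.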
