Adding GPT-NeoX #164
Update: I just noticed the missing part and changed it, which fixes the old issue, but now I get a new error -
I did some further testing. This example runs perfectly: https://github.com/aflah02/sglang/blob/main/examples/usage/choices_logprob.py
@merrymercy Any thoughts? I'm not sure why one tutorial works while the other doesn't.
@merrymercy
@merrymercy Any thoughts on what might be going wrong here? I don't know whether a template can cause such breaking issues.
@aflah02 I have no idea. I typically debug these kinds of weird bugs by comparing intermediate tensors layer by layer between the sglang and huggingface/vllm implementations, similar to your print statements.
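The layer-by-layer comparison described above can be sketched roughly as follows. This is only an illustration, not code from sglang, huggingface, or vllm: the function name `first_divergent_layer`, the layer names, and the activation values are all hypothetical, and it assumes the intermediate tensors from each implementation have already been dumped (for example via print statements) and flattened into plain lists keyed by layer name, in forward order.

```python
from typing import Dict, List, Optional, Tuple


def first_divergent_layer(
    reference: Dict[str, List[float]],
    candidate: Dict[str, List[float]],
    atol: float = 1e-3,
) -> Optional[Tuple[str, float]]:
    """Return (layer_name, max_abs_diff) for the first layer that differs, else None."""
    # Dicts preserve insertion order, so layers are visited in forward order.
    for name, ref_values in reference.items():
        cand_values = candidate.get(name)
        # A missing layer or a size mismatch counts as an immediate divergence.
        if cand_values is None or len(cand_values) != len(ref_values):
            return name, float("inf")
        max_diff = max(
            (abs(r - c) for r, c in zip(ref_values, cand_values)), default=0.0
        )
        if max_diff > atol:
            return name, max_diff
    return None


# Hypothetical activations dumped from the two implementations.
hf_dump = {"embed": [0.10, 0.20], "layer0.attn": [0.50, -0.30], "layer0.mlp": [1.00, 2.00]}
sgl_dump = {"embed": [0.10, 0.20], "layer0.attn": [0.50, 0.30], "layer0.mlp": [1.00, 2.50]}

result = first_divergent_layer(hf_dump, sgl_dump)
print(result)
```

Reporting only the first divergent layer is deliberate: once one layer is wrong, every later layer will usually differ too, so the earliest mismatch is the one worth inspecting.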
@merrymercy Sorry for being inactive; life got really busy over the past few months. I don't have the bandwidth to take this on right now, so if you want to, feel free to work on it.
I will close this for now.
I followed the instructions here to add GPT-NeoX support, which would bring support for the Pythia model family and other models with a similar architecture.
Reference: #157 (comment)
FIXED (Keeping Logs for Future Reference):
I was able to debug most errors, but I'm stuck on this particular error, which happens once I start sending requests to the endpoint (i.e., I assume the model loads correctly) -
Any idea what might be going wrong? The error seems to be related to the LogitProcessor, which I'm not very familiar with. I tried to copy the corresponding logic from the llama implementation.