Add documents on how to add new models #65
No description provided.
yukavio pushed a commit to yukavio/vllm that referenced this issue on Jul 3, 2024:

SUMMARY:
* update "set-env" action to set HF_HOME
* add action to mount HF cache
* pin some tests to devices "0,1" as this enables about 2k more test points

TEST PLAN: runs on remote push

Co-authored-by: andy-neuma <andy@neuralmagic.com>
fialhocoelho pushed a commit to fialhocoelho/vllm that referenced this issue on Jul 8, 2024:

Sync with upstream@v0.5.1-10-g16620f43
dllehr-amd pushed a commit to dllehr-amd/vllm that referenced this issue on Jul 22, 2024.
JHLEE17 pushed a commit to JHLEE17/vllm that referenced this issue on Aug 1, 2024.
pi314ever pushed a commit to pi314ever/vllm that referenced this issue on Jan 17, 2025:

remove expert_max hard code (vllm-project#47)
vLLM-Ext: Full enabling of ALiBi (vllm-project#34)
Add version inference via setuptools-scm (vllm-project#58)
Revert "vLLM-Ext: Full enabling of ALiBi (vllm-project#34)" (vllm-project#59)
Remove punica_hpu.py from vllm_hpu_extension (vllm-project#66)
Removed previous (not-pipelined) pa implementation (vllm-project#72)
Add flag to enable running softmax in fp32 (vllm-project#71)
Update calibration readme link (vllm-project#73)
allow lm_head quantization in calibration process (vllm-project#65)
Pad to bmin if value is less (vllm-project#67)
Update pyproject.toml (HabanaAI#75)

Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>