Use pytest for unit tests #58

WoosukKwon · 2023-05-03T02:58:52Z

No description provided.

SUMMARY:
* update runner script to use `--forked` for tests in `tests/distributed`
* enable all test points in `test_basic_distributed_correctness.py`
* pin GPU while running kernel tests. using `CUDA_VISIBLE_DEVICES=0` when running `tests/kernels` and `tests/samplers`

TEST PLAN: runs on remote push modulo formatting ... this gets us a bit further ... https://neuralmagic.testmo.net/automation/runs/view/8887

---------

Co-authored-by: andy-neuma <andy@neuralmagic.com>
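For illustration only, the runner changes summarized in that commit message could be sketched roughly as below. The actual runner script is not shown in this thread, so the function name `build_test_commands` and the overall structure are hypothetical. The `--forked` flag is assumed to come from the `pytest-forked` plugin, and the sketch only builds and prints the commands rather than executing them.

```python
import os
import shlex


def build_test_commands(base_env=None):
    """Return (env, argv) pairs, one per test directory.

    Hypothetical helper: the directory names come from the commit
    message above; everything else is an assumption.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Pin kernel and sampler tests to a single GPU.
    pinned = {**env, "CUDA_VISIBLE_DEVICES": "0"}
    return [
        # Distributed tests run each test in a forked subprocess
        # (assumes the pytest-forked plugin is installed).
        (env, ["pytest", "--forked", "tests/distributed"]),
        (pinned, ["pytest", "tests/kernels"]),
        (pinned, ["pytest", "tests/samplers"]),
    ]


if __name__ == "__main__":
    for env, argv in build_test_commands():
        prefix = ""
        if env.get("CUDA_VISIBLE_DEVICES") == "0":
            prefix = "CUDA_VISIBLE_DEVICES=0 "
        # Print only; a real runner might pass env and argv to subprocess.run.
        print(prefix + shlex.join(argv))
```

Keeping the environment separate from the argument list means the GPU pinning applies only to the directories that need it, while the distributed tests still see whatever devices the caller exposed.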


remove expert_max hard code (vllm-project#47)
vLLM-Ext: Full enabling of ALiBi (vllm-project#34)
Add version inference via setuptools-scm (vllm-project#58)
Revert "vLLM-Ext: Full enabling of ALiBi (vllm-project#34)" (vllm-project#59)
Remove punica_hpu.py from vllm_hpu_extension (vllm-project#66)
Removed previous (not-pipelined) pa implementation (vllm-project#72)
Add flag to enable running softmax in fp32 (vllm-project#71)
Update calibration readme link (vllm-project#73)
allow lm_head quantization in calibration process (vllm-project#65)
Pad to bmin if value is less (vllm-project#67)
Update pyproject.toml (HabanaAI#75)

---------

Co-authored-by: Michał Kuligowski <mkuligowski@habana.ai>

WoosukKwon added the P1 label May 10, 2023

WoosukKwon self-assigned this May 17, 2023

WoosukKwon mentioned this issue May 18, 2023

Use pytest format for kernel tests #107

Merged

WoosukKwon closed this as completed in #107 May 18, 2023

shanshanpt mentioned this issue Nov 17, 2023

Run long conetxt error : CUDA error: an illegal memory access was encountered #1700

Closed

junior-zsy mentioned this issue Nov 20, 2023

Error with 32k Long Text in chatglm2-6b-32k Model #1725

Closed

yuhuixu1993 mentioned this issue Jun 2, 2024

[Bug]: loading squeezellm model #5190

Closed

ZHJ19970917 mentioned this issue Jul 14, 2024

[Bug]: When using qwen-32b-chat-awq with multi-threaded access, errors occur after approximately several hundred visits.”vllm.engine.async_llm_engine.AsyncEngineDeadError: Background loop has errored already.“ #6421

Closed

JHLEE17 pushed a commit to JHLEE17/vllm that referenced this issue Aug 1, 2024

Disable value splitting on G3 (vllm-project#58)

47c0c5b

JHLEE17 pushed a commit to JHLEE17/vllm that referenced this issue Aug 1, 2024

Revert "Disable value splitting on G3 (vllm-project#58)" (vllm-projec…

4a45bbf

…t#74) This reverts commit 47c0c5b.

alixiaodi mentioned this issue Aug 2, 2024

[Bug]: #7072

Closed
