[Bugfix] update neuron for version > 0.5.0 #7175

omrishiv · 2024-08-05T23:58:14Z


This is the first of multiple PRs to address some neuron issues.

vLLM >= 0.5.1 refactored WorkerBase. This PR adds the missing abstract method execute_worker, whose absence was causing failures, and addresses a few other inconsistencies. It also expands the block-size choices, because neuron needs block-size = sequence-length. At the moment, tensor parallelism/model parallelism are not supported; that will come in the next PR.
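For readers unfamiliar with this failure mode, here is a minimal sketch of how a missing abstract method breaks a worker subclass. It uses only the standard-library `abc` module; `WorkerBaseSketch`, `IncompleteWorker`, `FixedWorker`, and `can_instantiate` are hypothetical names for illustration and are not vLLM's actual classes or signatures.

```python
from abc import ABC, abstractmethod


class WorkerBaseSketch(ABC):
    """Hypothetical stand-in for a worker base class (not vLLM's code)."""

    @abstractmethod
    def execute_worker(self, worker_input):
        """Every concrete worker must provide this method."""


class IncompleteWorker(WorkerBaseSketch):
    """Does not define execute_worker, so it stays abstract."""


class FixedWorker(WorkerBaseSketch):
    """Defines the abstract method, so it can be instantiated."""

    def execute_worker(self, worker_input):
        # Placeholder body; a real worker would act on worker_input here.
        return None


def can_instantiate(cls) -> bool:
    """Return True if cls() succeeds, False if Python raises TypeError."""
    try:
        cls()
        return True
    except TypeError:
        # Raised when a class still has unimplemented abstract methods.
        return False


print(can_instantiate(IncompleteWorker))
print(can_instantiate(FixedWorker))
```

Python raises the error at instantiation time rather than at class definition, which is why a refactor of a base class can surface only once the subclass is actually constructed.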

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

github-actions · 2024-08-05T23:58:24Z

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which consists of a small, essential subset of CI tests to quickly catch errors. You can run other CI tests on top of the default ones by unblocking the steps in your fast-check build in the Buildkite UI.

Once the PR is approved and ready to go, please make sure to run full CI as it is required to merge (or just use auto-merge).

To run full CI, you can do one of these:

- Comment /ready on the PR
- Add the ready label to the PR
- Enable auto-merge

🚀


liangfu

Thank you @omrishiv for fixing the NeuronWorker issue. The proposed fix looks good.

Would you rebase onto the latest main branch (which tests against Neuron SDK 2.19 since #6832) and help triage the CI/CD issue as well?


omrishiv · 2024-08-14T17:03:18Z

@liangfu I merged in main and tests are now failing. Is this the CI/CD issue you mean?
Edit: apparently I hadn't properly merged in main. Now I'm not seeing the neuron test issue.


Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

congcongchen123 · 2024-08-28T18:26:07Z

vllm/engine/arg_utils.py

@@ -293,7 +293,7 @@ def add_cli_args(parser: FlexibleArgumentParser) -> FlexibleArgumentParser:
        parser.add_argument('--block-size',
                            type=int,
                            default=EngineArgs.block_size,
-                            choices=[8, 16, 32],
+                            choices=[8, 16, 32, 128, 256, 512, 1024, 2048],


I am curious why we skip 64. Is there a specific reason? Thanks

omrishiv · 2024-08-30

This may have been an oversight during testing, but it has been addressed by #7562.
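The effect of the `choices` list in the diff above can be reproduced with the standard library. This is a rough sketch that uses plain `argparse.ArgumentParser` rather than vLLM's `FlexibleArgumentParser`, and omits the default value; `build_parser` and `accepts` are hypothetical helpers for illustration only.

```python
import argparse

# Choice lists copied from the two sides of the diff.
OLD_CHOICES = [8, 16, 32]
NEW_CHOICES = [8, 16, 32, 128, 256, 512, 1024, 2048]


def build_parser(choices):
    """Build a parser whose --block-size accepts only the given values."""
    # exit_on_error=False (Python 3.9+) raises instead of exiting the process.
    parser = argparse.ArgumentParser(exit_on_error=False)
    parser.add_argument('--block-size', type=int, choices=choices)
    return parser


def accepts(choices, value) -> bool:
    """Return True if --block-size <value> parses under the given choices."""
    try:
        build_parser(choices).parse_args(['--block-size', str(value)])
        return True
    except (argparse.ArgumentError, SystemExit):
        # argparse rejects any value that is not listed in `choices`.
        return False


print(accepts(OLD_CHOICES, 128))
print(accepts(NEW_CHOICES, 128))
print(accepts(NEW_CHOICES, 64))
```

Because `choices` is an explicit allow-list, any value left out of it (such as 64 in this revision) is rejected at argument-parsing time, before the engine is configured.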

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com> Signed-off-by: Alvant <alvasian@yandex.ru>


update neuron for 0.5.1

caf5032

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

format

bf47bc6

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

youkaichao requested a review from liangfu August 6, 2024 00:23

yapf

418b5ae

njhill added the aws-neuron Related to AWS Inferentia & Trainium label Aug 6, 2024

liangfu approved these changes Aug 14, 2024


Merge branch 'main' into neuron-6269-pt-1

1a621fa

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

Merge branch 'main' into neuron-6269-pt-1

440055e

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com>

simon-mo merged commit 9c1f78d into vllm-project:main Aug 15, 2024
28 checks passed

omrishiv mentioned this pull request Aug 15, 2024

[Bugfix] neuron: enable tensor parallelism #7562

Merged

omrishiv deleted the neuron-6269-pt-1 branch August 15, 2024 17:33

kylesayrs pushed a commit to neuralmagic/vllm that referenced this pull request Aug 17, 2024

[Bugfix] update neuron for version > 0.5.0 (vllm-project#7175)

c1402c3

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

fialhocoelho pushed a commit to opendatahub-io/vllm that referenced this pull request Aug 22, 2024

[Bugfix] update neuron for version > 0.5.0 (vllm-project#7175)

290c598

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>

congcongchen123 reviewed Aug 28, 2024


KuntaiDu pushed a commit to KuntaiDu/vllm that referenced this pull request Nov 20, 2024

[Bugfix] update neuron for version > 0.5.0 (vllm-project#7175)

e8a5fd7

Signed-off-by: omrishiv <327609+omrishiv@users.noreply.github.com> Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
