Feature: dynamic shared mem moe_align_block_size_kernel #3376

akhoroshev
Contributor

I encountered compilation errors related to insufficient shared memory size when I tried to increase NUM_MAX_EXPERTS to 128.

@zwd003
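To illustrate why raising the expert count can break compilation, here is a rough host-side sketch of the shared-memory arithmetic. The buffer layout (a `(num_experts + 1) x num_experts` table of `int32_t` counters plus a `(num_experts + 1)`-entry cumulative-sum array) is an assumption about the kernel and is not confirmed in this thread. The helper names `shared_bytes_needed` and `fits_static` are hypothetical. The 48 KB figure is CUDA's per-block limit on statically declared `__shared__` memory.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed layout (not confirmed by this thread): a (num_experts + 1) x num_experts
// table of int32 token counters, plus a (num_experts + 1)-entry int32 cumulative sum.
constexpr std::size_t shared_bytes_needed(std::size_t num_experts) {
  return ((num_experts + 1) * num_experts + (num_experts + 1)) *
         sizeof(std::int32_t);
}

// CUDA limits statically declared __shared__ memory to 48 KB per block.
constexpr std::size_t kStaticSharedLimit = 48 * 1024;

// True when the assumed buffers fit in a static __shared__ declaration.
constexpr bool fits_static(std::size_t num_experts) {
  return shared_bytes_needed(num_experts) <= kStaticSharedLimit;
}
```

Under these assumptions, 64 experts need 16,900 bytes and fit, while 128 experts need 66,564 bytes and exceed the static limit. Allocations above 48 KB have to use dynamic shared memory (an `extern __shared__` array sized at launch), with the kernel opted in through `cudaFuncSetAttribute` and `cudaFuncAttributeMaxDynamicSharedMemorySize`, which is the direction the PR title describes.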

@WoosukKwon
Collaborator

@pcmoritz Can you take a look? The PR looks good to me.

Collaborator

@pcmoritz pcmoritz left a comment

Thanks, the PR looks good to me, feel free to merge @WoosukKwon !

Collaborator

@WoosukKwon WoosukKwon left a comment

@akhoroshev Thanks for submitting the PR! Left minor comments.

@WoosukKwon
Collaborator

Also, just curious: which MoE model are you using? Is there any public model with more than 128 experts?

@akhoroshev
Contributor Author

> Also, just curious: which MoE model are you using? Is there any public model with more than 128 experts?

This is not a public model

@akhoroshev
Contributor Author

@WoosukKwon comments taken into account

Collaborator

@WoosukKwon WoosukKwon left a comment

@akhoroshev LGTM! Thanks for the PR!

@WoosukKwon WoosukKwon merged commit 78b6c48 into vllm-project:main Mar 15, 2024
22 of 24 checks passed
simon-mo added a commit to simon-mo/vllm that referenced this pull request Mar 16, 2024
simon-mo added a commit to simon-mo/vllm that referenced this pull request Mar 18, 2024
3 participants