Question about the algorithm's principles #1332

Open
elesun2018 opened this issue Dec 4, 2024 · 1 comment

@elesun2018

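What is the dimension order of the data shape in the Qwen large model?
Is it [s, b, h] or [b, s, h]?
Why is it designed this way?
hidden_states: the hidden-state tensor fed into this layer, with shape [s, b, h], where s is the sequence length, b is the batch size, and h is the hidden dimension.
Thanks.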
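
To make the two layouts in the question concrete, here is a minimal pure-Python sketch of how the same values sit in memory under [s, b, h] versus [b, s, h], assuming a contiguous row-major buffer. The sizes and helper names (`row_major_strides`, `flat_offset`, `sbh_to_bsh`) are made up for illustration and are not taken from the Qwen code; the sketch does not say which layout a given Qwen code path uses.

```python
def row_major_strides(shape):
    """Element strides of a contiguous row-major (C-order) array."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_offset(index, strides):
    """Position of a multi-dimensional index inside the flat buffer."""
    return sum(i * st for i, st in zip(index, strides))


# Hypothetical sizes: sequence length, batch size, hidden dimension.
s, b, h = 4, 2, 3

sbh_strides = row_major_strides((s, b, h))  # sequence-first layout
bsh_strides = row_major_strides((b, s, h))  # batch-first layout


def sbh_to_bsh(flat_sbh):
    """Reorder a flat [s, b, h] buffer into a flat [b, s, h] buffer."""
    return [
        flat_sbh[flat_offset((t, n, k), sbh_strides)]
        for n in range(b)
        for t in range(s)
        for k in range(h)
    ]


# Fill an [s, b, h] buffer with distinct values, then reorder it.
flat_sbh = list(range(s * b * h))
flat_bsh = sbh_to_bsh(flat_sbh)

# Where does token t=1 of sample n=1 start in each layout?
start_sbh = flat_offset((1, 1, 0), sbh_strides)
start_bsh = flat_offset((1, 1, 0), bsh_strides)
print("strides [s, b, h]:", sbh_strides)
print("strides [b, s, h]:", bsh_strides)
print("token (t=1, n=1) starts at", start_sbh, "vs", start_bsh)
```

In the [s, b, h] buffer, the hidden vectors of every sample at one token position are adjacent; in the [b, s, h] buffer, all token positions of one sample are adjacent. The values are the same either way, so converting between the two is a transpose of the first two axes.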

github-actions bot commented Jan 4, 2025

This issue has been automatically marked as inactive due to lack of recent activity. Should you believe it remains unresolved and warrants attention, kindly leave a comment on this thread.
