muP over depth #49
Unanswered
AlaaKhaddaj asked this question in Q&A
Replies: 0 comments
Hello,
As the latest TP paper points out, muP works (empirically) over depth too. Does the current repo support that? Basically, `set_base_shapes` zips a base model and a wider model, and as such can set the `p.infshape` for each param `p`. How can this be adapted to support depth?

Thanks!
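To make the question concrete, here is a rough sketch of how I understand the zipping step, written in plain Python rather than with mup itself. The function `infer_inf_dims` and the shape dictionaries are hypothetical names used only for illustration, and the real `set_base_shapes` may match parameters differently. The assumption is that a dimension counts as width-like when it differs between the base and the target model.

```python
# Hypothetical stand-in for the zip step; NOT mup's actual implementation.
def infer_inf_dims(base_shapes, target_shapes):
    """Pair each target parameter with its base shape, by name.

    Returns {name: [(target_dim, base_dim, is_width_like), ...]}.
    Raises KeyError if a target parameter has no counterpart in the base model.
    """
    missing = set(target_shapes) - set(base_shapes)
    if missing:
        raise KeyError(f"no base shape for: {sorted(missing)}")
    result = {}
    for name, target in target_shapes.items():
        base = base_shapes[name]
        # A dimension is assumed width-like if it changed relative to the base.
        result[name] = [(t, b, t != b) for t, b in zip(target, base)]
    return result

# Toy shape tables (assumed layer names and sizes).
base = {"layers.0.weight": (64, 64), "readout.weight": (10, 64)}
wide = {"layers.0.weight": (256, 256), "readout.weight": (10, 256)}
deep = {**wide, "layers.1.weight": (256, 256)}  # one extra layer

# Width scaling: every parameter has a partner in the base model.
wide_info = infer_inf_dims(base, wide)

# Depth scaling: the extra layer has nothing to zip against.
try:
    infer_inf_dims(base, deep)
    depth_ok = True
except KeyError:
    depth_ok = False
```

In this toy version the wider model zips cleanly, while the deeper one fails because `layers.1.weight` has no base shape to pair with. That mismatch is the part I am unsure how to handle.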