Fix num_hidden_layers in initialization of new model in Mamba (#30403)
Fix num_hidden_layers in initialization

Originally, the initialization was using config.num_layers instead of config.num_hidden_layers. This fixes that.
SrGonao authored May 20, 2024
1 parent 1c2bb3a commit 1834916
Showing 1 changed file with 1 addition and 1 deletion.
src/transformers/models/mamba/modeling_mamba.py (2 changes: 1 addition, 1 deletion)
@@ -399,7 +399,7 @@ def _init_weights(self, module):
                 # Having just p *= scale would repeatedly scale it down
                 nn.init.kaiming_uniform_(p, a=math.sqrt(5))
                 with torch.no_grad():
-                    p /= math.sqrt(self.config.num_layers)
+                    p /= math.sqrt(self.config.num_hidden_layers)


 @dataclass
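For readers outside the diff, here is a minimal, self-contained sketch of what the touched code does. It uses a hypothetical ToyConfig and ToyMixer in place of the real MambaConfig and Mamba mixer block, and it paraphrases (rather than copies) the surrounding _init_weights logic, so treat it as an illustration of the scaled-initialization step, not the library's exact implementation:

```python
import math

import torch
import torch.nn as nn


class ToyConfig:
    """Hypothetical stand-in for MambaConfig; only the field this fix touches."""
    num_hidden_layers = 12


class ToyMixer(nn.Module):
    """Hypothetical block with an output projection, mirroring the parameter
    that the initialization loop rescales."""
    def __init__(self, hidden_size: int = 64):
        super().__init__()
        self.out_proj = nn.Linear(hidden_size, hidden_size, bias=False)


config = ToyConfig()
module = ToyMixer()

for name, p in module.named_parameters():
    if name == "out_proj.weight":
        # Re-initialize the projection, then divide once by sqrt(depth) so
        # residual-branch contributions stay bounded as layers are stacked.
        # As the original comment notes, just doing `p *= scale` would
        # repeatedly scale it down, hence the fresh init before one division.
        nn.init.kaiming_uniform_(p, a=math.sqrt(5))
        with torch.no_grad():
            # The fix: read config.num_hidden_layers (the attribute the
            # commit message says the config defines) instead of
            # config.num_layers.
            p /= math.sqrt(config.num_hidden_layers)

print(module.out_proj.weight.std())
```

With 12 hypothetical layers, the division shrinks the freshly initialized out_proj weights by a factor of sqrt(12), which is the per-layer rescaling the changed line applies; the bug was only in which config attribute supplied the layer count.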
