Removed the redundant SiLUActivation class. (#27136)
* Removed the redundant SiLUActivation class; nn.functional.silu is now used directly.

* I apologize for adding torch.functional.silu. I have replaced it with nn.SiLU.
hisushanta authored Nov 2, 2023
1 parent 00d8502 commit 4991216
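
The removed class was a thin wrapper whose forward simply called nn.functional.silu, so PyTorch's built-in nn.SiLU is a drop-in replacement. A minimal standalone sketch (assuming a local PyTorch install; this code is not part of the commit) checking the equivalence against the SiLU definition x * sigmoid(x):

# Standalone check, not part of the commit: the removed SiLUActivation.forward
# delegated to nn.functional.silu, which is exactly what nn.SiLU() computes.
import torch
from torch import nn

x = torch.randn(4)

old_style = nn.functional.silu(x)   # what SiLUActivation.forward returned
new_style = nn.SiLU()(x)            # the replacement used after this commit
reference = x * torch.sigmoid(x)    # SiLU/Swish definition: x * sigmoid(x)

assert torch.allclose(old_style, new_style)
assert torch.allclose(old_style, reference)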
17 changes: 2 additions & 15 deletions src/transformers/activations.py
@@ -137,19 +137,6 @@ def forward(self, input: Tensor) -> Tensor:
         return 0.5 * input * (1 + torch.tanh(self.precomputed_constant * (input + 0.044715 * torch.pow(input, 3))))
 
 
-class SiLUActivation(nn.Module):
-    """
-    See Gaussian Error Linear Units (Hendrycks et al., https://arxiv.org/abs/1606.08415) where the SiLU (Sigmoid Linear
-    Unit) was originally introduced and coined, and see Sigmoid-Weighted Linear Units for Neural Network Function
-    Approximation in Reinforcement Learning (Elfwing et al., https://arxiv.org/abs/1702.03118) and Swish: a Self-Gated
-    Activation Function (Ramachandran et al., https://arxiv.org/abs/1710.05941v1) where the SiLU was experimented with
-    later.
-    """
-
-    def forward(self, input: Tensor) -> Tensor:
-        return nn.functional.silu(input)
-
-
 class MishActivation(nn.Module):
     """
     See Mish: A Self-Regularized Non-Monotonic Activation Function (Misra., https://arxiv.org/abs/1908.08681). Also
@@ -226,8 +213,8 @@ def __getitem__(self, key):
     "relu2": ReLUSquaredActivation,
     "relu6": nn.ReLU6,
     "sigmoid": nn.Sigmoid,
-    "silu": SiLUActivation,
-    "swish": SiLUActivation,
+    "silu": nn.SiLU,
+    "swish": nn.SiLU,
     "tanh": nn.Tanh,
 }
 ACT2FN = ClassInstantier(ACT2CLS)
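
With the mapping updated, looking up "silu" or "swish" in ACT2FN yields a plain nn.SiLU module. A hedged usage sketch (assuming PyTorch and a transformers version containing this commit are installed; the noted results are expectations, not output captured from the commit):

# Looking up the activation by name, as model code in transformers typically does.
import torch
from torch import nn
from transformers.activations import ACT2FN

act = ACT2FN["silu"]  # ClassInstantier instantiates the class registered in ACT2CLS
print(type(act))      # expected after this change: torch.nn.SiLU

x = torch.randn(2, 3)
print(torch.allclose(act(x), nn.functional.silu(x)))  # expected: True

One consequence worth noting: any downstream code that imported SiLUActivation from transformers.activations directly would need to switch to nn.SiLU.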
