Commit

Change sign of KL div according to issue #298
PierpaoloSorbellini committed Apr 3, 2023
1 parent 8332a26 commit 32ddfa2
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion apps/accelerate/chatllama/chatllama/rlhf/trainer.py
@@ -584,7 +584,7 @@ def learn(self, memories: Deque[Memory]) -> None:

  # compute KL divergence
  kl_div_loss = (
-     (actions_prob * (old_actions_log_probs - actions_log_prob))
+     (actions_prob * (actions_log_prob - old_actions_log_probs))
      .sum(dim=-1)
      .mean()
  )
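
To see why the operand order matters, here is a minimal pure-Python sketch of the same expression. It is an illustration under stated assumptions, not the trainer's code: the helper names `weighted_log_ratio_sum`, `kl_new_vs_old`, and `kl_previous_sign` are hypothetical, nested lists stand in for the tensors, and the probabilities are made up. With the new policy's probabilities as weights, `log p_new - log p_old` yields KL(new || old), which is never negative; the previous operand order yields its negation.

```python
import math


def weighted_log_ratio_sum(weights_log, minuend_log, subtrahend_log):
    """Sum of exp(w) * (a - b) over one row of log-probabilities."""
    return sum(
        math.exp(w) * (a - b)
        for w, a, b in zip(weights_log, minuend_log, subtrahend_log)
    )


def kl_new_vs_old(new_log_probs, old_log_probs):
    """Sign used after this commit: sum over the last axis, then mean over rows."""
    rows = [
        weighted_log_ratio_sum(n, n, o)
        for n, o in zip(new_log_probs, old_log_probs)
    ]
    return sum(rows) / len(rows)


def kl_previous_sign(new_log_probs, old_log_probs):
    """Sign used before this commit: the operands of the difference are swapped."""
    rows = [
        weighted_log_ratio_sum(n, o, n)
        for n, o in zip(new_log_probs, old_log_probs)
    ]
    return sum(rows) / len(rows)


# Made-up distributions over three actions (one row, i.e. batch size 1).
new_log_probs = [[math.log(0.7), math.log(0.2), math.log(0.1)]]
old_log_probs = [[math.log(0.5), math.log(0.3), math.log(0.2)]]

after_commit = kl_new_vs_old(new_log_probs, old_log_probs)
before_commit = kl_previous_sign(new_log_probs, old_log_probs)

print("after commit :", after_commit)   # non-negative
print("before commit:", before_commit)  # same magnitude, opposite sign
```

Used as a penalty term added to a loss, the non-negative form discourages the new policy from drifting away from the old one, whereas the negated form would reward that drift.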