CUDA: faster softmax via shared memory + fp16 math (ggerganov#4742)
JohannesGaessler authored Jan 9, 2024
1 parent 1fc2f26 commit 8f900ab
Showing 2 changed files with 318 additions and 26 deletions.