CUDA: faster softmax via shared memory + fp16 math
JohannesGaessler committed Jan 3, 2024
1 parent 540938f commit 64c46fc
Showing 2 changed files with 285 additions and 24 deletions.
