CUDA: faster softmax via shared memory + fp16 math #4742

Merged
JohannesGaessler merged 5 commits into ggerganov:master from JohannesGaessler:cuda-faster-softmax on Jan 9, 2024