what is the purpose of ggml_half2 dm in quantization structure? #887

Answered by ggerganov
PenutChen asked this question in Q&A
It's used by GPU code paths that support half2 operations. For example, in the CUDA backend:

https://github.com/ggerganov/ggml/blob/a3c0188a4b5d3dec052ff87c9f773baa53631d70/src/ggml-cuda/fattn-common.cuh#L120

Since this is a union, the dm member is effectively an alias for the d and dmin factors: it occupies the same memory, so both factors can be read as a single half2 value.
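
To illustrate the aliasing, here is a minimal C sketch of that layout. It is not copied from ggml: `half_bits`, `half2_bits` and `scales_sketch` are hypothetical stand-ins that store raw 16-bit patterns instead of real fp16 types, and the actual quantization blocks carry additional fields (such as the quantized values) that are left out here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins (assumptions, not ggml's definitions):
 * raw 16-bit patterns are used in place of real fp16 types. */
typedef uint16_t half_bits;                      /* plays the role of ggml_half  */
typedef struct { half_bits x, y; } half2_bits;   /* plays the role of ggml_half2 */

typedef struct {
    union {
        struct {
            half_bits d;     /* scale factor                   */
            half_bits dmin;  /* scale factor for the minimums  */
        };                   /* anonymous struct (C11)         */
        half2_bits dm;       /* the same 4 bytes, viewed as one pair */
    };
} scales_sketch;

/* The union adds no storage: the pair view overlays d and dmin. */
static_assert(sizeof(scales_sketch) == 2 * sizeof(half_bits),
              "dm must overlay d and dmin exactly");

/* Read both factors through the pair view in one access. */
static half2_bits read_scales(const scales_sketch *s) {
    return s->dm;
}
```

Writing `d` and `dmin` separately and then calling `read_scales` returns the same two values as `x` and `y`. On a GPU, the analogous single `half2` access lets a kernel fetch both factors at once instead of issuing two separate half loads.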

Answer selected by PenutChen