Commit
Speedup of up to 300% on large context (ggerganov#58)
KV cache is now cyclic split into permuted V variant
The ggml_tensor_print function has been completely reworked to output proper 1-4dim tensors with data.
Example:
```
+======================+======================+======================+======================+
| :0 | V [f32 type]
+----------------------+----------------------+----------------------+----------------------+
| Dimensions           | Strides              | Layer id             | Backend              |
| 3                    | 4x16x1024            | 0                    | CPU                  |
+----------------------+----------------------+----------------------+----------------------+
| Elements             | Src0                 | Src1                 | Operation            |
| 4 x 64 x 2           | 4 x 64 x 2           | N/A                  | CONT                 |
+----------------------+----------------------+----------------------+----------------------+
| Transposed: No       | Permuted: No         | Contiguous: Yes      | Size: 0.00 MB        |
| Src0 name:           | cache_v (view) (permuted)                                          |
+----------------------+----------------------+----------------------+----------------------+
+-------------------------------------------------------------------------------------------+
| Content of src0 "cache_v (view) (permuted)" (3 dim)
+-------------------------------------------------------------------------------------------+
| Content of src0 "cache_v (view) (permuted)" (3 dim)
| Total Elements : [ Row:4 Col:64 Layer:2 ]
+-------------------------------------------------------------------------------------------+
| Row 1: [0.302 , 0.010 ] [-0.238 , 0.680 ] [0.305 , 0.206 ] [-0.013 , 0.436 ] [-0.074 , -0.698 ] [-0.153 , -0.067 ]
| Row 2: [0.091 , 0.199 ] [0.253 , 0.151 ] [-0.557 , 0.089 ] [0.298 , -0.272 ] [-0.149 , 0.232 ] [-0.217 , 0.193 ]
| Row 3: [-0.085 , -0.014 ] [0.225 , 0.089 ] [-0.338 , 0.072 ] [0.416 , -0.186 ] [-0.071 , 0.110 ] [0.467 , 0.497 ]
| Row 4: [-0.336 , 0.471 ] [-0.144 , 0.070 ] [-0.062 , 0.520 ] [0.093 , 0.217 ] [-0.332 , -0.205 ] [0.012 , 0.335 ]
+-------------------------------------------------------------------------------------------+
+-------------------------------------------------------------------------------------------+
| Content of dst "V" (3 dim)
+-------------------------------------------------------------------------------------------+
| Content of dst "V" (3 dim)
| Total Elements : [ Row:4 Col:64 Layer:2 ]
+-------------------------------------------------------------------------------------------+
| Row 1: [0.302 , 0.010 ] [-0.238 , 0.680 ] [0.305 , 0.206 ] [-0.013 , 0.436 ] [-0.074 , -0.698 ] [-0.153 , -0.067 ]
| Row 2: [0.091 , 0.199 ] [0.253 , 0.151 ] [-0.557 , 0.089 ] [0.298 , -0.272 ] [-0.149 , 0.232 ] [-0.217 , 0.193 ]
| Row 3: [-0.085 , -0.014 ] [0.225 , 0.089 ] [-0.338 , 0.072 ] [0.416 , -0.186 ] [-0.071 , 0.110 ] [0.467 , 0.497 ]
| Row 4: [-0.336 , 0.471 ] [-0.144 , 0.070 ] [-0.062 , 0.520 ] [0.093 , 0.217 ] [-0.332 , -0.205 ] [0.012 , 0.335 ]
+-------------------------------------------------------------------------------------------+
+======================+======================+======================+======================+
```
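A note on reading the header of the example output: the strides `4x16x1024` are consistent with byte strides for a contiguous f32 tensor of `4 x 64 x 2` elements (4 bytes per element, 4 * 4 = 16 bytes per row, 16 * 64 = 1024 bytes per layer). The sketch below shows one way such `Contiguous` / `Transposed` / `Permuted` flags could be derived from dimensions and strides. It is not the code from this commit: `tensor_desc` and the `desc_*` functions are hypothetical names used only for illustration, and the byte-stride interpretation is an assumption based on the numbers above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DIMS 4

/* Hypothetical stand-in for a tensor header; not ggml's actual struct. */
typedef struct {
    int64_t ne[MAX_DIMS]; /* number of elements per dimension */
    size_t  nb[MAX_DIMS]; /* stride per dimension, assumed to be in bytes */
    size_t  elem_size;    /* bytes per element, 4 for f32 */
    int     n_dims;       /* number of dimensions in use (1..4) */
} tensor_desc;

/* Contiguous: the first stride equals the element size and every further
 * stride equals the previous stride times the previous extent. */
static bool desc_is_contiguous(const tensor_desc *t) {
    size_t expected = t->elem_size;
    for (int i = 0; i < t->n_dims; i++) {
        if (t->nb[i] != expected) {
            return false;
        }
        expected *= (size_t)t->ne[i];
    }
    return true;
}

/* Transposed: the first two strides are swapped relative to normal order. */
static bool desc_is_transposed(const tensor_desc *t) {
    return t->n_dims >= 2 && t->nb[0] > t->nb[1];
}

/* Permuted: strides are not in non-decreasing order across dimensions. */
static bool desc_is_permuted(const tensor_desc *t) {
    for (int i = 0; i + 1 < t->n_dims; i++) {
        if (t->nb[i] > t->nb[i + 1]) {
            return true;
        }
    }
    return false;
}

/* The "V" tensor from the example output: 4 x 64 x 2 elements, f32,
 * strides 4 x 16 x 1024. Unused fourth slots are zero-filled. */
static tensor_desc example_v(void) {
    tensor_desc t = { {4, 64, 2, 0}, {4, 16, 1024, 0}, 4, 3 };
    return t;
}
```

With these definitions, the descriptor returned by `example_v()` comes out contiguous, not transposed and not permuted, which matches the flags printed for `V` above. Swapping the first two dimensions and strides (`ne = {64, 4, 2}`, `nb = {16, 4, 1024}`) gives a descriptor that is reported as permuted and non-contiguous.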