Fix type error in quantize_row_q4_1 for Arm NEON
unbounded committed Apr 5, 2023
1 parent 72905b6 commit 944c703
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion ggml.c
@@ -645,7 +645,7 @@ static void quantize_row_q4_0(const float * restrict x, void * restrict vy, int
const float32x4_t v = vmulq_n_f32(srcv[l], id);
const float32x4_t vf = vaddq_f32(v, vdupq_n_f32(8.5f));
const int32x4_t vi = vcvtq_s32_f32(vf);
- const int32x4 vc = vminq_u32(vi, vdupq_n_u32(15));
+ const int32x4_t vc = vminq_s32(vi, vdupq_n_s32(15));

y[i].qs[2*l + 0] = vgetq_lane_s32(vc, 0) | (vgetq_lane_s32(vc, 1) << 4);
y[i].qs[2*l + 1] = vgetq_lane_s32(vc, 2) | (vgetq_lane_s32(vc, 3) << 4);
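
For readers without the NEON intrinsics in their head, the hunk can be read lane by lane. The old line declared `vc` as `int32x4`, which is not a NEON vector type, and passed the signed vector `vi` to the unsigned `vminq_u32`; the new line stays in signed types throughout. Below is a rough scalar sketch of what a single lane of the fixed code computes. The helper names `quantize_lane` and `pack_nibbles` are hypothetical and do not appear in ggml.c; `x` stands for one element of `srcv[l]`, and the sketch assumes `id` is the inverse scale computed earlier in the function.

```c
#include <stdint.h>

/* Hypothetical helper, not part of ggml.c: scalar equivalent of one
 * 32-bit lane of the NEON sequence shown in the hunk. */
static inline int32_t quantize_lane(float x, float id) {
    const float   vf = x * id + 8.5f;  /* vmulq_n_f32, then vaddq_f32 with 8.5f */
    const int32_t vi = (int32_t)vf;    /* vcvtq_s32_f32 truncates toward zero */
    return vi < 15 ? vi : 15;          /* vminq_s32(vi, vdupq_n_s32(15)) */
}

/* Hypothetical helper: pack two 4-bit values into one byte, low nibble
 * first, as the y[i].qs[...] assignments do. */
static inline uint8_t pack_nibbles(int32_t lo, int32_t hi) {
    return (uint8_t)(lo | (hi << 4));
}
```

With these helpers, `y[i].qs[2*l + 0]` corresponds to `pack_nibbles(q0, q1)`, where `q0` and `q1` are the quantized values of lanes 0 and 1. The clamp only bounds the value from above, so the sketch, like the hunk, relies on the scaled input not falling below the representable range.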