[aot] Fix aot quantization for weight only quantization #2079

tosterberg · 2024-06-17T23:29:54Z

Description

Fixes the quantization path used during AOT partitioning for weight-only quantization strategies. These strategies do not require any AOT model changes.
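To illustrate the reasoning above, here is a minimal sketch of how a partitioning step might skip model changes for weight-only strategies. It is not the code from this PR: the strategy name `weight_only_int8`, the `PartitionPlan` class, the `plan_aot_partition` function, and the step labels are all hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field

# Hypothetical strategy names, for illustration only (not taken from the PR).
WEIGHT_ONLY_STRATEGIES = frozenset({"weight_only_int8"})


@dataclass
class PartitionPlan:
    """Records which steps a (hypothetical) AOT partitioning run would perform."""
    steps: list = field(default_factory=list)


def plan_aot_partition(quantize):
    """Build the ordered list of steps for one AOT partitioning run.

    `quantize` is the requested strategy name, or None when no
    quantization is requested.
    """
    plan = PartitionPlan()
    if quantize is not None and quantize not in WEIGHT_ONLY_STRATEGIES:
        # Assumption: only strategies that alter the model itself need
        # an AOT model change before partitioning.
        plan.steps.append("apply_aot_model_changes")
    # Partitioning and compilation run regardless of the strategy.
    plan.steps.append("partition_and_compile")
    return plan
```

Under these assumptions, `plan_aot_partition("weight_only_int8")` yields a plan with only the partitioning step, while any other non-empty strategy also schedules the model-change step.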

If this change is a backward incompatible change, why must this change be made?
Interesting edge cases to note here

…ary#2079)

(cherry picked from commit 00f7412)

[aot] Fix aot quantization for weight only quantization

07f7e74

tosterberg requested review from zachgk, frankfliu and a team as code owners June 17, 2024 23:29

sindhuvahinis approved these changes Jun 17, 2024


sindhuvahinis merged commit 00f7412 into deepjavalibrary:master Jun 17, 2024
8 checks passed

tosterberg deleted the fix-neo-quant-neuron branch June 17, 2024 23:42

sindhuvahinis pushed a commit to sindhuvahinis/djl-serving that referenced this pull request Jun 17, 2024

[aot] Fix aot quantization for weight only quantization (deepjavalibr…

8b26687

…ary#2079)

tosterberg added a commit that referenced this pull request Jun 17, 2024

[aot] Fix aot quantization for weight only quantization (#2079)

d509a44

(cherry picked from commit 00f7412)
