[Snippets][CPU] Added KVCacheMatcher check for LLM in MHATokenization #28812
base: master
Conversation
ov::op::util::has_op_with_type<intel_cpu::ScaledDotProductAttentionWithKVCache>(model) ||
ov::op::util::has_op_with_type<ov::op::PagedAttentionExtension>(model);
Do we still need these checks here? If we do, can we move them inside is_large_language_model? Since we created a dedicated helper for this check, all the related logic should live inside it. What do you think?
At first I just wanted to align with the GPU check: I was not sure that PA should be part of the is_llm check in the GPU Plugin. But I found that they updated the condition to include PA. So I moved the PA check to the common helper, leaving only SDPAWithKVCache here because it's a CPU-specific op.
Thanks!
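For reference, a minimal sketch (not the exact code from this PR) of what the resulting CPU-side condition could look like after this change. The helper name ov::op::util::is_large_language_model, the wrapper function name, and the assumption that the helper already covers ov::op::PagedAttentionExtension are illustrative; only the has_op_with_type call and the op types come from the diff above.

```cpp
// Illustrative sketch only, not the exact code from this PR.
// Assumptions: the common helper is exposed as ov::op::util::is_large_language_model()
// and already detects ov::op::PagedAttentionExtension; the relevant OpenVINO and
// CPU-plugin headers are assumed to be included.
#include <memory>

bool looks_like_llm_for_mha_tokenization(const std::shared_ptr<const ov::Model>& model) {
    // The stateful SDPA op is CPU-specific, so it stays as a plugin-local check...
    const bool has_sdpa_with_kv_cache =
        ov::op::util::has_op_with_type<intel_cpu::ScaledDotProductAttentionWithKVCache>(model);
    // ...while the PagedAttention case is handled inside the shared helper.
    return has_sdpa_with_kv_cache || ov::op::util::is_large_language_model(model);
}
```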
Details:
- Moved the is_LLM check from the GPU plugin to the common part so it can be reused by other plugins
- Used the is_large_language_model function in the check for LLM in MHATokenization in the CPU Plugin

Tickets: