Add materialized view query optimization utility #16253

Merged
merged 1 commit into from
Jul 12, 2021

Conversation

@jzzhaofb-zz jzzhaofb-zz commented Jun 11, 2021

Adding materialized view optimizer utility to optimize eligible queries if supported by a materialized view. The utility is also responsible for validating the original query against the supported formats.

Currently, this utility is very restrictive and supports some simple query formats. It does not support complex queries (Join, Union, subquery, etc.).

== NO RELEASE NOTE ==

@jzzhaofb-zz jzzhaofb-zz force-pushed the mv-query-optimizer branch 2 times, most recently from cf9b6b4 to d9d6470 Compare June 15, 2021 18:58
@jainxrohit

Implementing a mechanism which rewrites a base query given a materialized view. Currently it supports simple rewrite (With Alias, Function Call, Where clause, Order by, Having). It does not support join and sub queries.

Test plan: Added unit tests in TestMaterializedViewQueryOptimizer

== NO RELEASE NOTE ==

@jzzhaofb the release note needs to have triple ` at the start and end.

== NO RELEASE NOTE ==

@jzzhaofb-zz jzzhaofb-zz requested a review from jainxrohit June 15, 2021 20:47

@jainxrohit jainxrohit left a comment

.

@jzzhaofb-zz jzzhaofb-zz force-pushed the mv-query-optimizer branch 2 times, most recently from be2d0d6 to cf4bc7f Compare June 17, 2021 04:57

@jainxrohit jainxrohit left a comment

Overall in pretty good shape. Good work!

@gggrace14 gggrace14 left a comment

I still can't quickly get what types of base query cannot be optimized. Can you add the invalid base query types to TestMaterializedViewQueryOptimizer? A complete unit test suite should include invalid test cases.

@jainxrohit

jainxrohit commented Jun 17, 2021

I still can't quickly get what types of base query cannot be optimized. Can you add the invalid base query types to TestMaterializedViewQueryOptimizer? A complete unit test suite should include invalid test cases.

@gggrace14 Julian and I discussed it; the idea for this class is not to cover all unsupported cases. We will add MV query validation logic in the candidate extractor, and at that time we will add an extended set of tests for unsupported cases.
The purpose of this class is to do basic handling, so that if it reaches unfamiliar territory, it can fall back on the original query.
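
To illustrate the fall-back behaviour described above, here is a minimal Java sketch. It is not the code in this PR: `MvRewriteSketch`, `UnsupportedRewriteException`, and the string-based query rendering are hypothetical stand-ins, and the real utility presumably works on a parsed query rather than on column name lists. The sketch only shows the pattern of attempting a rewrite and returning the original query whenever something unfamiliar is encountered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the optimizer utility; names and structure are assumptions.
class MvRewriteSketch
{
    // Signals that the rewriter met something it does not know how to handle.
    static class UnsupportedRewriteException
            extends RuntimeException
    {
        UnsupportedRewriteException(String message)
        {
            super(message);
        }
    }

    private final String baseTable;
    private final String viewName;
    // Maps each base-table column to the MV column that stores it.
    private final Map<String, String> baseToViewColumn;

    MvRewriteSketch(String baseTable, String viewName, Map<String, String> baseToViewColumn)
    {
        this.baseTable = baseTable;
        this.viewName = viewName;
        this.baseToViewColumn = baseToViewColumn;
    }

    // Returns the query rewritten against the MV, or the original query if the rewrite fails.
    String rewriteOrFallBack(String table, List<String> columns)
    {
        try {
            return rewrite(table, columns);
        }
        catch (UnsupportedRewriteException e) {
            return render(table, columns);
        }
    }

    private String rewrite(String table, List<String> columns)
    {
        if (!baseTable.equals(table)) {
            throw new UnsupportedRewriteException("MV is not defined on " + table);
        }
        List<String> rewritten = new ArrayList<>();
        for (String column : columns) {
            String viewColumn = baseToViewColumn.get(column);
            if (viewColumn == null) {
                throw new UnsupportedRewriteException("no MV column for " + column);
            }
            rewritten.add(viewColumn);
        }
        return render(viewName, rewritten);
    }

    private static String render(String table, List<String> columns)
    {
        return "SELECT " + String.join(", ", columns) + " FROM " + table;
    }
}
```

Because the fall-back path simply re-renders the input, a query that touches an unmapped column is returned unchanged instead of failing.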

@jainxrohit jainxrohit left a comment

one nit requested.

Awesome work!

@highker highker requested a review from yuanzhanhku June 21, 2021 07:29
@highker

highker commented Jun 21, 2021

Adding @yuanzhanhku to the review, as he has a lot of insight into how to check whether an MV query is a proper subquery of a select query.

@highker

highker commented Jun 21, 2021

We need to properly check the scan-filter-project-agg query containment problem. To decide whether we can replace a subquery in the provided select query, at least the following conditions must be satisfied:

  • all the columns used by the provided select query come from the columns selected by the MV
  • the provided select query's filter f1 is at least as restrictive as the filter f2 in the MV, meaning that f1 implies f2 (https://mathworld.wolfram.com/Implies.html)
  • the projections have logically equivalent signatures. For example, a * b = b * a
  • the agg functions should match exactly. However, there could be cases such as avg = sum/count: if a select query contains avg and the MV contains sum and count, it is possible to recover avg from them. We can probably ignore this case at the moment.
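
As a rough illustration of the four conditions above, here is a minimal Java sketch. Everything in it is an assumption made for the example and not part of this PR: the class `ContainmentSketch`, the `Range` type, and the simplification that a filter is a conjunction of closed numeric ranges keyed by column name. A real check would operate on parsed expressions; this sketch only makes the logic of each condition concrete.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;

// Hypothetical helper; filters are simplified to per-column closed ranges.
class ContainmentSketch
{
    // A closed numeric range [low, high] on one column, e.g. 10 <= x AND x <= 20.
    static final class Range
    {
        final long low;
        final long high;

        Range(long low, long high)
        {
            this.low = low;
            this.high = high;
        }

        boolean isWithin(Range other)
        {
            return other.low <= low && high <= other.high;
        }
    }

    // Condition 1: every column the query uses is available in the MV.
    static boolean columnsCovered(Set<String> queryColumns, Set<String> mvColumns)
    {
        return mvColumns.containsAll(queryColumns);
    }

    // Condition 2: f1 implies f2 if every range required by f2 is also
    // required, at least as tightly, by f1. This is a conservative test.
    static boolean filterImplies(Map<String, Range> f1, Map<String, Range> f2)
    {
        for (Map.Entry<String, Range> entry : f2.entrySet()) {
            Range queryRange = f1.get(entry.getKey());
            if (queryRange == null || !queryRange.isWithin(entry.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Condition 3: canonical form of a product, so that "a * b" and "b * a" compare equal.
    static String canonicalProduct(String expression)
    {
        String[] factors = expression.split("\\*");
        for (int i = 0; i < factors.length; i++) {
            factors[i] = factors[i].trim();
        }
        Arrays.sort(factors);
        return String.join(" * ", factors);
    }

    // Condition 4: every aggregate in the query appears verbatim in the MV.
    // Derived aggregates such as avg = sum / count are deliberately not handled.
    static boolean aggregatesMatch(Set<String> queryAggregates, Set<String> mvAggregates)
    {
        return mvAggregates.containsAll(queryAggregates);
    }
}
```

Under these assumptions, a query filtering on 10 <= x <= 20 passes the filter check against an MV filtering on 0 <= x <= 100, but not the other way round.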

@jainxrohit

The PR has a lot of comments; let's clean those up and keep only the relevant ones.

@jzzhaofb-zz jzzhaofb-zz force-pushed the mv-query-optimizer branch from fcab950 to 2f87a63 Compare July 9, 2021 04:38
@jzzhaofb-zz jzzhaofb-zz requested a review from jainxrohit July 9, 2021 04:38

@jainxrohit jainxrohit left a comment

few minor changes requested.

@jzzhaofb-zz jzzhaofb-zz force-pushed the mv-query-optimizer branch from 2f87a63 to d4c949d Compare July 9, 2021 15:46
@jzzhaofb-zz jzzhaofb-zz requested a review from jainxrohit July 9, 2021 15:53
@jainxrohit

@jzzhaofb Let's change the commit and PR to the following:

Title: Add materialized view query optimization utility

Adding materialized view optimizer utility to optimize eligible queries if supported by a materialized view. The utility is also responsible for validating the original query against the supported formats.

Currently, this utility is very restrictive and supports some simple query formats. It does not support complex queries (Join, Union, subquery, etc.).

@jzzhaofb-zz jzzhaofb-zz changed the title from "Add query optimizer using materialized view" to "Add materialized view query optimization utility" Jul 9, 2021
@jzzhaofb-zz jzzhaofb-zz force-pushed the mv-query-optimizer branch from d4c949d to cc7deb8 Compare July 9, 2021 21:37

@jainxrohit jainxrohit left a comment

LGTM

@highker please have a look.

@highker highker requested a review from NikhilCollooru July 10, 2021 01:01

@highker highker left a comment

mostly nits

@highker highker self-assigned this Jul 12, 2021
Adding materialized view optimizer utility to optimize eligible queries if
supported by a materialized view. The utility is also responsible for
validating the original query against the supported formats. Currently,
this utility is very restrictive and supports some simple query formats.
It does not support complex queries (Join, Union, subquery, etc.).
@highker highker merged commit ab404ca into prestodb:master Jul 12, 2021
6 participants