feat: Introduce materialization support for tables using CTAS to make tables queryable #7085

Merged on Mar 3, 2021 (26 commits)

@cprasad1 cprasad1 commented Feb 24, 2021

Description

Introduces materialization support for tables using CTAS to make tables queryable.
We can make tables queryable by using:
CREATE TABLE QUERYABLE_TABLE AS SELECT...

This also helps fix #4614 by adding a more helpful error message when querying a non-materialized table.

Testing done

  • Unit tests
  • Query translation tests

Reviewer checklist

  • Ensure docs are updated if necessary (e.g., if a user-visible feature is being added or changed).
  • Ensure relevant issues are linked (description should include text like "Fixes #")

@cprasad1 cprasad1 requested a review from a team as a code owner February 24, 2021 01:15

ghost commented Feb 24, 2021

@confluentinc It looks like @cprasad1 just signed our Contributor License Agreement. 👍

Always at your service,

clabot

@cprasad1 cprasad1 changed the title from "feat: Introduce materialization support for tables using CTAS" to "feat: Introduce materialization support for tables using CTAS to make tables queryable" on Feb 24, 2021

@vvcephei vvcephei left a comment

Thanks, @cprasad1 !

I can't speak to many of the little changes in internal method calls, etc., but at a high level, the error message and the tests look good: exactly what we wanted to achieve.

Minor note: we prefer "queryable" to "queriable".

Also, specifically in the tests, I'm wondering if we can re-structure them to be a little more compact and have better coverage, something like this:

  • Have each case set up the table to be queried with a slightly different CT or CTAS statement, covering all the possible permutations (a basic select *, one that includes a WHERE clause, one that includes a more complicated projection, one that does both, one that does a groupBy agg from a stream, one that does a join, etc.).
  • Then, inside each case, we just run multiple pull queries, one for each slight variant of pull query we want to test.

In other words, we vary the queryable object between tests and vary the pull queries within each test. That way, we strike a good balance between code coverage and proliferation of test cases.
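
To make the suggested layout concrete, here is a minimal Python sketch of the idea. It is only an illustration: the table, stream, and column names (SOURCE_TABLE, SOURCE_STREAM, ID, VAL), the statements, and the helper functions are all hypothetical and are not taken from the PR's actual test files.

```python
# Illustrative sketch only. Every name and statement below is an assumption
# made for the example, not the PR's real test data.

# One entry per way of making a table queryable (varies BETWEEN test cases).
TABLE_SETUPS = {
    "select_star": "CREATE TABLE QUERYABLE_TABLE AS SELECT * FROM SOURCE_TABLE;",
    "with_where": "CREATE TABLE QUERYABLE_TABLE AS SELECT * FROM SOURCE_TABLE WHERE VAL > 0;",
    "with_projection": "CREATE TABLE QUERYABLE_TABLE AS SELECT ID, VAL + 1 AS VAL_PLUS_ONE FROM SOURCE_TABLE;",
    "aggregate_from_stream": "CREATE TABLE QUERYABLE_TABLE AS SELECT ID, COUNT(*) AS CNT FROM SOURCE_STREAM GROUP BY ID;",
}

# Pull query variants (vary WITHIN each test case).
PULL_QUERIES = [
    "SELECT * FROM QUERYABLE_TABLE WHERE ID = 'a';",
    "SELECT ID FROM QUERYABLE_TABLE WHERE ID = 'a';",
]


def build_test_cases(setups, pull_queries):
    """Return one test case per setup; each case runs every pull query."""
    return [
        {"name": name, "statements": [statement], "pull_queries": list(pull_queries)}
        for name, statement in setups.items()
    ]


def count_checks(cases):
    """Total number of pull queries exercised across all cases."""
    return sum(len(case["pull_queries"]) for case in cases)


if __name__ == "__main__":
    cases = build_test_cases(TABLE_SETUPS, PULL_QUERIES)
    for case in cases:
        print(case["name"], "->", len(case["pull_queries"]), "pull queries")
    print("cases:", len(cases), "checks:", count_checks(cases))
```

With this shape, the number of test cases grows with the number of setups only, while every pull query variant is still exercised against every setup.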

@AlanConfluent AlanConfluent left a comment

I just had a few small comments, but looks good.

@guozhangwang guozhangwang left a comment

The PR LGTM, but I had a meta question regarding the description:

CREATE TABLE QUERYABLE_TABLE AS SELECT...

Are we really going to augment the syntax? In this PR it seems we just make CTAS always materialized anyway.

cprasad1 commented Mar 3, 2021

The PR LGTM, but I had a meta question regarding the description:

CREATE TABLE QUERYABLE_TABLE AS SELECT...

Are we really going to augment the syntax? In this PR it seems we just make CTAS always materialized anyway.

@guozhangwang what do you mean by "augment"? If you mean the QUERYABLE_ part of QUERYABLE_TABLE, that is just a placeholder table name used as an example here.

@guozhangwang

@guozhangwang what do you mean by "augment"? If you mean the QUERYABLE_ part of QUERYABLE_TABLE, that is just a placeholder table name used as an example here.

My bad, I thought it was some new keywords :P

@cprasad1 cprasad1 requested review from AlanConfluent and agavra March 3, 2021 18:26

@vvcephei vvcephei left a comment

Thanks, @cprasad1 !

Successfully merging this pull request may close these issues.

Improved error message when querying non-materialized tables
5 participants