Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type Simple SFW queries using current LogicalPlan as an IR #399

Merged
merged 37 commits into from
Sep 14, 2023

Conversation

am357
Copy link
Contributor

@am357 am357 commented Jun 26, 2023

Description of changes:
Relates to: partiql/partiql-lang#18

Adds the code for typing a query using the current LogicalPlan as an IR. This PR includes the changes to enable simple SFW queries as the following in both Strict and Permissive typing modes with Open and Closed schemas.

Closed schema with Strict typing mode.

-- customers: <<{'id': INT, 'name': STRING, age: ANY}>>
SELECT customers.id, customers.name FROM customers;

-- Output schema: <<{'id': INT, 'name': STRING}>>

Open schema with Strict typing mode and age non-existent projection.

-- customers: <<{'id': INT, 'name': STRING}>>
SELECT customers.id, customers.name, customers.age FROM customers;

-- Output schema: <<{'id': INT, 'name': STRING, 'age': ANY}>>

Closed Schema with Permissive typing mode and age non-existent projection.

-- customers: <<{'id': INT, 'name': STRING>>
SELECT customers.id, customers.name, customers.age FROM customers;


-- Output schema: <<{'id': INT, 'name': STRING, 'age': NULL}>>

See: partiql/partiql-spec#64

Open Schema with Strict typing mode and age in nested attribute.

-- customers: <<{'id': INT, 'name': STRING, 'details': {'age': INT}}>>
SELECT customers.id, customers.name, customers.details.age FROM customers;


-- Output schema: <<{'id': INT, 'name': STRING, 'age': INT}>>

Open Schema (customers and details) with Strict typing mode.

-- customers: <<{'id': INT, 'name': STRING, 'details': {'age': INT}}>>
SELECT customers.id, customers.name, customers.details.age, customers.details.foo.bar FROM customers;


-- Output schema: <<{'id': INT, 'name': STRING, 'age': INT, 'bar': ANY}>>

Open Schema (customers and details) with Strict typing mode and age in nested attribute with Path in FROM with alias.

-- customers: <<{'id': INT, 'name': STRING, 'details': {'age': INT}}>>
SELECT d.age FROM customers.details AS d;


-- Output schema: <<{'age': INT}>>

See: partiql/partiql-spec#65

Closed Schema with Strict typing mode with FROM and Projection aliases.

-- customers: <<{'id': INT, 'name': STRING, 'age': ANY}>>
SELECT c.id AS my_id, customers.name AS my_name FROM customers AS c;


-- Output schema: <<{'my_id': INT, 'my_name': STRING}>>

In addition:

  • fixes some issues with the current partiql_types macros, changes the model for some of the types, and adds some helper methods to PartiqlType.
  • moves constraints to set
  • uses BTreeSet for constraints and AnyOf to be able to support ordering on the these types; since we're not expecting large number of elements for these types the performance hit seems negligible.

Upon merging this PR we need to add:

  • Support for the rest of the Binding Operators (E.g., Filter, OrderBy).
  • Support for operators (E.g., unary, binary).
  • Support for functions.
  • Related Conformance Tests.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@github-actions
Copy link

github-actions bot commented Jun 26, 2023

Conformance comparison report

Base (14f6e01) 1b29eb4 +/-
% Passing 90.35% 90.35% 0.00%
✅ Passing 5731 5731 0
❌ Failing 612 612 0
🔶 Ignored 0 0 0
Total Tests 6343 6343 0

Number passing in both: 5731

Number failing in both: 612

Number passing in Base (14f6e01) but now fail: 0

Number failing in Base (14f6e01) but now pass: 0

@codecov
Copy link

codecov bot commented Jun 26, 2023

Codecov Report

Patch coverage: 82.11% and project coverage change: +0.11% 🎉

Comparison is base (14f6e01) 81.80% compared to head (85f290e) 81.92%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #399      +/-   ##
==========================================
+ Coverage   81.80%   81.92%   +0.11%     
==========================================
  Files          62       63       +1     
  Lines       15991    16728     +737     
  Branches    15991    16728     +737     
==========================================
+ Hits        13082    13705     +623     
- Misses       2373     2470      +97     
- Partials      536      553      +17     
Files Changed Coverage Δ
partiql-ast-passes/src/name_resolver.rs 83.63% <ø> (ø)
partiql-logical-planner/src/lib.rs 94.31% <ø> (ø)
partiql-catalog/src/lib.rs 69.33% <42.85%> (-0.81%) ⬇️
partiql-types/src/lib.rs 75.36% <72.52%> (+6.61%) ⬆️
partiql-logical-planner/src/typer.rs 83.57% <83.57%> (ø)
partiql-eval/src/eval/eval_expr_wrapper.rs 83.97% <100.00%> (ø)
partiql-logical-planner/src/lower.rs 87.13% <100.00%> (ø)

... and 1 file with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@am357 am357 force-pushed the feat-type-plan-poc branch from 4e5aa56 to adc220d Compare June 26, 2023 22:06
@am357 am357 changed the title [WIP] Steel thread for typing the plan (for simple SFW query) Type Simple SFW queries using current LogicalPlan as an IR Jul 12, 2023
@am357 am357 force-pushed the feat-type-plan-poc branch 2 times, most recently from bcf1f3a to 1c9bba9 Compare July 13, 2023 00:53
@am357 am357 force-pushed the feat-type-plan-poc branch from 1c9bba9 to 97b5280 Compare July 13, 2023 01:05
@am357 am357 force-pushed the feat-type-plan-poc branch 3 times, most recently from 636cf13 to 6b54416 Compare July 13, 2023 18:49
@am357 am357 force-pushed the feat-type-plan-poc branch from 6b54416 to b645dca Compare July 13, 2023 19:02
@am357 am357 marked this pull request as ready for review July 13, 2023 19:05
@am357 am357 requested review from jpschorr and alancai98 July 13, 2023 19:05
@am357 am357 force-pushed the feat-type-plan-poc branch 2 times, most recently from 8139cce to 0cf33db Compare August 29, 2023 18:12
@am357 am357 force-pushed the feat-type-plan-poc branch from 0cf33db to b1f6084 Compare August 29, 2023 18:42
Copy link
Member

@alancai98 alancai98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a few more clarification questions. Otherwise should be ready to approve

@am357 am357 force-pushed the feat-type-plan-poc branch from 12ada2e to df1405e Compare September 1, 2023 21:51
Copy link
Member

@alancai98 alancai98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for clarifying some things. Looks good and we can always adjust as we implement typing for other binding ops and expressions. Noticed one remaining typo.

@alancai98
Copy link
Member

Seems like a rustfmt check is still failing: https://github.com/partiql/partiql-lang-rust/actions/runs/6055593593/job/16434699935

@am357 am357 force-pushed the feat-type-plan-poc branch from df73721 to df1405e Compare September 6, 2023 21:20
@jpschorr
Copy link
Contributor

-- Output schema: <<{'id': INT, 'name': STRING, 'age': MISSING}>>

Should this perhaps use NULL instead of MISSING? For values, Missing is not allowed as the value of a struct's field, as Missing implies the field is not present.

@jpschorr
Copy link
Contributor

-- customers: <<{'id': INT, 'name': STRING, 'details': {'age': INT}}>>
SELECT d.age FROM customers.details AS d;


-- Output schema: <<{'age': INT}>>```

The age in the output is 'inferred' from final pathing in the SELECT d.age, which matches the type of column name inference that many RDBMSs implement. We'll have to think about the inference rules for, e.g., SELECT d.age +1

The PartiQL spec 6.3.1 is relevant here: https://partiql.org/assets/PartiQL-Specification.pdf#subsection.6.3

@am357
Copy link
Contributor Author

am357 commented Sep 14, 2023

-- Output schema: <<{'id': INT, 'name': STRING, 'age': MISSING}>>

Should this perhaps use NULL instead of MISSING? For values, Missing is not allowed as the value of a struct's field, as Missing implies the field is not present.

That's a good call-out. Another option is to elide the attribute from the schema in permissive mode. As discussed offline will add a TODO in the code and spec. to get back to this later.

@am357
Copy link
Contributor Author

am357 commented Sep 14, 2023

SELECT d.age +1

@am357 am357 closed this Sep 14, 2023
@am357 am357 reopened this Sep 14, 2023
@am357
Copy link
Contributor Author

am357 commented Sep 14, 2023

-- Output schema: <<{'id': INT, 'name': STRING, 'age': MISSING}>>

Should this perhaps use NULL instead of MISSING? For values, Missing is not allowed as the value of a struct's field, as Missing implies the field is not present.

I created the following spec. discussion that aims to address this:
partiql/partiql-spec#64

@am357
Copy link
Contributor Author

am357 commented Sep 14, 2023

-- customers: <<{'id': INT, 'name': STRING, 'details': {'age': INT}}>>
SELECT d.age FROM customers.details AS d;


-- Output schema: <<{'age': INT}>>```

The age in the output is 'inferred' from final pathing in the SELECT d.age, which matches the type of column name inference that many RDBMSs implement. We'll have to think about the inference rules for, e.g., SELECT d.age +1

The PartiQL spec 6.3.1 is relevant here: https://partiql.org/assets/PartiQL-Specification.pdf#subsection.6.3

I created the following GitHub discussion that aims to address this:
partiql/partiql-spec#65

@am357
Copy link
Contributor Author

am357 commented Sep 14, 2023

-- Output schema: <<{'id': INT, 'name': STRING, 'age': MISSING}>>

Should this perhaps use NULL instead of MISSING? For values, Missing is not allowed as the value of a struct's field, as Missing implies the field is not present.

That's a good call-out. Another option is to elide the attribute from the schema in permissive mode. As discussed offline will add a TODO in the code and spec. to get back to this later.

I went with the NULL for now pending partiql/partiql-spec#64 also added a TODO in the code.

@am357 am357 merged commit fbaac0f into main Sep 14, 2023
@am357 am357 deleted the feat-type-plan-poc branch September 14, 2023 23:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants