[SPARK-49564][SQL] Add SQL pipe syntax for the JOIN operator #48270

Closed
wants to merge 4 commits into from

@dtenedor dtenedor commented Sep 26, 2024

What changes were proposed in this pull request?

This PR adds SQL pipe syntax support for the JOIN operator.

For example:

CREATE TEMPORARY VIEW join_test_t1
  AS SELECT * FROM VALUES (1) AS grouping(a);
CREATE TEMPORARY VIEW join_test_empty_table
  AS SELECT a FROM join_test_t1 WHERE FALSE;

TABLE join_test_t1
|> FULL OUTER JOIN join_test_empty_table
   ON (join_test_t1.a = join_test_empty_table.a);

1	NULL
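
For comparison, a minimal sketch of the same query in conventional (non-pipe) syntax, assuming the two temporary views defined above. Both forms describe the same full outer join, so this one should also return the single row `1	NULL`: the one row in `join_test_t1` has no match in the empty view, so the right-hand column is padded with NULL.

```sql
-- Conventional form: the left input appears in the FROM clause
-- instead of being piped into the JOIN operator.
SELECT *
FROM join_test_t1
FULL OUTER JOIN join_test_empty_table
  ON (join_test_t1.a = join_test_empty_table.a);
```

In the pipe form, the table preceding `|>` plays the role of the left join input, and the relation named after `JOIN` is the right input.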

Why are the changes needed?

The SQL pipe operator syntax will let users compose queries in a more flexible fashion.

Does this PR introduce any user-facing change?

Yes, see above.

How was this patch tested?

This PR adds a few unit test cases, but mostly relies on golden file test coverage. I did this to verify that the query results are correct as this feature is implemented, and so that we can inspect the analyzer output plans to confirm they look right as well.
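
As an illustration of the kind of queries a golden file input could contain, here is a hypothetical sketch that reuses the two views from the example above. These queries are not copied from the PR's test files; they assume that the pipe JOIN operator accepts the same join types as a regular JOIN clause.

```sql
-- Inner join: the empty right side matches nothing, so no rows are returned.
TABLE join_test_t1
|> INNER JOIN join_test_empty_table
   ON (join_test_t1.a = join_test_empty_table.a);

-- Left outer join: the single left row is kept and padded with NULL.
TABLE join_test_t1
|> LEFT OUTER JOIN join_test_empty_table
   ON (join_test_t1.a = join_test_empty_table.a);

-- Cross join with an empty input: no rows are returned.
TABLE join_test_t1
|> CROSS JOIN join_test_empty_table;
```

A golden file test records both the query results and the analyzed plans for inputs like these, so a change in either shows up as a diff.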

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Sep 26, 2024
@dtenedor dtenedor changed the title [WIP][SPARK-49564] Add SQL pipe syntax for the JOIN operator [SPARK-49564] Add SQL pipe syntax for the JOIN operator Oct 5, 2024
@dtenedor dtenedor marked this pull request as ready for review October 5, 2024 00:04
dtenedor commented Oct 5, 2024

cc @gengliangwang @cloud-fan here is the SQL pipe JOIN operator. This is the last "simple" one: it only adds a single call to a "with*" method that already exists in the AstBuilder.

@dtenedor dtenedor changed the title [SPARK-49564] Add SQL pipe syntax for the JOIN operator [SPARK-49564][SQL] Add SQL pipe syntax for the JOIN operator Oct 5, 2024
dtenedor commented Oct 7, 2024

The CI failure is not related.

cloud-fan commented

The Spark Connect failure is unrelated. Thanks, merging to master!

@cloud-fan cloud-fan closed this in 51af177 Oct 8, 2024
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024

Closes apache#48270 from dtenedor/pipe-join.

Authored-by: Daniel Tenedorio <daniel.tenedorio@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>