Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20941] [SQL] Fix SubqueryExec Reuse #18169

Closed
wants to merge 3 commits into from

Conversation

gatorsmile
Copy link
Member

@gatorsmile gatorsmile commented May 31, 2017

What changes were proposed in this pull request?

Before this PR, Subquery reuse does not work. Below are three issues:

  • Subquery reuse does not work.
  • It is sharing the same SQLConf (spark.sql.exchange.reuse) with the one for Exchange Reuse.
  • No test case covers the rule Subquery reuse.

This PR is to fix the above three issues.

  • Ignored the physical operator SubqueryExec when comparing two plans.
  • Added a dedicated conf spark.sql.subqueries.reuse for controlling Subquery Reuse
  • Added a test case for verifying the behavior

How was this patch tested?

N/A

@gatorsmile
Copy link
Member Author

@cloud-fan

@SparkQA
Copy link

SparkQA commented Jun 1, 2017

Test build #77609 has finished for PR 18169 at commit 399a211.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -25,6 +25,7 @@ import org.apache.spark.sql.catalyst.errors._
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions.aggregate._
import org.apache.spark.sql.catalyst.expressions.codegen._
import org.apache.spark.sql.catalyst.plans.QueryPlan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unnecessary change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

@@ -700,6 +703,33 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
row => Seq.fill(16)(Row.merge(row, row))).collect().toSeq)
}

test("Verify spark.sql.subquery.reuse") {
Seq("true", "false").foreach { reuse =>
withSQLConf(SQLConf.SUBQUERY_REUSE_ENABLED.key -> reuse) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's more natural to use Seq(true, false) and then do reuse.toString here.

test("Verify spark.sql.subquery.reuse") {
Seq("true", "false").foreach { reuse =>
withSQLConf(SQLConf.SUBQUERY_REUSE_ENABLED.key -> reuse) {
val df = sql(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be good if we also check result of this df

@gatorsmile
Copy link
Member Author

retest this please

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Jun 1, 2017

Test build #77629 has finished for PR 18169 at commit 8095d1b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 1, 2017

Test build #77630 has finished for PR 18169 at commit 21b5c5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 1, 2017

Test build #77633 has finished for PR 18169 at commit 21b5c5b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@asfgit asfgit closed this in f7cf209 Jun 1, 2017
asfgit pushed a commit that referenced this pull request Jun 1, 2017
Before this PR, Subquery reuse does not work. Below are three issues:
- Subquery reuse does not work.
- It is sharing the same `SQLConf` (`spark.sql.exchange.reuse`) with the one for Exchange Reuse.
- No test case covers the rule Subquery reuse.

This PR is to fix the above three issues.
- Ignored the physical operator `SubqueryExec` when comparing two plans.
- Added a dedicated conf `spark.sql.subqueries.reuse` for controlling Subquery Reuse
- Added a test case for verifying the behavior

N/A

Author: Xiao Li <gatorsmile@gmail.com>

Closes #18169 from gatorsmile/subqueryReuse.

(cherry picked from commit f7cf209)
Signed-off-by: Xiao Li <gatorsmile@gmail.com>
@gatorsmile
Copy link
Member Author

Thanks! Merging to master/2.2

@carsonwang
Copy link
Contributor

@gatorsmile , why was this reverted? Are you going to open another PR to fix it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants