[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113

liancheng · 2014-11-05T15:58:58Z

This PR backports #2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns 0.12.0 when spark.sql.hive.version is queried.

6 other commits on which #2843 depends were also backported, they are:

[SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2 #2887 for SessionState lifecycle control
[SPARK-3809][SQL] Fixes test suites in hive-thriftserver #2675, [SQL] Fixes the race condition that may cause test failure #2823 & [SQL] Fixes race condition in CliSuite #3060 for major test suite refactoring and bug fixes
[SQL] Decrease partitions when testing #2164, for Parquet test suites updates
[SPARK-3646][SQL] Copy SQL configuration from SparkConf when a SQLContext is created. #2493, for reading spark.sql.* configurations

SparkQA · 2014-11-05T16:05:05Z

QA tests have started for PR 3113 at commit 72bffb1.

This patch merges cleanly.

SparkQA · 2014-11-05T17:27:06Z

QA tests have finished for PR 3113 at commit 72bffb1.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-05T17:27:10Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22940/
Test FAILed.

liancheng · 2014-11-06T05:49:14Z

Couldn't reproduce the test failure above, doubt whether it's sensitive to execution order.

liancheng · 2014-11-06T05:49:20Z

retest this please

SparkQA · 2014-11-06T05:52:44Z

QA tests have started for PR 3113 at commit 72bffb1.

This patch merges cleanly.

…riftServer2 `HiveThriftServer2` creates a global singleton `SessionState` instance and overrides `HiveContext` to inject the `SessionState` object. This messes up `SessionState` initialization and causes problems. This PR replaces the global `SessionState` with `HiveContext.sessionState` to avoid the initialization conflict. Also `HiveContext` reuses existing started `SessionState` if any (this is required by `SparkSQLCLIDriver`, which uses specialized `CliSessionState`). Author: Cheng Lian <lian@databricks.com> Closes apache#2887 from liancheng/spark-4037 and squashes the following commits: 8446675 [Cheng Lian] Removes redundant Driver initialization a28fef5 [Cheng Lian] Avoid starting HiveContext.sessionState multiple times 49b1c5b [Cheng Lian] Reuses existing started SessionState if any 3cd6fab [Cheng Lian] Fixes SPARK-4037 Conflicts: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala

As scwf pointed out, `HiveThriftServer2Suite` isn't effective anymore after the Thrift server was made a daemon. On the other hand, these test suites were known flaky, PR apache#2214 tried to fix them but failed because of unknown Jenkins build error. This PR fixes both sets of issues. In this PR, instead of watching `start-thriftserver.sh` output, the test code start a `tail` process to watch the log file. A `Thread.sleep` has to be introduced because the `kill` command used in `stop-thriftserver.sh` is not synchronous. As for the root cause of the mysterious Jenkins build failure. Please refer to [this comment](apache#2675 (comment)) below for details. ---- (Copied from PR description of apache#2214) This PR fixes two issues of `HiveThriftServer2Suite` and brings 1 enhancement: 1. Although metastore, warehouse directories and listening port are randomly chosen, all test cases share the same configuration. Due to parallel test execution, one of the two test case is doomed to fail 2. We caught any exceptions thrown from a test case and print diagnosis information, but forgot to re-throw the exception... 3. When the forked server process ends prematurely (e.g., fails to start), the `serverRunning` promise is completed with a failure, preventing the test code to keep waiting until timeout. So, embarrassingly, this test suite was failing continuously for several days but no one had ever noticed it... Fortunately no bugs in the production code were covered under the hood. Author: Cheng Lian <lian.cs.zju@gmail.com> Author: wangfei <wangfei1@huawei.com> Closes apache#2675 from liancheng/fix-thriftserver-tests and squashes the following commits: 1c384b7 [Cheng Lian] Minor code cleanup, restore the logging level hack in TestHive.scala 7805c33 [wangfei] reset SPARK_TESTING to avoid loading Log4J configurations in testing class paths af2b5a9 [Cheng Lian] Removes log level hacks from TestHiveContext d116405 [wangfei] make sure that log4j level is INFO ee92a82 [Cheng Lian] Relaxes timeout 7fd6757 [Cheng Lian] Fixes test suites in hive-thriftserver Conflicts: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala

The removed `Future` was used to end the test case as soon as the Spark SQL CLI process exits. When the process exits prematurely, this mechanism prevents the test case to wait until timeout. But it also creates a race condition: when `foundAllExpectedAnswers.tryFailure` is called, there are chances that the last expected output line of the CLI process hasn't been caught by the main logics of the test code, thus fails the test case. Removing this `Future` doesn't affect correctness. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes apache#2823 from liancheng/clean-clisuite and squashes the following commits: 489a97c [Cheng Lian] Fixes the race condition that may cause test failure

`CliSuite` has been flaky for a while, this PR tries to improve this situation by fixing a race condition in `CliSuite`. The `captureOutput` function is used to capture both stdout and stderr output of the forked external process in two background threads and search for expected strings, but wasn't been properly synchronized before. Author: Cheng Lian <lian@databricks.com> Closes apache#3060 from liancheng/fix-cli-suite and squashes the following commits: a70569c [Cheng Lian] Fixes race condition in CliSuite

SparkQA · 2014-11-06T07:02:17Z

QA tests have finished for PR 3113 at commit 72bffb1.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-06T07:02:20Z

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22988/
Test FAILed.

marmbrus · 2014-11-07T19:58:09Z

@liancheng ping

scwf · 2014-11-07T20:01:23Z

@liancheng and I locally test is ok for this. retest this again?

SparkQA · 2014-11-07T20:17:56Z

QA tests have started for PR 3113 at commit 72bffb1.

This patch merges cleanly.

SparkQA · 2014-11-07T21:38:55Z

QA tests have finished for PR 3113 at commit 72bffb1.

This patch fails unit tests.
This patch merges cleanly.
This patch adds no public classes.

scwf · 2014-11-08T00:46:03Z

@liancheng, please refer to #2164, I think there is some relevance

liancheng · 2014-11-08T05:11:03Z

@scwf Thanks for the clue!

@marmbrus Unfortunately I'm traveling today, will continue working on this ASAP tomorrow morning.

Author: Michael Armbrust <michael@databricks.com> Closes apache#2164 from marmbrus/shufflePartitions and squashes the following commits: 0da1e8c [Michael Armbrust] test hax ef2d985 [Michael Armbrust] more test hacks. 2dabae3 [Michael Armbrust] more test fixes 0bdbf21 [Michael Armbrust] Make parquet tests less order dependent b42eeab [Michael Armbrust] increase test parallelism 80453d5 [Michael Armbrust] Decrease partitions when testing

…text is created. This will allow us to take advantage of things like the spark.defaults file. Author: Michael Armbrust <michael@databricks.com> Closes apache#2493 from marmbrus/copySparkConf and squashes the following commits: 0bd1377 [Michael Armbrust] Copy SQL configuration from SparkConf when a SQLContext is created. Conflicts: sql/core/src/main/scala/org/apache/spark/sql/test/TestSQLContext.scala Conflicts: sql/core/src/main/scala/org/apache/spark/sql/test/TestSQLContext.scala

SparkQA · 2014-11-09T09:42:48Z

QA tests have started for PR 3113 at commit d354161.

This patch merges cleanly.

SparkQA · 2014-11-09T10:54:17Z

QA tests have finished for PR 3113 at commit d354161.

This patch passes unit tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2014-11-09T10:54:21Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23117/
Test PASSed.

liancheng · 2014-11-09T11:18:02Z

@marmbrus Backported #2164 to fix the Jenkins build failure (ParquetQuerySuite). Should be ready to go.

liancheng · 2014-11-09T14:23:36Z

@marmbrus However, I didn't quite get why #2164 fixes those Parquet tests. Especially why did you say the original test cases are "order dependent"?

marmbrus · 2014-11-10T19:16:40Z

core/src/main/scala/org/apache/spark/util/Utils.scala

@@ -1551,7 +1550,6 @@ private[spark] object Utils extends Logging {
      "%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n")
    PropertyConfigurator.configure(pro)
  }
-


Is this file touched only for whitespace?

Yes, these changes were introduced while resolving merge conflicts. Will revert them.

marmbrus · 2014-11-11T01:03:18Z

@liancheng the original test cases did not have an order by and yet expected the answers to come out in some order. This broke when we switched the TestSQLContext to local[2].

I'm going to merge this and try to manually roll back the whitespace changed file in core.

This PR backports apache#2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns `0.12.0` when `spark.sql.hive.version` is queried. 6 other commits on which apache#2843 depends were also backported, they are: - apache#2887 for `SessionState` lifecycle control - apache#2675, apache#2823 & apache#3060 for major test suite refactoring and bug fixes - apache#2164, for Parquet test suites updates - apache#2493, for reading `spark.sql.*` configurations Author: Cheng Lian <lian@databricks.com> Author: Cheng Lian <lian.cs.zju@gmail.com> Author: Michael Armbrust <michael@databricks.com> Closes apache#3113 from liancheng/get-info-for-1.1 and squashes the following commits: d354161 [Cheng Lian] Provides Spark and Hive version in HiveThriftServer2 for branch-1.1 0c2a244 [Michael Armbrust] [SPARK-3646][SQL] Copy SQL configuration from SparkConf when a SQLContext is created. 3202a36 [Michael Armbrust] [SQL] Decrease partitions when testing 7f395b7 [Cheng Lian] [SQL] Fixes race condition in CliSuite 0dd28ec [Cheng Lian] [SQL] Fixes the race condition that may cause test failure 5928b39 [Cheng Lian] [SPARK-3809][SQL] Fixes test suites in hive-thriftserver faeca62 [Cheng Lian] [SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2

marmbrus · 2014-11-13T18:50:10Z

This has been merged, can you close it?

liancheng and others added 4 commits November 6, 2014 15:01

marmbrus and others added 3 commits November 9, 2014 16:36

Provides Spark and Hive version in HiveThriftServer2 for branch-1.1

d354161

liancheng force-pushed the get-info-for-1.1 branch from 72bffb1 to d354161 Compare November 9, 2014 09:39

marmbrus reviewed Nov 10, 2014
View reviewed changes

liancheng closed this Nov 13, 2014

liancheng deleted the get-info-for-1.1 branch November 13, 2014 23:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113

[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113

liancheng commented Nov 5, 2014

SparkQA commented Nov 5, 2014

SparkQA commented Nov 5, 2014

AmplabJenkins commented Nov 5, 2014

liancheng commented Nov 6, 2014

liancheng commented Nov 6, 2014

SparkQA commented Nov 6, 2014

SparkQA commented Nov 6, 2014

AmplabJenkins commented Nov 6, 2014

marmbrus commented Nov 7, 2014

scwf commented Nov 7, 2014

SparkQA commented Nov 7, 2014

SparkQA commented Nov 7, 2014

scwf commented Nov 8, 2014

liancheng commented Nov 8, 2014

SparkQA commented Nov 9, 2014

SparkQA commented Nov 9, 2014

AmplabJenkins commented Nov 9, 2014

liancheng commented Nov 9, 2014

liancheng commented Nov 9, 2014

marmbrus Nov 10, 2014

liancheng Nov 11, 2014

marmbrus commented Nov 11, 2014

marmbrus commented Nov 13, 2014

[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113

[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113

Conversation

liancheng commented Nov 5, 2014

SparkQA commented Nov 5, 2014

SparkQA commented Nov 5, 2014

AmplabJenkins commented Nov 5, 2014

liancheng commented Nov 6, 2014

liancheng commented Nov 6, 2014

SparkQA commented Nov 6, 2014

SparkQA commented Nov 6, 2014

AmplabJenkins commented Nov 6, 2014

marmbrus commented Nov 7, 2014

scwf commented Nov 7, 2014

SparkQA commented Nov 7, 2014

SparkQA commented Nov 7, 2014

scwf commented Nov 8, 2014

liancheng commented Nov 8, 2014

SparkQA commented Nov 9, 2014

SparkQA commented Nov 9, 2014

AmplabJenkins commented Nov 9, 2014

liancheng commented Nov 9, 2014

liancheng commented Nov 9, 2014

marmbrus Nov 10, 2014

Choose a reason for hiding this comment

liancheng Nov 11, 2014

Choose a reason for hiding this comment

marmbrus commented Nov 11, 2014

marmbrus commented Nov 13, 2014