-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-3971][SQL] Backport #2843 to branch-1.1 #3113
Conversation
QA tests have started for PR 3113 at commit
|
QA tests have finished for PR 3113 at commit
|
Test FAILed. |
Couldn't reproduce the test failure above, doubt whether it's sensitive to execution order. |
retest this please |
QA tests have started for PR 3113 at commit
|
…riftServer2 `HiveThriftServer2` creates a global singleton `SessionState` instance and overrides `HiveContext` to inject the `SessionState` object. This messes up `SessionState` initialization and causes problems. This PR replaces the global `SessionState` with `HiveContext.sessionState` to avoid the initialization conflict. Also `HiveContext` reuses existing started `SessionState` if any (this is required by `SparkSQLCLIDriver`, which uses specialized `CliSessionState`). Author: Cheng Lian <lian@databricks.com> Closes apache#2887 from liancheng/spark-4037 and squashes the following commits: 8446675 [Cheng Lian] Removes redundant Driver initialization a28fef5 [Cheng Lian] Avoid starting HiveContext.sessionState multiple times 49b1c5b [Cheng Lian] Reuses existing started SessionState if any 3cd6fab [Cheng Lian] Fixes SPARK-4037 Conflicts: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLEnv.scala sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveContext.scala
As scwf pointed out, `HiveThriftServer2Suite` isn't effective anymore after the Thrift server was made a daemon. On the other hand, these test suites were known flaky, PR apache#2214 tried to fix them but failed because of unknown Jenkins build error. This PR fixes both sets of issues. In this PR, instead of watching `start-thriftserver.sh` output, the test code start a `tail` process to watch the log file. A `Thread.sleep` has to be introduced because the `kill` command used in `stop-thriftserver.sh` is not synchronous. As for the root cause of the mysterious Jenkins build failure. Please refer to [this comment](apache#2675 (comment)) below for details. ---- (Copied from PR description of apache#2214) This PR fixes two issues of `HiveThriftServer2Suite` and brings 1 enhancement: 1. Although metastore, warehouse directories and listening port are randomly chosen, all test cases share the same configuration. Due to parallel test execution, one of the two test case is doomed to fail 2. We caught any exceptions thrown from a test case and print diagnosis information, but forgot to re-throw the exception... 3. When the forked server process ends prematurely (e.g., fails to start), the `serverRunning` promise is completed with a failure, preventing the test code to keep waiting until timeout. So, embarrassingly, this test suite was failing continuously for several days but no one had ever noticed it... Fortunately no bugs in the production code were covered under the hood. Author: Cheng Lian <lian.cs.zju@gmail.com> Author: wangfei <wangfei1@huawei.com> Closes apache#2675 from liancheng/fix-thriftserver-tests and squashes the following commits: 1c384b7 [Cheng Lian] Minor code cleanup, restore the logging level hack in TestHive.scala 7805c33 [wangfei] reset SPARK_TESTING to avoid loading Log4J configurations in testing class paths af2b5a9 [Cheng Lian] Removes log level hacks from TestHiveContext d116405 [wangfei] make sure that log4j level is INFO ee92a82 [Cheng Lian] Relaxes timeout 7fd6757 [Cheng Lian] Fixes test suites in hive-thriftserver Conflicts: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
The removed `Future` was used to end the test case as soon as the Spark SQL CLI process exits. When the process exits prematurely, this mechanism prevents the test case to wait until timeout. But it also creates a race condition: when `foundAllExpectedAnswers.tryFailure` is called, there are chances that the last expected output line of the CLI process hasn't been caught by the main logics of the test code, thus fails the test case. Removing this `Future` doesn't affect correctness. Author: Cheng Lian <lian.cs.zju@gmail.com> Closes apache#2823 from liancheng/clean-clisuite and squashes the following commits: 489a97c [Cheng Lian] Fixes the race condition that may cause test failure
`CliSuite` has been flaky for a while, this PR tries to improve this situation by fixing a race condition in `CliSuite`. The `captureOutput` function is used to capture both stdout and stderr output of the forked external process in two background threads and search for expected strings, but wasn't been properly synchronized before. Author: Cheng Lian <lian@databricks.com> Closes apache#3060 from liancheng/fix-cli-suite and squashes the following commits: a70569c [Cheng Lian] Fixes race condition in CliSuite
QA tests have finished for PR 3113 at commit
|
Test FAILed. |
@liancheng ping |
@liancheng and I locally test is ok for this. retest this again? |
QA tests have started for PR 3113 at commit
|
QA tests have finished for PR 3113 at commit
|
@liancheng, please refer to #2164, I think there is some relevance |
Author: Michael Armbrust <michael@databricks.com> Closes apache#2164 from marmbrus/shufflePartitions and squashes the following commits: 0da1e8c [Michael Armbrust] test hax ef2d985 [Michael Armbrust] more test hacks. 2dabae3 [Michael Armbrust] more test fixes 0bdbf21 [Michael Armbrust] Make parquet tests less order dependent b42eeab [Michael Armbrust] increase test parallelism 80453d5 [Michael Armbrust] Decrease partitions when testing
…text is created. This will allow us to take advantage of things like the spark.defaults file. Author: Michael Armbrust <michael@databricks.com> Closes apache#2493 from marmbrus/copySparkConf and squashes the following commits: 0bd1377 [Michael Armbrust] Copy SQL configuration from SparkConf when a SQLContext is created. Conflicts: sql/core/src/main/scala/org/apache/spark/sql/test/TestSQLContext.scala Conflicts: sql/core/src/main/scala/org/apache/spark/sql/test/TestSQLContext.scala
72bffb1
to
d354161
Compare
QA tests have started for PR 3113 at commit
|
QA tests have finished for PR 3113 at commit
|
Test PASSed. |
@@ -1551,7 +1550,6 @@ private[spark] object Utils extends Logging { | |||
"%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n") | |||
PropertyConfigurator.configure(pro) | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this file touched only for whitespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these changes were introduced while resolving merge conflicts. Will revert them.
@liancheng the original test cases did not have an order by and yet expected the answers to come out in some order. This broke when we switched the TestSQLContext to I'm going to merge this and try to manually roll back the whitespace changed file in core. |
This PR backports apache#2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns `0.12.0` when `spark.sql.hive.version` is queried. 6 other commits on which apache#2843 depends were also backported, they are: - apache#2887 for `SessionState` lifecycle control - apache#2675, apache#2823 & apache#3060 for major test suite refactoring and bug fixes - apache#2164, for Parquet test suites updates - apache#2493, for reading `spark.sql.*` configurations Author: Cheng Lian <lian@databricks.com> Author: Cheng Lian <lian.cs.zju@gmail.com> Author: Michael Armbrust <michael@databricks.com> Closes apache#3113 from liancheng/get-info-for-1.1 and squashes the following commits: d354161 [Cheng Lian] Provides Spark and Hive version in HiveThriftServer2 for branch-1.1 0c2a244 [Michael Armbrust] [SPARK-3646][SQL] Copy SQL configuration from SparkConf when a SQLContext is created. 3202a36 [Michael Armbrust] [SQL] Decrease partitions when testing 7f395b7 [Cheng Lian] [SQL] Fixes race condition in CliSuite 0dd28ec [Cheng Lian] [SQL] Fixes the race condition that may cause test failure 5928b39 [Cheng Lian] [SPARK-3809][SQL] Fixes test suites in hive-thriftserver faeca62 [Cheng Lian] [SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2
This has been merged, can you close it? |
This PR backports #2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns
0.12.0
whenspark.sql.hive.version
is queried.6 other commits on which #2843 depends were also backported, they are:
SessionState
lifecycle controlspark.sql.*
configurations