[SQL] Decrease partitions when testing #2164
/** Fewer partitions to speed up testing. */
override private[spark] def numShufflePartitions: Int =
  getConf(SQLConf.SHUFFLE_PARTITIONS, "5").toInt
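For readers without the surrounding source, here is a minimal, self-contained sketch of the override pattern in the diff above. `Conf` and `TestConf` are hypothetical stand-ins, not the real Spark classes; the key string and the production default of 200 are assumed to match Spark's documented `spark.sql.shuffle.partitions` setting.

```scala
object ShufflePartitionsSketch {
  // Assumed to be the key behind SQLConf.SHUFFLE_PARTITIONS.
  val ShufflePartitionsKey = "spark.sql.shuffle.partitions"

  // Hypothetical, simplified configuration holder (not Spark's SQLConf).
  class Conf(settings: Map[String, String]) {
    // Returns the configured value, or the supplied default when unset.
    def getConf(key: String, default: String): String =
      settings.getOrElse(key, default)

    // Assumed production default of 200 shuffle partitions.
    def numShufflePartitions: Int =
      getConf(ShufflePartitionsKey, "200").toInt
  }

  // Test variant: only the fallback changes, so an explicit setting still wins.
  class TestConf(settings: Map[String, String]) extends Conf(settings) {
    override def numShufflePartitions: Int =
      getConf(ShufflePartitionsKey, "5").toInt
  }
}
```

Because only the default differs, a test that sets the key explicitly still gets the value it asked for.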
2?
I don't know... I was thinking a little more parallelism might be more likely to find bugs without incurring too much overhead. I could be convinced otherwise...
There isn't really any parallelism here, since we run with "local", which is single-threaded. Increasing this from 2 to 5 simply breaks each dataset into more chunks, to be processed sequentially.
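To make the point concrete, here is a rough sketch in plain Scala (no Spark involved) of what changing the partition count does under a single thread. `partitionOf`, `split`, and `sumByKey` are hypothetical helpers written for this illustration; the hash-modulo scheme is only similar in spirit to a hash partitioner.

```scala
object SequentialPartitionsSketch {
  // Non-negative modulo of the key's hash code.
  def partitionOf(key: Any, numPartitions: Int): Int = {
    val m = key.hashCode % numPartitions
    if (m < 0) m + numPartitions else m
  }

  // Break a dataset into numPartitions chunks by key.
  def split[V](data: Seq[(String, V)], numPartitions: Int): Vector[Seq[(String, V)]] =
    Vector.tabulate(numPartitions) { p =>
      data.filter { case (k, _) => partitionOf(k, numPartitions) == p }
    }

  // Aggregate chunk by chunk on the calling thread, one chunk after another.
  def sumByKey(data: Seq[(String, Int)], numPartitions: Int): Map[String, Int] =
    split(data, numPartitions)
      .map(chunk => chunk.groupBy(_._1).map { case (k, vs) => k -> vs.map(_._2).sum })
      .foldLeft(Map.empty[String, Int])(_ ++ _)
}
```

The result is the same for 2 or 5 partitions; only the number of sequentially processed chunks changes.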
I guess parallelism isn't really what I meant. I'm thinking more about bugs that could be related to expecting data to be copartitioned when it actually isn't.
That said, perhaps the test should also run in local[2] or higher. We have found a couple of bugs after deploying that are the result of concurrency issues (Scala reflection... I'm looking at you :P)
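A toy illustration of the copartitioning concern, in plain Scala rather than Spark: `zipJoin` is a hypothetical join that pairs partition i of one side with partition i of the other, which is only valid when both sides were partitioned by key in the same way. With a single partition the wrong assumption goes unnoticed; with more partitions it loses rows.

```scala
object CopartitionSketch {
  // Keyed partitioning: equal keys always land in the same partition index.
  def hashPartition[V](data: Seq[(Int, V)], n: Int): Vector[Seq[(Int, V)]] =
    Vector.tabulate(n)(p => data.filter { case (k, _) => Math.floorMod(k, n) == p })

  // Positional partitioning: rows are dealt out by index, ignoring the key.
  def roundRobin[V](data: Seq[(Int, V)], n: Int): Vector[Seq[(Int, V)]] =
    Vector.tabulate(n)(p => data.zipWithIndex.collect { case (row, i) if i % n == p => row })

  // Joins partition i with partition i only; correct only for copartitioned inputs.
  def zipJoin[A, B](left: Vector[Seq[(Int, A)]],
                    right: Vector[Seq[(Int, B)]]): Set[(Int, A, B)] =
    left.zip(right).flatMap { case (l, r) =>
      for ((k1, a) <- l; (k2, b) <- r if k1 == k2) yield (k1, a, b)
    }.toSet

  val left  = Seq(3 -> "a", 1 -> "b", 2 -> "c", 4 -> "d")
  val right = Seq(1 -> "x", 2 -> "y", 3 -> "z", 4 -> "w")

  // One partition hides the bug: all four keys still match.
  val onePartition = zipJoin(hashPartition(left, 1), roundRobin(right, 1))
  // Two partitions expose it: matching keys end up in different partitions.
  val mismatched = zipJoin(hashPartition(left, 2), roundRobin(right, 2))
  // Truly copartitioned inputs join correctly at any partition count.
  val copartitioned = zipJoin(hashPartition(left, 2), hashPartition(right, 2))
}
```

This is the sense in which a test default above 1 can surface such bugs even on a single thread.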
+1 to run this using local[2]
QA tests have started for PR 2164 at commit
QA tests have finished for PR 2164 at commit
ok to test
Oops, some Parquet tests failed.
Jenkins, retest this please.
b035325 to ee687cd
Jenkins, test this please
ee687cd to 0bcaafa
0bcaafa to dc7cb6e
dc7cb6e to 2dabae3
I'm going to merge this to avoid more test timeouts.
Author: Michael Armbrust <michael@databricks.com>

Closes apache#2164 from marmbrus/shufflePartitions and squashes the following commits:

0da1e8c [Michael Armbrust] test hax
ef2d985 [Michael Armbrust] more test hacks.
2dabae3 [Michael Armbrust] more test fixes
0bdbf21 [Michael Armbrust] Make parquet tests less order dependent
b42eeab [Michael Armbrust] increase test parallelism
80453d5 [Michael Armbrust] Decrease partitions when testing
This PR backports apache#2843 to branch-1.1. The key difference is that this one doesn't support Hive 0.13.1 and thus always returns `0.12.0` when `spark.sql.hive.version` is queried.

Six other commits on which apache#2843 depends were also backported:

- apache#2887 for `SessionState` lifecycle control
- apache#2675, apache#2823 & apache#3060 for major test suite refactoring and bug fixes
- apache#2164 for Parquet test suite updates
- apache#2493 for reading `spark.sql.*` configurations

Author: Cheng Lian <lian@databricks.com>
Author: Cheng Lian <lian.cs.zju@gmail.com>
Author: Michael Armbrust <michael@databricks.com>

Closes apache#3113 from liancheng/get-info-for-1.1 and squashes the following commits:

d354161 [Cheng Lian] Provides Spark and Hive version in HiveThriftServer2 for branch-1.1
0c2a244 [Michael Armbrust] [SPARK-3646][SQL] Copy SQL configuration from SparkConf when a SQLContext is created.
3202a36 [Michael Armbrust] [SQL] Decrease partitions when testing
7f395b7 [Cheng Lian] [SQL] Fixes race condition in CliSuite
0dd28ec [Cheng Lian] [SQL] Fixes the race condition that may cause test failure
5928b39 [Cheng Lian] [SPARK-3809][SQL] Fixes test suites in hive-thriftserver
faeca62 [Cheng Lian] [SPARK-4037][SQL] Removes the SessionState instance created in HiveThriftServer2