[SPARK-19900][core]Remove driver when relaunching. #17888
Can one of the admins verify this patch?
Could you expand more on this change? It would be great if some failing test cases could be provided! Also cc @cloud-fan, since the original JIRA issue has been closed, should we close this PR?
Thanks for reviewing. Basically, the problem is that when a driver is relaunched and the original driver later reconnects, there will be an application which does not have a corresponding driver. I will try to add a test case.
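To make the scenario above concrete, here is a minimal toy model in plain Python. It is only a sketch under assumptions: `ToyMaster`, its method names, and the rule "keep the same driver id on relaunch" versus "remove the driver and register a fresh one" are hypothetical illustrations of the PR title, not Spark's actual `Master` code.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyMaster:
    """Hypothetical toy model of a standalone master; not Spark's Master."""
    remove_on_relaunch: bool
    drivers: dict = field(default_factory=dict)  # driver id -> worker id
    _ids: count = field(default_factory=count)

    def submit(self, worker_id: str) -> str:
        """Register a new driver on the given worker and return its id."""
        driver_id = f"driver-{next(self._ids)}"
        self.drivers[driver_id] = worker_id
        return driver_id

    def worker_lost(self, worker_id: str, new_worker_id: str) -> None:
        """Relaunch every driver that was running on the lost worker."""
        for driver_id, owner in list(self.drivers.items()):
            if owner != worker_id:
                continue
            if self.remove_on_relaunch:
                # Assumed patched behaviour: forget the old driver and
                # register the relaunched one under a fresh id.
                del self.drivers[driver_id]
                self.submit(new_worker_id)
            else:
                # Assumed old behaviour: keep the id, move it elsewhere.
                self.drivers[driver_id] = new_worker_id

    def worker_reconnected(self, reported: list) -> list:
        """Return the reported driver ids the master does not recognise,
        i.e. the ones the reconnecting worker should be told to kill."""
        return [d for d in reported if d not in self.drivers]
```

In this model, with `remove_on_relaunch=False` the reconnecting worker's old driver id is still known to the master, so nothing tells it to stop and two copies of the driver can coexist. With `remove_on_relaunch=True` the old id is unknown and is returned as one to kill.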
As the reporter gave code to reproduce the bug, I reopened that ticket.
## What changes were proposed in this pull request? This PR adds new unit tests to support ShuffleDataBlockId, ShuffleIndexBlockId, TempShuffleBlockId and TempLocalBlockId. ## How was this patch tested? The new unit tests. Author: caoxuewen <cao.xuewen@zte.com.cn> Closes apache#17794 from heary-cao/blockidsuite.
## What changes were proposed in this pull request? This PR adds documentation to the ALS code. ## How was this patch tested? Existing tests were used. mengxr srowen This contribution is my original work. I have the license to work on this project under the Spark project’s open source license. Author: Daniel Li <dan@danielyli.com> Closes apache#17793 from danielyli/spark-20484.
…tore access. ## What changes were proposed in this pull request? Add a new `spark-hadoop-cloud` module and maven profile to pull in object store support from `hadoop-openstack`, `hadoop-aws` and `hadoop-azure` (Hadoop 2.7+) JARs, along with their dependencies, fixing up the dependencies so that everything works, in particular Jackson. It restores `s3n://` access to S3, adds its `s3a://` replacement, OpenStack `swift://` and azure `wasb://`. There's a documentation page, `cloud_integration.md`, which covers the basic details of using Spark with object stores, referring the reader to the supplier's own documentation, with specific warnings on security and the possible mismatch between a store's behavior and that of a filesystem. In particular, users are advised be very cautious when trying to use an object store as the destination of data, and to consult the documentation of the storage supplier and the connector. (this is the successor to apache#12004; I can't re-open it) ## How was this patch tested? Downstream tests exist in [https://github.com/steveloughran/spark-cloud-examples/tree/master/cloud-examples](https://github.com/steveloughran/spark-cloud-examples/tree/master/cloud-examples) Those verify that the dependencies are sufficient to allow downstream applications to work with s3a, azure wasb and swift storage connectors, and perform basic IO & dataframe operations thereon. All seems well. Manually clean build & verify that assembly contains the relevant aws-* hadoop-* artifacts on Hadoop 2.6; azure on a hadoop-2.7 profile. SBT build: `build/sbt -Phadoop-cloud -Phadoop-2.7 package` maven build `mvn install -Phadoop-cloud -Phadoop-2.7` This PR *does not* update `dev/deps/spark-deps-hadoop-2.7` or `dev/deps/spark-deps-hadoop-2.6`, because unless the hadoop-cloud profile is enabled, no extra JARs show up in the dependency list. The dependency check in Jenkins isn't setting the property, so the new JARs aren't visible. 
Author: Steve Loughran <stevel@apache.org> Author: Steve Loughran <stevel@hortonworks.com> Closes apache#17834 from steveloughran/cloud/SPARK-7481-current.
## What changes were proposed in this pull request? add environment ## How was this patch tested? wait for appveyor run Author: Felix Cheung <felixcheung_m@hotmail.com> Closes apache#17878 from felixcheung/appveyorrcran.
## What changes were proposed in this pull request? * Docs are consistent (across different `unix_timestamp` variants and their internal expressions) * typo hunting ## How was this patch tested? local build Author: Jacek Laskowski <jacek@japila.pl> Closes apache#17801 from jaceklaskowski/unix_timestamp.
## What changes were proposed in this pull request? - Add SparkR wrapper for `Dataset.alias`. - Adjust roxygen annotations for `functions.alias` (including example usage). ## How was this patch tested? Unit tests, `check_cran.sh`. Author: zero323 <zero323@users.noreply.github.com> Closes apache#17825 from zero323/SPARK-20550.
## What changes were proposed in this pull request? Adds Python wrappers for `DataFrameWriter.bucketBy` and `DataFrameWriter.sortBy` ([SPARK-16931](https://issues.apache.org/jira/browse/SPARK-16931)) ## How was this patch tested? Unit tests covering new feature. __Note__: Based on work of GregBowyer (f49b9a2) CC HyukjinKwon Author: zero323 <zero323@users.noreply.github.com> Author: Greg Bowyer <gbowyer@fastmail.co.uk> Closes apache#17077 from zero323/SPARK-16931.
## What changes were proposed in this pull request? This change allows timestamps in parquet-based hive table to behave as a "floating time", without a timezone, as timestamps are for other file formats. If the storage timezone is the same as the session timezone, this conversion is a no-op. When data is read from a hive table, the table property is *always* respected. This allows spark to not change behavior when reading old data, but read newly written data correctly (whatever the source of the data is). Spark inherited the original behavior from Hive, but Hive is also updating behavior to use the same scheme in HIVE-12767 / HIVE-16231. The default for Spark remains unchanged; created tables do not include the new table property. This will only apply to hive tables; nothing is added to parquet metadata to indicate the timezone, so data that is read or written directly from parquet files will never have any conversions applied. ## How was this patch tested? Added a unit test which creates tables, reads and writes data, under a variety of permutations (different storage timezones, different session timezones, vectorized reading on and off). Author: Imran Rashid <irashid@cloudera.com> Closes apache#16781 from squito/SPARK-12297.
## What changes were proposed in this pull request? set timezone on windows ## How was this patch tested? unit test, AppVeyor Author: Felix Cheung <felixcheung_m@hotmail.com> Closes apache#17892 from felixcheung/rtimestamptest.
…LTER TABLE SET/UNSET TBLPROPERTIES ddl ### What changes were proposed in this pull request? The table comment was not being set or unset by an **ALTER TABLE SET/UNSET TBLPROPERTIES** query, e.g. `ALTER TABLE table_with_comment SET TBLPROPERTIES("comment"= "modified comment")`. When a user alters the table properties and adds or updates the table comment, the comment field of the **CatalogTable** instance is not updated, and the old table comment, if one exists, is still shown to the user. To handle this, the comment field in **CatalogTable** is now updated with the newly added or modified comment, along with the other table-level properties, when the user executes an **ALTER TABLE SET TBLPROPERTIES** query. This PR also unsets the table comment when the user executes **ALTER TABLE UNSET TBLPROPERTIES** to remove it, e.g. `ALTER TABLE table_comment UNSET TBLPROPERTIES IF EXISTS ('comment')`. ### How was this patch tested? Added test cases to **SQLQueryTestSuite** that verify the table comment with a desc formatted table query after adding or modifying the comment through **AlterTableSetPropertiesCommand** and after unsetting it through **AlterTableUnsetPropertiesCommand**. Author: sujith71955 <sujithchacko.2010@gmail.com> Closes apache#17649 from sujith71955/alter_table_comment.
## What changes were proposed in this pull request? Fix typo in vignettes Author: Wayne Zhang <actuaryzhang@uber.com> Closes apache#17884 from actuaryzhang/typo.
…ptions Signed-off-by: liuxian <liu.xian3@zte.com.cn> ## What changes were proposed in this pull request? When the input parameter is null, a runtime exception may occur. ## How was this patch tested? Existing unit tests Author: liuxian <liu.xian3@zte.com.cn> Closes apache#17796 from 10110346/wip_lx_0428.
… information ## What changes were proposed in this pull request? When calling the `getLocations` method of BlockManager, we only compare the hosts of the data blocks. Non-local data blocks are selected at random, so the selected data block may be in a different rack. This patch therefore also sorts the locations by rack. ## How was this patch tested? New test case. Author: Xianyang Liu <xianyang.liu@intel.com> Closes apache#17300 from ConeyLiu/blockmanager.
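The ordering described above can be sketched in a few lines of Python. This is a hedged illustration, not the BlockManager code: `sort_locations`, the `(host, rack)` tuples and the three priority levels are assumptions, and the random shuffling within each group is left out so the result is deterministic.

```python
def sort_locations(locations, local_host, local_rack):
    """Order (host, rack) pairs: same host first, then same rack, then the rest.

    `rack` may be None when no rack information is available.
    """
    def priority(location):
        host, rack = location
        if host == local_host:
            return 0
        if rack is not None and rack == local_rack:
            return 1
        return 2

    # sorted() is stable, so locations keep their input order within a group.
    return sorted(locations, key=priority)
```

For example, a reader on `host-a` in `rack-1` would try a replica on `host-a` first, then any replica in `rack-1`, and only then replicas in other racks.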
… cases Existing test cases for `recommendForAllX` methods (added in [SPARK-19535](https://issues.apache.org/jira/browse/SPARK-19535)) test `k < num items` and `k = num items`. Technically we should also test that `k > num items` returns the same results as `k = num items`. ## How was this patch tested? Updated existing unit tests. Author: Nick Pentreath <nickp@za.ibm.com> Closes apache#17860 from MLnick/SPARK-20596-als-rec-tests.
…v.sh' ## What changes were proposed in this pull request? Currently, `spark.executor.instances` is deprecated in `spark-env.sh`, because we suggest configuring it in `spark-defaults.conf` or another config file. The parameter also has no effect even if it is set in `spark-env.sh`, so this patch removes it. ## How was this patch tested? Existing tests. Author: Xianyang Liu <xianyang.liu@intel.com> Closes apache#17881 from ConeyLiu/deprecatedParam.
…ort configuration ## What changes were proposed in this pull request? After SPARK-10997, the client-mode Netty RpcEnv no longer needs to start a server, so the port configurations are no longer used. This PR proposes removing these two configurations: "spark.executor.port" and "spark.am.port". ## How was this patch tested? Existing UTs. Author: jerryshao <sshao@hortonworks.com> Closes apache#17866 from jerryshao/SPARK-20605.
## What changes were proposed in this pull request? Cleaning existing temp tables before running tableNames tests ## How was this patch tested? SparkR Unit tests Author: Hossein <hossein@databricks.com> Closes apache#17903 from falaki/SPARK-20661.
## What changes were proposed in this pull request? Change it to check for relative count like in this test https://github.com/apache/spark/blame/master/R/pkg/inst/tests/testthat/test_sparkSQL.R#L3355 for catalog APIs ## How was this patch tested? unit tests, this needs to combine with another commit with SQL change to check Author: Felix Cheung <felixcheung_m@hotmail.com> Closes apache#17905 from felixcheung/rtabletests.
The recommendForAll of MLLIB ALS is very slow. GC is a key problem of the current method. Each task uses the following code to hold the temporary result: `val output = new Array[(Int, (Int, Double))](m*n)`, with m = n = 4096 (the default value; there is no method to set it), so output is about 4k * 4k * (4 + 4 + 8) = 256M. This is a large amount of memory; it causes serious GC problems and frequent OOMs.

Actually, we don't need to save all of the temporary result. Suppose we recommend the topK (topK is about 10 or 20) products for each user; we then only need 4k * topK * (4 + 4 + 8) memory to save the temporary result.

The Test Environment: 3 workers, each with 10 cores, 30G memory and 1 executor.

The Data: User 480,000, and Item 17,000

| BlockSize | 1024 | 2048 | 4096 | 8192 |
| --- | --- | --- | --- | --- |
| Old method | 245s | 332s | 488s | OOM |
| This solution | 121s | 118s | 117s | 120s |

The existing UT. Author: Peng <peng.meng@intel.com> Author: Peng Meng <peng.meng@intel.com> Closes apache#17742 from mpjlu/OptimizeAls.
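The memory argument above can be checked with a short Python sketch. It is an illustration under assumptions, not the ALS implementation: the helper names are hypothetical, the byte counts cover only the raw `(Int, (Int, Double))` payload quoted in the commit message (no JVM object overhead), and the bounded min-heap is one common way to keep only the top K scores.

```python
import heapq

BYTES_PER_ENTRY = 4 + 4 + 8  # Int user id, Int item id, Double score


def dense_buffer_bytes(m, n):
    """Payload size of a buffer holding every user/item pair of a block."""
    return m * n * BYTES_PER_ENTRY


def top_k_buffer_bytes(m, k):
    """Payload size when only the k best items per user are kept."""
    return m * k * BYTES_PER_ENTRY


def top_k_items(scores, k):
    """Return the k best (item, score) pairs, best first, via a bounded min-heap."""
    if k <= 0:
        return []
    heap = []  # holds (score, item); heap[0] is the weakest kept entry
    for item, score in enumerate(scores):
        if len(heap) < k:
            heapq.heappush(heap, (score, item))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, item))
    return [(item, score) for score, item in sorted(heap, reverse=True)]
```

With m = n = 4096, `dense_buffer_bytes` gives 268,435,456 bytes (256 MiB), while `top_k_buffer_bytes(4096, 10)` gives 655,360 bytes (640 KiB). When `k` exceeds the number of items, `top_k_items` simply returns every item in score order.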
This PR is a `DataFrame` version of apache#17742 for [SPARK-11968](https://issues.apache.org/jira/browse/SPARK-11968), for improving the performance of `recommendAll` methods. ## How was this patch tested? Existing unit tests. Author: Nick Pentreath <nickp@za.ibm.com> Closes apache#17845 from MLnick/ml-als-perf.
…ception ## What changes were proposed in this pull request? Added a check for the number of defined values. Previously the argmax function assumed that at least one value was defined if the vector size was greater than zero. ## How was this patch tested? Tests were added to the existing VectorsSuite to cover this case. Author: Jon McLean <jon.mclean@atsid.com> Closes apache#17877 from jonmclean/vectorArgmaxIndexBug.
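A small Python sketch of the edge case described above: a sparse vector with a positive size but no stored values. The function `sparse_argmax`, its return conventions (-1 for an empty vector, 0 when every entry is an implicit zero) and its tie handling are assumptions made for illustration; they are not copied from Spark's `Vectors` code.

```python
def sparse_argmax(size, indices, values):
    """Index of the largest entry of a sparse vector, or -1 if size == 0.

    `indices` is assumed sorted ascending; entries not listed are implicit zeros.
    """
    if size == 0:
        return -1
    if not values:
        # The added check: no stored values, so every entry is an implicit zero.
        return 0
    best_pos = max(range(len(values)), key=lambda i: values[i])
    best_idx, best_val = indices[best_pos], values[best_pos]
    if best_val <= 0 and len(values) < size:
        # Some implicit zero is at least as large as every stored value.
        stored = set(indices)
        first_zero = next(i for i in range(size) if i not in stored)
        if best_val < 0 or first_zero < best_idx:
            return first_zero
    return best_idx
```

Without the `if not values` guard, taking the maximum over an empty list of stored values would fail even though the vector itself is non-empty.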
## What changes were proposed in this pull request? Remove ML methods we deprecated in 2.1. ## How was this patch tested? Existing tests. Author: Yanbo Liang <ybliang8@gmail.com> Closes apache#17867 from yanboliang/spark-20606.
…ting the package of sql/core and sql/hive ## What changes were proposed in this pull request? So far, we do not drop all the cataloged objects after each package. Sometimes, we might hit strange test case errors because the previous test suite did not drop the cataloged/temporary objects (tables/functions/database). At least, we can first clean up the environment when completing the package of `sql/core` and `sql/hive`. ## How was this patch tested? N/A Author: Xiao Li <gatorsmile@gmail.com> Closes apache#17908 from gatorsmile/reset.
## What changes were proposed in this pull request? This pr added parsing rules to support aliases in table value functions. ## How was this patch tested? Added tests in `PlanParserSuite`. Author: Takeshi Yamamuro <yamamuro@apache.org> Closes apache#17666 from maropu/SPARK-20311.
## What changes were proposed in this pull request?
Replaced specific sequence encoders with generic sequence encoder to enable nesting of sequences. Does not add support for nested arrays as that cannot be solved in this way.

## How was this patch tested?
```bash
build/mvn -DskipTests clean package && dev/run-tests
```
Additionally in Spark shell:
```
scala> Seq(Seq(Seq(1))).toDS.collect()
res0: Array[Seq[Seq[Int]]] = Array(List(List(1)))
```
Author: Michal Senkyr <mike.senkyr@gmail.com> Closes apache#18011 from michalsenkyr/dataset-seq-nested.
…ing from Breeze sparse matrix ## What changes were proposed in this pull request? When two Breeze SparseMatrices are operated on, the result matrix may contain extra provisional 0 values in its rowIndices and data arrays. This makes them inconsistent with the colPtrs data, but Breeze gets away with the inconsistency by keeping a counter of the valid data. In Spark, when these matrices are converted to SparseMatrices, Spark relies solely on rowIndices, data, and colPtrs, which might be incorrect because of Breeze's internal hacks. Therefore, we need to slice both rowIndices and data using Breeze's counter of active data. This method is called at least by BlockMatrix when performing distributed block operations, causing exceptions on valid operations. See http://stackoverflow.com/questions/33528555/error-thrown-when-using-blockmatrix-add ## How was this patch tested? Added a test to MatricesSuite that verifies that the conversions are valid and that the code doesn't crash. Originally the same code would crash on Spark. Bugfix for https://issues.apache.org/jira/browse/SPARK-20687 Author: Ignacio Bermudez <ignaciobermudez@gmail.com> Author: Ignacio Bermudez Corrales <icorrales@splunk.com> Closes apache#17940 from ghoto/bug-fix/SPARK-20687.
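The slicing step described above can be sketched with plain Python lists standing in for the CSC arrays. This is a hedged illustration: `trim_csc` and its parameter names are hypothetical, and `active_size` plays the role of the counter of valid data mentioned in the commit message.

```python
def trim_csc(col_ptrs, row_indices, data, active_size):
    """Drop provisional trailing entries from the arrays of a CSC matrix.

    In compressed sparse column form the last column pointer equals the
    number of stored entries, so it must agree with `active_size`.
    """
    if col_ptrs[-1] != active_size:
        raise ValueError("col_ptrs and active_size disagree")
    if len(row_indices) < active_size or len(data) < active_size:
        raise ValueError("fewer stored entries than active_size")
    # Keep only the valid prefix of both arrays.
    return col_ptrs, row_indices[:active_size], data[:active_size]
```

After trimming, `row_indices` and `data` have exactly `col_ptrs[-1]` elements, which is the invariant a consumer relying only on the three arrays would expect.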
…or 2.2 Quick follow up to apache#17996 - forgot to add the HTML links to the relevant sections of the guide in the highlights list. ## How was this patch tested? Built docs locally and tested links. Author: Nick Pentreath <nickp@za.ibm.com> Closes apache#18043 from MLnick/SPARK-20506-2.2-migration-guide-2.
…and job detail page on spark web ui when speculative task(s) exist. ## What changes were proposed in this pull request? Changed the succeeded-task count on the job detail page from "completed = stageData.completedIndices.size" to "completed = stageData.numCompleteTasks", which makes the succeeded-task counts on the all jobs page and the job detail page more consistent and makes it easier to find which stages the speculative task(s) were in. ## How was this patch tested? manual tests Author: fjh100456 <fu.jinhua6@zte.com.cn> Closes apache#17923 from fjh100456/master.
… spark_* system directory ## What changes were proposed in this pull request? This PR fixes the SortShuffleSuite unit tests leaving behind residual spark_* system directories. For example, on Windows 7, after running the SortShuffleSuite unit tests, the system TMP directory still contains '..\AppData\Local\Temp\spark-f64121f9-11b4-4ffd-a4f0-cfca66643503', which is not deleted. ## How was this patch tested? Run SortShuffleSuite unit test. Author: caoxuewen <cao.xuewen@zte.com.cn> Closes apache#17869 from heary-cao/SortShuffleSuite.
… not working ## What changes were proposed in this pull request? On the status column of the table, I removed the condition that forced only the display value to take on the values Active, Blacklisted and Dead. Before the removal, the values used for sorting and filtering that particular column were True and False. ## How was this patch tested? Tested with Active, Blacklisted and Dead present as current status. Author: John Lee <jlee2@yahoo-inc.com> Closes apache#18036 from yoonlee95/SPARK-20813.
…bove threshold. ## What changes were proposed in this pull request? Currently, when the number of reduces is above 2000, HighlyCompressedMapStatus is used to store the sizes of blocks. In HighlyCompressedMapStatus, only the average size is stored for non-empty blocks, which is not good for memory control when we shuffle blocks. It makes sense to store the accurate size of a block when it is above the threshold. ## How was this patch tested? Added test in MapStatusSuite. Author: jinxing <jinxing6042@126.com> Closes apache#18031 from jinxing64/SPARK-20801.
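The idea above, an average for ordinary blocks plus exact sizes for the large ones, can be sketched as follows. This is a toy model under assumptions: `CompressedSizes` is a hypothetical class, and averaging only over the non-empty blocks at or below the threshold is a choice made here for illustration, not necessarily what HighlyCompressedMapStatus does.

```python
class CompressedSizes:
    """Toy size summary: exact sizes for huge blocks, an average for the rest."""

    def __init__(self, sizes, threshold):
        # Empty blocks are remembered so they always report size 0.
        self.empty = {i for i, s in enumerate(sizes) if s == 0}
        # Blocks above the threshold keep their accurate size.
        self.huge = {i: s for i, s in enumerate(sizes) if s > threshold}
        # Remaining non-empty blocks are summarised by one average.
        small = [s for s in sizes if 0 < s <= threshold]
        self.avg_small = sum(small) // len(small) if small else 0

    def size_of(self, block):
        """Estimated size of a block: 0, the exact huge size, or the average."""
        if block in self.empty:
            return 0
        return self.huge.get(block, self.avg_small)
```

With sizes `[0, 10, 30, 1000]` and a threshold of 100, the one large block reports its exact 1000 while the two small blocks both report the average of 20, so a reader can budget memory for the large block correctly.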
…T EXISTS

### What changes were proposed in this pull request?
Currently, we have a bug when we specify `IF NOT EXISTS` in `INSERT OVERWRITE` data source tables. For example, given a query:
```SQL
INSERT OVERWRITE TABLE $tableName partition (b=2, c=3) IF NOT EXISTS SELECT 9, 10
```
we will get the following error:
```
unresolved operator 'InsertIntoTable Relation[a#425,d#426,b#427,c#428] parquet, Map(b -> Some(2), c -> Some(3)), true, true;;
'InsertIntoTable Relation[a#425,d#426,b#427,c#428] parquet, Map(b -> Some(2), c -> Some(3)), true, true
+- Project [cast(9#423 as int) AS a#429, cast(10#424 as int) AS d#430]
   +- Project [9 AS 9#423, 10 AS 10#424]
      +- OneRowRelation$
```
This PR is to fix the issue to follow the behavior of Hive serde tables
> INSERT OVERWRITE will overwrite any existing data in the table or partition unless IF NOT EXISTS is provided for a partition

### How was this patch tested?
Modified an existing test case

Author: gatorsmile <gatorsmile@gmail.com> Closes apache#18050 from gatorsmile/insertPartitionIfNotExists.
…es and degreesOfFreedom in LR and GLR - Python version ## What changes were proposed in this pull request? SPARK-20097 exposed degreesOfFreedom in LinearRegressionSummary and numInstances in GeneralizedLinearRegressionSummary. Python API should be updated to reflect these changes. ## How was this patch tested? The existing UT Author: Peng <peng.meng@intel.com> Closes apache#18062 from mpjlu/spark-20764.
and contains scala classes

## What changes were proposed in this pull request?
This change ensures that all references to guava from within the yarn shuffle jar pointed to the shaded guava class already provided in the jar. Also, it explicitly excludes scala classes from being added to the jar.

## How was this patch tested?
Ran unit tests on the module and they passed. javap now returns the expected result - reference to the shaded guava under `org/spark_project` (previously this was referring to `com.google...`)
```
javap -cp common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar -c org/apache/spark/network/yarn/YarnShuffleService | grep Lists
57: invokestatic #138 // Method org/spark_project/guava/collect/Lists.newArrayList:()Ljava/util/ArrayList;
```
Guava is still shaded in the jar:
```
jar -tf common/network-yarn/target/scala-2.11/spark-2.3.0-SNAPSHOT-yarn-shuffle.jar | grep guava | head
META-INF/maven/com.google.guava/
META-INF/maven/com.google.guava/guava/
META-INF/maven/com.google.guava/guava/pom.properties
META-INF/maven/com.google.guava/guava/pom.xml
org/spark_project/guava/
org/spark_project/guava/annotations/
org/spark_project/guava/annotations/Beta.class
org/spark_project/guava/annotations/GwtCompatible.class
org/spark_project/guava/annotations/GwtIncompatible.class
org/spark_project/guava/annotations/VisibleForTesting.class
```
(not sure if the above META-INF/* is a problem or not)

I took this jar, deployed it on a yarn cluster with shuffle service enabled, and made sure the YARN node managers came up. An application with a shuffle was run and it succeeded.

Author: Mark Grover <mark@apache.org> Closes apache#17990 from markgrover/spark-20756.
## What changes were proposed in this pull request? support decision tree in R ## How was this patch tested? added tests Author: Zheng RuiFeng <ruifengz@foxmail.com> Closes apache#17981 from zhengruifeng/dt_r.
Restore code that was removed as part of SPARK-17979, but instead of using the deprecated env variable name to propagate the class path, use a new one. Verified by running "./bin/spark-class o.a.s.executor.CoarseGrainedExecutorBackend" manually. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#18037 from vanzin/SPARK-20814.
## What changes were proposed in this pull request? Add cot test in MathExpressionsSuite as apache#17999 (comment). ## How was this patch tested? unit tests Author: Yuming Wang <wgyumg@gmail.com> Closes apache#18039 from wangyum/SPARK-20751-test.
…ntImpl ### What changes were proposed in this pull request? After we adding a new field `stats` into `CatalogTable`, we should not expose Hive-specific Stats metadata to `MetastoreRelation`. It complicates all the related codes. It also introduces a bug in `SHOW CREATE TABLE`. The statistics-related table properties should be skipped by `SHOW CREATE TABLE`, since it could be incorrect in the newly created table. See the Hive JIRA: https://issues.apache.org/jira/browse/HIVE-13792 Also fix the issue to fill Hive-generated RowCounts to our stats. This PR is to handle Hive-specific Stats metadata in `HiveClientImpl`. ### How was this patch tested? Added a few test cases. Author: Xiao Li <gatorsmile@gmail.com> Closes apache#14971 from gatorsmile/showCreateTableNew.
…festForR ## What changes were proposed in this pull request? - Add a null check to RPackageUtils#checkManifestForR so that jars w/o manifests don't NPE. ## How was this patch tested? - Unit tests and manual tests. Author: James Shuster <jshuster@palantir.com> Closes apache#18040 from jrshust/feature/r-package-utils.
## What changes were proposed in this pull request? This change skips tests that use the Hadoop libraries while running on CRAN check with Windows as the operating system. This is to handle cases where the Hadoop winutils binaries are missing on the target system. The skipped tests consist of 1. Tests that save, load a model in MLlib 2. Tests that save, load CSV, JSON and Parquet files in SQL 3. Hive tests ## How was this patch tested? Tested by running on a local windows VM with HADOOP_HOME unset. Also testing with https://win-builder.r-project.org Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu> Closes apache#17966 from shivaram/sparkr-windows-cran.
… parsing consistent with old sql parser behavior ## What changes were proposed in this pull request? As srowen pointed in apache@609ba5f#commitcomment-22221259, the previous tests are not proper. This follow-up is going to fix the tests. ## How was this patch tested? Jenkins tests. Please review http://spark.apache.org/contributing.html before opening a pull request. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes apache#18048 from viirya/SPARK-20399-follow-up.
…ar SVM summary. ## What changes were proposed in this pull request? Joint coefficients with intercept for SparkR linear SVM summary. ## How was this patch tested? Existing tests. Author: Yanbo Liang <ybliang8@gmail.com> Closes apache#18035 from yanboliang/svm-r.
## What changes were proposed in this pull request? This patch renames BroadcastHint to ResolvedHint (and Hint to UnresolvedHint) so the hint framework is more generic and would allow us to introduce other hint types in the future without introducing new hint nodes. ## How was this patch tested? Updated test cases. Author: Reynold Xin <rxin@databricks.com> Closes apache#18072 from rxin/SPARK-20857.
The contribution is my original work and I license the work to the project under the project’s open source license. Note: the Teradata JDBC connector limits the row size to 64K. The default string datatype equivalent I used is a 255 character/byte length varchar. This effectively limits the max number of string columns to 250 when using the Teradata jdbc connector. ## What changes were proposed in this pull request? Added a teradataDialect for JDBC connection to Teradata. The Teradata dialect uses VARCHAR(255) in place of TEXT for string datatypes, and CHAR(1) in place of BIT(1) for boolean datatypes. ## How was this patch tested? I added two unit tests to double check that the types get set correctly for a teradata jdbc url. I also ran a couple manual tests to make sure the jdbc connector worked with teradata and to make sure that an error was thrown if a row could potentially exceed 64K (this error comes from the teradata jdbc connector, not from the spark code). I did not check how string columns longer than 255 characters are handled. Author: Kirby Linvill <kirby.linvill@teradata.com> Author: klinvill <kjlinvill@gmail.com> Closes apache#16746 from klinvill/master.
Sorry, it seems I made a mistake when rebasing. I will open another PR.
This is apache#17888. Below are some spark ui snapshots.

Master, after worker disconnects: <img width="1433" alt="master_disconnect" src="https://cloud.githubusercontent.com/assets/2576762/26398687/d0ee228e-40ac-11e7-986d-d3b57b87029f.png">

Master, after worker reconnects, notice the `running drivers` part: <img width="1412" alt="master_reconnects" src="https://cloud.githubusercontent.com/assets/2576762/26398697/d50735a4-40ac-11e7-80d8-6e9e1cf0b62f.png">

This patch, after worker disconnects: <img width="1412" alt="patch_disconnect" src="https://cloud.githubusercontent.com/assets/2576762/26398009/c015d3dc-40aa-11e7-8bb4-df11a1f66645.png">

This patch, after worker reconnects: (image)

cc cloud-fan jiangxb1987 Author: Li Yichao <lyc@zhihu.com> Closes apache#18084 from liyichao/SPARK-19900-1.
What changes were proposed in this pull request?
How was this patch tested?
manual tests
@CodingCat