[SPARK-39846][CORE] Enable `spark.dynamicAllocation.shuffleTracking.enabled` by default

### What changes were proposed in this pull request?

This PR aims to enable `spark.dynamicAllocation.shuffleTracking.enabled` by default in Apache Spark 3.4 when `spark.dynamicAllocation.enabled=true` and `spark.shuffle.service.enabled=false`.

### Why are the changes needed?

Here is a brief history of `spark.dynamicAllocation.shuffleTracking.enabled`.
- Apache Spark 3.0.0 added it via SPARK-27963 for K8s environments.
  > One immediate use case is the ability to use dynamic allocation on Kubernetes, which doesn't yet have that service.
- Apache Spark 3.1.1 made K8s GA via SPARK-33005 and started to use it widely in K8s.
- Apache Spark 3.2.0 started to support shuffle data recovery on reused PVCs via SPARK-35593.
- Apache Spark 3.3.0 removed the `Experimental` tag from it via SPARK-39322.
- Apache Spark 3.4.0 will enable it by default via SPARK-39846 (this PR) to help Spark K8s users adopt dynamic allocation more easily.
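With this default, a minimal K8s dynamic-allocation setup no longer needs to set shuffle tracking explicitly. The following `spark-submit` invocation is a sketch only; the API server address, container image, and application jar are placeholders, not from this PR:

```shell
# Sketch of a K8s job with dynamic allocation on Spark 3.4+.
# All names and addresses below are placeholders.
# spark.dynamicAllocation.shuffleTracking.enabled now defaults to true
# when there is no external shuffle service; it is shown here only to
# make the effective configuration explicit.
spark-submit \
  --master k8s://https://example-apiserver:6443 \
  --deploy-mode cluster \
  --conf spark.kubernetes.container.image=example/spark:3.4.0 \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=false \
  --conf spark.dynamicAllocation.shuffleTracking.enabled=true \
  --class org.apache.spark.examples.SparkPi \
  local:///opt/spark/examples/jars/spark-examples.jar
```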

### Does this PR introduce _any_ user-facing change?

The `Core` migration guide is updated.

### How was this patch tested?

Pass the CIs, including the K8s IT GitHub Action job.

Closes apache#37257 from dongjoon-hyun/SPARK-39846.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
dongjoon-hyun committed Jul 23, 2022
1 parent ff5fc74 commit d762205
Showing 3 changed files with 4 additions and 2 deletions.
@@ -651,7 +651,7 @@ package object config {
     ConfigBuilder("spark.dynamicAllocation.shuffleTracking.enabled")
       .version("3.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   private[spark] val DYN_ALLOCATION_SHUFFLE_TRACKING_TIMEOUT =
     ConfigBuilder("spark.dynamicAllocation.shuffleTracking.timeout")
2 changes: 1 addition & 1 deletion docs/configuration.md
@@ -2732,7 +2732,7 @@ Apart from these, the following properties are also available, and may be useful
   </tr>
   <tr>
     <td><code>spark.dynamicAllocation.shuffleTracking.enabled</code></td>
-    <td><code>false</code></td>
+    <td><code>true</code></td>
     <td>
       Enables shuffle file tracking for executors, which allows dynamic allocation
       without the need for an external shuffle service. This option will try to keep alive executors
2 changes: 2 additions & 0 deletions docs/core-migration-guide.md
@@ -26,6 +26,8 @@ license: |
 
 - Since Spark 3.4, Spark driver will own `PersistentVolumeClaim`s and try to reuse them if they are not assigned to live executors. To restore the behavior before Spark 3.4, you can set `spark.kubernetes.driver.ownPersistentVolumeClaim` to `false` and `spark.kubernetes.driver.reusePersistentVolumeClaim` to `false`.
 
+- Since Spark 3.4, Spark driver will track shuffle data when dynamic allocation is enabled without shuffle service. To restore the behavior before Spark 3.4, you can set `spark.dynamicAllocation.shuffleTracking.enabled` to `false`.
+
 ## Upgrading from Core 3.2 to 3.3
 
 - Since Spark 3.3, Spark migrates its log4j dependency from 1.x to 2.x because log4j 1.x has reached end of life and is no longer supported by the community. Vulnerabilities reported after August 2015 against log4j 1.x were not checked and will not be fixed. Users should rewrite original log4j properties files using log4j2 syntax (XML, JSON, YAML, or properties format). Spark rewrites the `conf/log4j.properties.template` which is included in Spark distribution, to `conf/log4j2.properties.template` with log4j2 properties format.
