REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27

markhamstra · 2014-11-25T17:26:56Z

Just the last three commits are the ByteswapPartitioner addition.

…r file" This reverts commit 098f83c.

This reverts commit 685bdd2.

This reverts commit 72a4fdb.

This is the 1.1 version of apache#3302. There has been some refactoring in master so we can't cherry-pick that PR. Author: Andrew Or <andrew@databricks.com> Closes apache#3330 from andrewor14/sort-fetch-fail and squashes the following commits: 486fc49 [Andrew Or] Reset `elementsRead`

…sks; use HashedWheelTimer (For branch-1.1) This patch is intended to fix a subtle memory leak in ConnectionManager's ACK timeout TimerTasks: in the old code, each TimerTask held a reference to the message being sent and a cancelled TimerTask won't necessarily be garbage-collected until it's scheduled to run, so this caused huge buildups of messages that weren't garbage collected until their timeouts expired, leading to OOMs. This patch addresses this problem by capturing only the message ID in the TimerTask instead of the whole message, and by keeping a WeakReference to the promise in the TimerTask. I've also modified this code to use Netty's HashedWheelTimer, whose performance characteristics should be better for this use-case. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes apache#3321 from sarutak/connection-manager-timeout-bugfix and squashes the following commits: 786af91 [Kousuke Saruta] Fixed memory leak issue of ConnectionManager

Spark hangs with the following code: ~~~ sc.parallelize(1 to 10).zipWithIndex.repartition(10).count() ~~~ This is because ZippedWithIndexRDD triggers a job in getPartitions and it causes a deadlock in DAGScheduler.getPreferredLocs (synced). The fix is to compute `startIndices` during construction. This should be applied to branch-1.0, branch-1.1, and branch-1.2. pwendell Author: Xiangrui Meng <meng@databricks.com> Closes apache#3291 from mengxr/SPARK-4433 and squashes the following commits: c284d9f [Xiangrui Meng] fix a racing condition in zipWithIndex (cherry picked from commit bb46046) Signed-off-by: Xiangrui Meng <meng@databricks.com>

[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/3338)  Author: Cheng Lian <lian@databricks.com> Closes apache#3338 from liancheng/spark-3334-for-1.1 and squashes the following commits: bd17512 [Cheng Lian] Backports apache#3334 to branch-1.1

This is the branch-1.1 version of apache#3243. Author: Andrew Or <andrew@databricks.com> Closes apache#3355 from andrewor14/spill-log-bytes-1.1 and squashes the following commits: 36ec152 [Andrew Or] Log more precise representation of bytes in spilling code

Conflicts: assembly/pom.xml bagel/pom.xml core/pom.xml examples/pom.xml external/flume-sink/pom.xml external/flume/pom.xml external/kafka/pom.xml external/mqtt/pom.xml external/twitter/pom.xml external/zeromq/pom.xml extras/kinesis-asl/pom.xml extras/spark-ganglia-lgpl/pom.xml graphx/pom.xml mllib/pom.xml pom.xml repl/pom.xml sql/catalyst/pom.xml sql/core/pom.xml sql/hive-thriftserver/pom.xml sql/hive/pom.xml streaming/pom.xml tools/pom.xml yarn/pom.xml yarn/stable/pom.xml

This is the branch-1.1 version of apache#3353. This requires a separate PR because the code in master has been refactored a little to eliminate duplicate code. I have tested this on a standalone cluster. The goal is to merge this into 1.1.1. Author: Andrew Or <andrew@databricks.com> Closes apache#3354 from andrewor14/avoid-small-spills-1.1 and squashes the following commits: f2e552c [Andrew Or] Fix tests 7012595 [Andrew Or] Avoid many small spills

[maven-release-plugin] copy for tag v1.1.1-rc2 Conflicts: assembly/pom.xml bagel/pom.xml core/pom.xml examples/pom.xml external/flume-sink/pom.xml external/flume/pom.xml external/kafka/pom.xml external/mqtt/pom.xml external/twitter/pom.xml external/zeromq/pom.xml extras/kinesis-asl/pom.xml extras/spark-ganglia-lgpl/pom.xml graphx/pom.xml mllib/pom.xml pom.xml repl/pom.xml sql/catalyst/pom.xml sql/core/pom.xml sql/hive-thriftserver/pom.xml sql/hive/pom.xml streaming/pom.xml tools/pom.xml yarn/pom.xml yarn/stable/pom.xml

…treamFunctions.saveAsNewAPIHadoopFiles Solves two JIRAs in one shot - Makes the ForechDStream created by saveAsNewAPIHadoopFiles serializable for checkpoints - Makes the default configuration object used saveAsNewAPIHadoopFiles be the Spark's hadoop configuration Author: Tathagata Das <tathagata.das1565@gmail.com> Closes apache#3457 from tdas/savefiles-fix and squashes the following commits: bb4729a [Tathagata Das] Same treatment for saveAsHadoopFiles b382ea9 [Tathagata Das] Fix serialization issue in PairDStreamFunctions.saveAsNewAPIHadoopFiles. (cherry picked from commit 8838ad7) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>

This commit provides a script that computes the contributors list by linking the github commits with JIRA issues. Automatically translating github usernames remains a TODO at this point.

…) registered with the scheduler v1.1 backport for apache#3483 Author: roxchkplusony <roxchkplusony@gmail.com> Closes apache#3503 from roxchkplusony/bugfix/4626-1.1 and squashes the following commits: 234d350 [roxchkplusony] [SPARK-4626] Kill a task only if the executorId is (still) registered with the scheduler

…empDir() `File.exists()` and `File.mkdirs()` only throw `SecurityException` instead of `IOException`. Then, when an exception is thrown, `dir` should be reset too. Author: Liang-Chi Hsieh <viirya@gmail.com> Closes apache#3449 from viirya/fix_createtempdir and squashes the following commits: 36cacbd [Liang-Chi Hsieh] Use proper exception and reset variable. (cherry picked from commit 49fe879) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

This PR adds the Spark version number to the UI footer; this is how it looks: ![screen shot 2014-11-21 at 22 58 40](https://cloud.githubusercontent.com/assets/822522/5157738/f4822094-7316-11e4-98f1-333a535fdcfa.png) Author: Sean Owen <sowen@cloudera.com> Closes apache#3410 from srowen/SPARK-2143 and squashes the following commits: e9b3a7a [Sean Owen] Add Spark version to footer

org.apache.spark.SPARK_VERSION is new in 1.2; in earlier versions, we have to use SparkContext.SPARK_VERSION.

[<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/apache/spark/3498)  Author: Cheng Lian <lian@databricks.com> Closes apache#3498 from liancheng/fix-sql-doc-typo and squashes the following commits: 865ecd7 [Cheng Lian] Fixes formatting typo in SQL programming guide (cherry picked from commit 2a4d389) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

The link points to the old scala programming guide; it should point to the submitting applications page. This should be backported to 1.1.2 (it's been broken as of 1.0). Author: Kay Ousterhout <kayousterhout@gmail.com> Closes apache#3542 from kayousterhout/SPARK-4686 and squashes the following commits: a8fc43b [Kay Ousterhout] [SPARK-4686] Link to allowed master URLs is broken (cherry picked from commit d9a148b) Signed-off-by: Kay Ousterhout <kayousterhout@gmail.com>

Modified typo. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes apache#3560 from tsudukim/feature/SPARK-4701 and squashes the following commits: ed2a3f1 [Masayoshi TSUZUKI] Another whitespace position error. 1af3a35 [Masayoshi TSUZUKI] [SPARK-4701] Typo in sbt/sbt (cherry picked from commit 96786e3) Signed-off-by: Andrew Or <andrew@databricks.com>

ShuffleMemoryManager.tryToAcquire may return a negative value. The unit test demonstrates this bug. It will output `0 did not equal -200 granted is negative`. Author: zsxwing <zsxwing@gmail.com> Closes apache#3575 from zsxwing/SPARK-4715 and squashes the following commits: a193ae6 [zsxwing] Make sure tryToAcquire won't return a negative value

…N document. Added descriptions about these parameters. - spark.yarn.queue Modified description about the defalut value of this parameter. - spark.yarn.submit.file.replication Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes apache#3500 from tsudukim/feature/SPARK-4642 and squashes the following commits: ce99655 [Masayoshi TSUZUKI] better gramatically. 21cf624 [Masayoshi TSUZUKI] Removed intentionally undocumented properties. 88cac9b [Masayoshi TSUZUKI] [SPARK-4642] Documents about running-on-YARN needs update

…ver adds Executor The ExecutorInfo only reaches the RUNNING state if the Driver is alive to send the ExecutorStateChanged message to master. Else, appInfo.resetRetryCount() is never called and failing Executors will eventually exceed ApplicationState.MAX_NUM_RETRY, resulting in the application being removed from the master's accounting. Author: Mark Hamstra <markhamstra@gmail.com> Closes apache#3550 from markhamstra/SPARK-4498 and squashes the following commits: 8f543b1 [Mark Hamstra] Don't transition ExecutorInfo to RUNNING until Executor is added by Driver

Rewording was based on this discussion: http://apache-spark-developers-list.1001551.n3.nabble.com/RDD-data-flow-td9804.html This is the associated JIRA ticket: https://issues.apache.org/jira/browse/SPARK-4884 Author: Madhu Siddalingaiah <madhu@madhu.com> Closes apache#3722 from msiddalingaiah/master and squashes the following commits: 79e679f [Madhu Siddalingaiah] [DOC]: improve documentation 51d14b9 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' 38faca4 [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' cbccbfe [Madhu Siddalingaiah] Documentation: replace <b> with <code> (again) 332f7a2 [Madhu Siddalingaiah] Documentation: replace <b> with <code> cd2b05a [Madhu Siddalingaiah] Merge remote-tracking branch 'upstream/master' 0fc12d7 [Madhu Siddalingaiah] Documentation: add description for repartitionAndSortWithinPartitions (cherry picked from commit d5a596d) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Author: Sandy Ryza <sandy@cloudera.com> Closes apache#3684 from sryza/sandy-spark-3428 and squashes the following commits: cb827fe [Sandy Ryza] SPARK-3428. TaskMetrics for running tasks is missing GC time metrics (cherry picked from commit 283263f) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Author: Ryan Williams <ryan.blake.williams@gmail.com> Closes apache#2848 from ryan-williams/fetch-file and squashes the following commits: c14daff [Ryan Williams] Fix copy that was changed to a move inadvertently 8e39c16 [Ryan Williams] code review feedback 788ed41 [Ryan Williams] don’t redundantly overwrite executor JAR deps (cherry picked from commit 7981f96) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala

…file Since we can set spark executor memory and executor cores using property file, we must also be allowed to set the executor instances. Author: Kanwaljit Singh <kanwaljit.singh@guavus.com> Closes apache#1657 from kjsingh/branch-1.0 and squashes the following commits: d8a5a12 [Kanwaljit Singh] SPARK-2641: Fixing how spark arguments are loaded from properties file for num executors Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

Mvn Build Failed: value defaultProperties not found .Maybe related to this pr: apache@1d64812 andrewor14 can you look at this problem? Author: huangzhaowei <carlmartinmax@gmail.com> Closes apache#3749 from SaintBacchus/Mvn-Build-Fail and squashes the following commits: 8e2917c [huangzhaowei] Build Failed: value defaultProperties not found (cherry picked from commit a764960) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

In Scala, `map` and `flatMap` of `Iterable` will copy the contents of `Iterable` to a new `Seq`. Such as, ```Scala val iterable = Seq(1, 2, 3).map(v => { println(v) v }) println("Iterable map done") val iterator = Seq(1, 2, 3).iterator.map(v => { println(v) v }) println("Iterator map done") ``` outputed ``` 1 2 3 Iterable map done Iterator map done ``` So we should use 'iterator' to reduce memory consumed by join. Found by Johannes Simon in http://mail-archives.apache.org/mod_mbox/spark-user/201412.mbox/%3C5BE70814-9D03-4F61-AE2C-0D63F2DE4446%40mail.de%3E Author: zsxwing <zsxwing@gmail.com> Closes apache#3671 from zsxwing/SPARK-4824 and squashes the following commits: 48ee7b9 [zsxwing] Remove the explicit types 95d59d6 [zsxwing] Add 'iterator' to reduce memory consumed by join (cherry picked from commit c233ab3) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala

…stered Once the streaming receiver is de-registered at executor, the `ReceiverTrackerActor` needs to remove the corresponding reveiverInfo from the `receiverInfo` map at `ReceiverTracker`. Author: Ilayaperumal Gopinathan <igopinathan@pivotal.io> Closes apache#3647 from ilayaperumalg/receiverInfo-RTracker and squashes the following commits: 6eb97d5 [Ilayaperumal Gopinathan] Polishing based on the review 3640c86 [Ilayaperumal Gopinathan] Remove receiverInfo once receiver is de-registered (cherry picked from commit 10d69e9) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com> Conflicts: streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala

Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#3460 from vanzin/SPARK-4606 and squashes the following commits: 031207d [Marcelo Vanzin] [SPARK-4606] Send EOF to child JVM when there's no more data to read. (cherry picked from commit 7e2deb7) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Add `processingDelay`, `schedulingDelay` and `totalDelay` for the last completed batch. Add `lastReceivedBatchRecords` and `totalReceivedBatchRecords` to the received records counting. Author: jerryshao <saisai.shao@intel.com> Closes apache#3466 from jerryshao/SPARK-4537 and squashes the following commits: 00f5f7f [jerryshao] Change the code style and add totalProcessedRecords 44721a6 [jerryshao] Further address the comments c097ddc [jerryshao] Address the comments 02dd44f [jerryshao] Fix the addressed comments c7a9376 [jerryshao] Expand StreamingSource to add more metrics (cherry picked from commit f205fe4) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>

….environmentDetails Author: GuoQiang Li <witgo@qq.com> Closes apache#3788 from witgo/SPARK-4952 and squashes the following commits: d903529 [GuoQiang Li] Handle ConcurrentModificationExceptions in SparkEnv.environmentDetails (cherry picked from commit 080ceb7) Signed-off-by: Patrick Wendell <pwendell@gmail.com>

This helps to avoid build breaks when backporting patches that use org.apache.spark.SPARK_VERSION.

… with KryoSerializer This PR fixes an issue where PySpark broadcast variables caused NullPointerExceptions if KryoSerializer was used. The fix is to register PythonBroadcast with Kryo so that it's deserialized with a KryoJavaSerializer. Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3831 from JoshRosen/SPARK-4882 and squashes the following commits: 0466c7a [Josh Rosen] Register PythonBroadcast with Kryo. d5b409f [Josh Rosen] Enable registrationRequired, which would have caught this bug. 069d8a7 [Josh Rosen] Add failing test for SPARK-4882 (cherry picked from commit efa80a5) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

…rk works with KryoSerializer" This reverts commit 822a0b4. This fix does not apply to branch-1.1 or branch-1.0, since PythonBroadcast is new in 1.2.

…e 'spurious wakeup' Used `Condition` to rewrite `ContextWaiter` because it provides a convenient API `awaitNanos` for timeout. Author: zsxwing <zsxwing@gmail.com> Closes apache#3661 from zsxwing/SPARK-4813 and squashes the following commits: 52247f5 [zsxwing] Add explicit unit type be42bcf [zsxwing] Update as per review suggestion e06bd4f [zsxwing] Fix the issue that ContextWaiter didn't handle 'spurious wakeup' (cherry picked from commit 6a89782) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>

Several of our tests call System.setProperty (or test code which implicitly sets system properties) and don't always reset/clear the modified properties, which can create ordering dependencies between tests and cause hard-to-diagnose failures. This patch removes most uses of System.setProperty from our tests, since in most cases we can use SparkConf to set these configurations (there are a few exceptions, including the tests of SparkConf itself). For the cases where we continue to use System.setProperty, this patch introduces a `ResetSystemProperties` ScalaTest mixin class which snapshots the system properties before individual tests and to automatically restores them on test completion / failure. See the block comment at the top of the ResetSystemProperties class for more details. Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3739 from JoshRosen/cleanup-system-properties-in-tests and squashes the following commits: 0236d66 [Josh Rosen] Replace setProperty uses in two example programs / tools 3888fe3 [Josh Rosen] Remove setProperty use in LocalJavaStreamingContext 4f4031d [Josh Rosen] Add note on why SparkSubmitSuite needs ResetSystemProperties 4742a5b [Josh Rosen] Clarify ResetSystemProperties trait inheritance ordering. 0eaf0b6 [Josh Rosen] Remove setProperty call in TaskResultGetterSuite. 7a3d224 [Josh Rosen] Fix trait ordering 3fdb554 [Josh Rosen] Remove setProperty call in TaskSchedulerImplSuite bee20df [Josh Rosen] Remove setProperty calls in SparkContextSchedulerCreationSuite 655587c [Josh Rosen] Remove setProperty calls in JobCancellationSuite 3f2f955 [Josh Rosen] Remove System.setProperty calls in DistributedSuite cfe9cce [Josh Rosen] Remove use of system properties in SparkContextSuite 8783ab0 [Josh Rosen] Remove TestUtils.setSystemProperty, since it is subsumed by the ResetSystemProperties trait. 633a84a [Josh Rosen] Remove use of system properties in FileServerSuite 25bfce2 [Josh Rosen] Use ResetSystemProperties in UtilsSuite 1d1aa5a [Josh Rosen] Use ResetSystemProperties in SizeEstimatorSuite dd9492b [Josh Rosen] Use ResetSystemProperties in AkkaUtilsSuite b0daff2 [Josh Rosen] Use ResetSystemProperties in BlockManagerSuite e9ded62 [Josh Rosen] Use ResetSystemProperties in TaskSchedulerImplSuite 5b3cb54 [Josh Rosen] Use ResetSystemProperties in SparkListenerSuite 0995c4b [Josh Rosen] Use ResetSystemProperties in SparkContextSchedulerCreationSuite c83ded8 [Josh Rosen] Use ResetSystemProperties in SparkConfSuite 51aa870 [Josh Rosen] Use withSystemProperty in ShuffleSuite 60a63a1 [Josh Rosen] Use ResetSystemProperties in JobCancellationSuite 14a92e4 [Josh Rosen] Use withSystemProperty in FileServerSuite 628f46c [Josh Rosen] Use ResetSystemProperties in DistributedSuite 9e3e0dd [Josh Rosen] Add ResetSystemProperties test fixture mixin; use it in SparkSubmitSuite. 4dcea38 [Josh Rosen] Move withSystemProperty to TestUtils class. (cherry picked from commit 352ed6b) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/test/scala/org/apache/spark/ShuffleSuite.scala core/src/test/scala/org/apache/spark/SparkConfSuite.scala core/src/test/scala/org/apache/spark/SparkContextSchedulerCreationSuite.scala core/src/test/scala/org/apache/spark/SparkContextSuite.scala core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala core/src/test/scala/org/apache/spark/util/UtilsSuite.scala external/flume/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/mqtt/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/twitter/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java external/zeromq/src/test/java/org/apache/spark/streaming/LocalJavaStreamingContext.java tools/src/main/scala/org/apache/spark/tools/StoragePerfTester.scala

…ifest. Resolves a bug where the `Main-Class` from a .jar file wasn't being read in properly. This was caused by the fact that the `primaryResource` object was a URI and needed to be normalized through a call to `.getPath` before it could be passed into the `JarFile` object. Author: Brennon York <brennon.york@capitalone.com> Closes apache#3561 from brennonyork/SPARK-4298 and squashes the following commits: 5e0fce1 [Brennon York] Use string interpolation for error messages, moved comment line from original code to above its necessary code segment 14daa20 [Brennon York] pushed mainClass assignment into match statement, removed spurious spaces, removed { } from case statements, removed return values c6dad68 [Brennon York] Set case statement to support multiple jar URI's and enabled the 'file' URI to load the main-class 8d20936 [Brennon York] updated to reset the error message back to the default a043039 [Brennon York] updated to split the uri and jar vals 8da7cbf [Brennon York] fixes SPARK-4298 (cherry picked from commit 8e14c5e) Signed-off-by: Josh Rosen <joshrosen@databricks.com> Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala

This should fix a major cause of build breaks when running many parallel tests.

…able Spark Streaming's ReceiverMessage trait should extend Serializable in order to fix a subtle bug that only occurs when running on a real cluster: If you attempt to send a fire-and-forget message to a remote Akka actor and that message cannot be serialized, then this seems to lead to more-or-less silent failures. As an optimization, Akka skips message serialization for messages sent within the same JVM. As a result, Spark's unit tests will never fail due to non-serializable Akka messages, but these will cause mostly-silent failures when running on a real cluster. Before this patch, here was the code for ReceiverMessage: ``` /** Messages sent to the NetworkReceiver. */ private[streaming] sealed trait ReceiverMessage private[streaming] object StopReceiver extends ReceiverMessage ``` Since ReceiverMessage does not extend Serializable and StopReceiver is a regular `object`, not a `case object`, StopReceiver will throw serialization errors. As a result, graceful receiver shutdown is broken on real clusters (and local-cluster mode) but works in local modes. If you want to reproduce this, try running the word count example from the Streaming Programming Guide in the Spark shell: ``` import org.apache.spark._ import org.apache.spark.streaming._ import org.apache.spark.streaming.StreamingContext._ val ssc = new StreamingContext(sc, Seconds(10)) // Create a DStream that will connect to hostname:port, like localhost:9999 val lines = ssc.socketTextStream("localhost", 9999) // Split each line into words val words = lines.flatMap(_.split(" ")) import org.apache.spark.streaming.StreamingContext._ // Count each word in each batch val pairs = words.map(word => (word, 1)) val wordCounts = pairs.reduceByKey(_ + _) // Print the first ten elements of each RDD generated in this DStream to the console wordCounts.print() ssc.start() Thread.sleep(10000) ssc.stop(true, true) ``` Prior to this patch, this would work correctly in local mode but fail when running against a real cluster (it would report that some receivers were not shut down). Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3857 from JoshRosen/SPARK-5035 and squashes the following commits: 71d0eae [Josh Rosen] [SPARK-5035] ReceiverMessage trait should extend Serializable. (cherry picked from commit fe6efac) Signed-off-by: Tathagata Das <tathagata.das1565@gmail.com>

The job launched by DriverSuite should bind the web UI to an ephemeral port, since it looks like port contention in this test has caused a large number of Jenkins failures when many builds are started simultaneously. Our tests already disable the web UI, but this doesn't affect subprocesses launched by our tests. In this case, I've opted to bind to an ephemeral port instead of disabling the UI because disabling features in this test may mask its ability to catch certain bugs. See also: e24d3a9 Author: Josh Rosen <joshrosen@databricks.com> Closes apache#3873 from JoshRosen/driversuite-webui-port and squashes the following commits: 48cd05c [Josh Rosen] [HOTFIX] Bind web UI to ephemeral port in DriverSuite. (cherry picked from commit 0128398) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

Author: Dale <tigerquoll@outlook.com> Closes apache#3809 from tigerquoll/SPARK-4787 and squashes the following commits: 5661e01 [Dale] [SPARK-4787] Ensure that call to stop() doesn't lose the exception by using a finally block. 2172578 [Dale] [SPARK-4787] Stop context properly if an exception occurs during DAGScheduler initialization. (cherry picked from commit 3fddc94) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

SPARK-5132: stageInfoToJson: Stage Attempt Id stageInfoFromJson: Attempt Id Author: hushan[胡珊] <hushan@xiaomi.com> Closes apache#3932 from suyanNone/json-stage and squashes the following commits: 41419ab [hushan[胡珊]] Correct stage Attempt Id key in stageInfofromJson (cherry picked from commit d345ebe) Signed-off-by: Josh Rosen <joshrosen@databricks.com>

vnivargi · 2015-01-09T22:14:23Z

failed?

* Support custom labels on the driver pod. * Add integration test and fix logic. * Fix tests * Fix minor formatting mistake * Reduce unnecessary diff

Andrew Or and others added 30 commits November 17, 2014 11:25

Revert "[SPARK-4075] [Deploy] Jar url validation is not enough for Ja…

b528367

…r file" This reverts commit 098f83c.

Revert "[maven-release-plugin] prepare for next development iteration"

cf8d0ef

This reverts commit 685bdd2.

Revert "[maven-release-plugin] prepare release v1.1.1-rc1"

e4f5695

This reverts commit 72a4fdb.

Update CHANGES.txt for 1.1.1-rc2

aa3c794

[maven-release-plugin] prepare release v1.1.1-rc2

3693ae5

[maven-release-plugin] prepare for next development iteration

1df1c1d

Fixed merge typo

1b2b7dd

Update versions to 1.1.2-SNAPSHOT

6371737

[HOTFIX] Fixing broken build due to missing imports.

1a7f414

[Release] Automate generation of contributors list

a59c445

This commit provides a script that computes the contributors list by linking the github commits with JIRA issues. Automatically translating github usernames remains a TODO at this point.

[HOTFIX] Fix build break in 1a2508b

90d90b2

org.apache.spark.SPARK_VERSION is new in 1.2; in earlier versions, we have to use SparkContext.SPARK_VERSION.

[Release] Translate unknown author names automatically

aec20af

msiddalingaiah and others added 25 commits December 18, 2014 16:01

[HOTFIX] Add SPARK_VERSION to Spark package object.

d5e0a45

This helps to avoid build breaks when backporting patches that use org.apache.spark.SPARK_VERSION.

Revert "[SPARK-4882] Register PythonBroadcast with Kryo so that PySpa…

d6b8d2c

…rk works with KryoSerializer" This reverts commit 822a0b4. This fix does not apply to branch-1.1 or branch-1.0, since PythonBroadcast is new in 1.2.

[HOTFIX] Disable Spark UI in SparkSubmitSuite tests

1034707

This should fix a major cause of build breaks when running many parallel tests.

Merge branch 'branch-1.1' of github.com:apache/spark into csd-1.1

c07a691

Added ByteswapPartitioner and made it the defaultPartitioner

677281e

Added spark.default.partitioner to conf

470f026

spark.default.partitioner docs

99910b1

markhamstra changed the title ~~SKIPME Merged Apache Spark 1.1.1~~ REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner Jan 9, 2015

markhamstra assigned vnivargi Jan 9, 2015

markhamstra closed this Jan 9, 2015

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27

REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27

markhamstra commented Nov 25, 2014

vnivargi commented Jan 9, 2015

REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27

REL-505 merge Apache branch-1.1 bug fixes and add new ByteswapPartitioner #27

Conversation

markhamstra commented Nov 25, 2014

vnivargi commented Jan 9, 2015