[SPARK-3463] [PySpark] aggregate and show spilled bytes in Python #2336
Conversation
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
QA tests have started for PR 2336 at commit
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
QA tests have finished for PR 2336 at commit
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
@@ -242,7 +242,8 @@ class JobProgressListener(conf: SparkConf) extends SparkListener with Logging {
t.taskMetrics)
// Overwrite task metrics
t.taskMetrics = Some(taskMetrics)
// FIXME: deepcopy the metrics, or they will be the same object in local mode
t.taskMetrics = Some(scala.util.Marshal.load[TaskMetrics](scala.util.Marshal.dump(taskMetrics)))
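The aliasing problem this FIXME works around can be sketched in Python (the class and field names below are hypothetical stand-ins, not Spark's actual API): in local mode the executor and the listener share one process, so storing the metrics object directly stores a live reference that later task updates keep mutating, while a deep copy preserves a snapshot.

```python
import copy

class TaskMetrics:
    """Hypothetical stand-in for Spark's TaskMetrics."""
    def __init__(self):
        self.memory_bytes_spilled = 0
        self.disk_bytes_spilled = 0

# In local mode the "executor" and the listener live in the same process,
# so keeping the metrics object directly keeps a live reference to it.
live_metrics = TaskMetrics()

stored_by_reference = live_metrics            # what the original code did
stored_by_copy = copy.deepcopy(live_metrics)  # what the FIXME works around

# The executor keeps mutating the same object after the task-end event.
live_metrics.disk_bytes_spilled = 1024

print(stored_by_reference.disk_bytes_spilled)  # 1024 -- snapshot corrupted
print(stored_by_copy.disk_bytes_spilled)       # 0    -- snapshot preserved
```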
Do we want to do something similar to what you did in #2338 here, i.e. do it only if this is local mode?
I will rebase it after #2338 is merged.
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
@@ -27,12 +27,11 @@
# copy_reg module.
from pyspark.accumulators import _accumulatorRegistry
from pyspark.broadcast import Broadcast, _broadcastRegistry
from pyspark.cloudpickle import CloudPickler
A few lines prior to this, there was a comment
# CloudPickler needs to be imported so that depicklers are registered using the
# copy_reg module.
If this import is no longer necessary (was it ever?), then we should delete that comment, too.
cloudpickle is imported by serializers, so it's not needed here. The comments have been removed.
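The pattern being discussed is a side-effect import: merely importing a module that calls `copyreg.pickle` (Python 2's `copy_reg`) registers picklers process-wide, so the import matters even if nothing from the module is referenced. A minimal illustration with a made-up class, not PySpark's actual registration code:

```python
import copyreg
import pickle

class Point:
    """Hypothetical class with a custom pickle reducer."""
    def __init__(self, x, y):
        self.x, self.y = x, y

def _reduce_point(p):
    # Tell pickle how to reconstruct a Point: callable plus its arguments.
    return (Point, (p.x, p.y))

# Executing this line (e.g. at import time of some module) registers the
# reducer for the whole process; the importing code need not use the module.
copyreg.pickle(Point, _reduce_point)

restored = pickle.loads(pickle.dumps(Point(1, 2)))
print(restored.x, restored.y)  # 1 2
```

This is why dropping such an import is only safe once another module (here, `pyspark.serializers`) is guaranteed to perform the same registration.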
This looks good to me.
QA tests have started for PR 2336 at commit
QA tests have finished for PR 2336 at commit
Aggregate the number of bytes spilled to disk during aggregation or sorting, and show them in the Web UI.
This patch is blocked by SPARK-3465 (it includes a fix for that).
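The aggregation the patch describes amounts to summing per-task spill counters into job-level totals for display. A minimal sketch with invented sample numbers and field names (Spark's real metrics objects differ):

```python
# Hypothetical per-task metrics as reported at task end.
task_metrics = [
    {"memory_bytes_spilled": 512,  "disk_bytes_spilled": 2048},
    {"memory_bytes_spilled": 0,    "disk_bytes_spilled": 0},
    {"memory_bytes_spilled": 1024, "disk_bytes_spilled": 4096},
]

# Sum each counter across all tasks to get the values shown in the UI.
totals = {"memory_bytes_spilled": 0, "disk_bytes_spilled": 0}
for m in task_metrics:
    for key in totals:
        totals[key] += m[key]

print(totals)  # {'memory_bytes_spilled': 1536, 'disk_bytes_spilled': 6144}
```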