[SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default #1051

Closed
wants to merge 2 commits

Conversation

@ScrapCodes (Member)

No description provided.

@rxin (Contributor) commented Jun 12, 2014

Now that I think about it, perhaps compression should be a setting for individual RDDs?

@ScrapCodes (Member, Author)

I agree! Perhaps as part of the storage levels?
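
For reference, a minimal sketch of the status quo this exchange is reacting to, assuming the Spark 1.x PySpark API of the time: the serialized-vs-deserialized choice already lives in StorageLevel and is chosen per RDD, while compression is a single global "spark.rdd.compress" flag rather than a per-RDD setting.

from pyspark import SparkConf, SparkContext, StorageLevel

# Global flag: compression applies to every serialized RDD, not per RDD.
conf = SparkConf().setAppName("storage-level-demo").set("spark.rdd.compress", "true")
sc = SparkContext(conf=conf)

rdd = sc.parallelize(range(1000))

# The storage level, by contrast, is chosen per RDD; MEMORY_ONLY_SER
# keeps serialized bytes in memory (compressed via the flag above).
rdd.persist(StorageLevel.MEMORY_ONLY_SER)
print(rdd.count())

sc.stop()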

@@ -99,6 +99,12 @@ def set(self, key, value):
        self._jconf.set(key, unicode(value))
        return self

    def setIfMissing(self, key, value):
        """Set a configuration property, if not already set."""
        if self.get(key) == None:
            self.set(key, value)
        return self
Contributor:

if key not in self:

Member Author:

We may have to make SparkConf iterable for this to work. It does not work as is.

Contributor:

Ah, yes. But "self.get(key) is None" sounds more Pythonic :-)
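
To make the exchange concrete, a toy stand-in (not the real pyspark.SparkConf) showing the idioms discussed above: "key not in self" only works once the class supports containment checks via __contains__, which is what making SparkConf container-like would amount to, while "self.get(key) is None" works on the existing get/set API.

class SparkConfSketch:
    """Toy illustration only; the real SparkConf delegates to a JVM object."""

    def __init__(self):
        self._props = {}

    def get(self, key, default=None):
        return self._props.get(key, default)

    def set(self, key, value):
        self._props[key] = value
        return self

    def __contains__(self, key):
        # Without this method, "key not in self" raises TypeError,
        # which is why the suggestion "does not work as is" on the
        # real class.
        return key in self._props

    def setIfMissing(self, key, value):
        # "is None" is the Pythonic comparison; "== None" also works
        # but is discouraged by PEP 8.
        if self.get(key) is None:
            self.set(key, value)
        return self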

@mateiz (Contributor) commented Jul 23, 2014

Jenkins, test this please

@SparkQA commented Jul 23, 2014

QA tests have started for PR 1051. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17058/consoleFull

@SparkQA commented Jul 23, 2014

QA results for PR 1051:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17058/consoleFull

@SparkQA commented Jul 24, 2014

QA tests have started for PR 1051. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17104/consoleFull

@SparkQA commented Jul 24, 2014

QA results for PR 1051:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17104/consoleFull

@SparkQA commented Jul 24, 2014

QA tests have started for PR 1051. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17111/consoleFull

@SparkQA commented Jul 24, 2014

QA results for PR 1051:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test output:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/17111/consoleFull

@mateiz (Contributor) commented Jul 25, 2014

Thanks Prashant, I've merged this.

@asfgit closed this in eff9714 Jul 25, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
[SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes apache#1051 from ScrapCodes/SPARK-2014/pyspark-cache and squashes the following commits:

f192df7 [Prashant Sharma] Code Review
2a2f43f [Prashant Sharma] [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default
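
For later readers, a minimal sketch of what the merged default means in practice, assuming the Spark 1.x API of the time. The patch wires these defaults up inside PySpark itself, so the explicit calls below are illustrative rather than required:

from pyspark import SparkConf, SparkContext, StorageLevel

# setIfMissing (added by this patch) keeps an explicit user choice intact
# and only fills in the default when the key is unset.
conf = SparkConf().setAppName("cache-default-demo")
conf.setIfMissing("spark.rdd.compress", "true")

sc = SparkContext(conf=conf)
rdd = sc.parallelize(range(1000))

# After this change, a plain cache() in PySpark behaves roughly like
# persisting with MEMORY_ONLY_SER (serialized in memory, compressed via
# the flag above) instead of deserialized MEMORY_ONLY.
rdd.cache()
print(rdd.count())

sc.stop()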
@ScrapCodes deleted the SPARK-2014/pyspark-cache branch June 3, 2015
@Taco-W commented Feb 2, 2017

Hi guys,

Could anyone please explain why we don't want to use MEMORY_ONLY in PySpark by default?

Thanks a lot.

wangyum pushed a commit that referenced this pull request May 26, 2023
[CARMEL-6121] Skip subsequent checks after the first authorization is passed (#1051)

* [CARMEL-6121] Skip subsequent checks after the first authorization is passed

* [CARMEL-6121] Skip subsequent checks after the first authorization is passed
udaynpusa pushed a commit to mapr/spark that referenced this pull request Jan 30, 2024