[SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default #1051
Conversation
Now that I think about it, perhaps compression should be a setting for individual RDDs?
I agree! Perhaps it could be part of the storage levels?
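The idea floated here, a per-RDD compression flag folded into the storage level, could be sketched as a simple value type. Note this is only an illustration of the suggestion: the `compressed` field is hypothetical and is not part of Spark's actual `StorageLevel`, which controls disk/memory use, serialization, and replication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageLevel:
    """Sketch of a storage level carrying a hypothetical per-RDD compression flag."""
    use_disk: bool
    use_memory: bool
    deserialized: bool
    replication: int = 1
    compressed: bool = False  # hypothetical: not in Spark's real StorageLevel


# Serialized in-memory storage, with and without the proposed compression flag.
MEMORY_ONLY_SER = StorageLevel(use_disk=False, use_memory=True, deserialized=False)
MEMORY_ONLY_SER_COMPRESSED = StorageLevel(
    use_disk=False, use_memory=True, deserialized=False, compressed=True
)
```

With this shape, compression would be chosen per `persist()` call rather than globally via `spark.rdd.compress`.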
@@ -99,6 +99,12 @@ def set(self, key, value):
         self._jconf.set(key, unicode(value))
         return self

     def setIfMissing(self, key, value):
         """Set a configuration property, if not already set."""
         if self.get(key) == None:
if key not in self:
We may have to make SparkConf iterable for this to work. It does not work as is.
Ah, yes. But `self.get(key) is None` sounds more Pythonic :-)
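The review exchange above can be summarized in a small runnable sketch. `SimpleConf` below is a dict-backed stand-in for `SparkConf` (not Spark's real class, which delegates to a JVM object); it shows `setIfMissing` with the `is None` comparison the reviewers preferred over `== None`:

```python
class SimpleConf:
    """Minimal dict-backed stand-in for SparkConf, enough to show setIfMissing."""

    def __init__(self):
        self._settings = {}

    def set(self, key, value):
        self._settings[key] = str(value)
        return self

    def get(self, key, default=None):
        return self._settings.get(key, default)

    def setIfMissing(self, key, value):
        """Set a configuration property, if not already set."""
        if self.get(key) is None:  # 'is None' per the review comment, not '== None'
            self.set(key, value)
        return self


conf = SimpleConf()
conf.setIfMissing("spark.rdd.compress", "true")   # sets, since the key is absent
conf.set("spark.app.name", "demo")
conf.setIfMissing("spark.app.name", "other")      # no-op, key already set
```

The alternative suggestion, `if key not in self:`, would additionally require a `__contains__` method on the class, which is why the reviewer noted it does not work as-is.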
Jenkins, test this please
QA tests have started for PR 1051. This patch merges cleanly.
QA results for PR 1051:
QA tests have started for PR 1051. This patch merges cleanly.
QA results for PR 1051:
QA tests have started for PR 1051. This patch merges cleanly.
QA results for PR 1051:
Thanks Prashant, I've merged this.
…ion by default

Author: Prashant Sharma <prashant.s@imaginea.com>

Closes apache#1051 from ScrapCodes/SPARK-2014/pyspark-cache and squashes the following commits:

f192df7 [Prashant Sharma] Code Review
2a2f43f [Prashant Sharma] [SPARK-2014] Make PySpark store RDDs in MEMORY_ONLY_SER with compression by default
Hi guys, could anyone please explain why we don't want to use MEMORY_ONLY in PySpark by default? Thanks a lot.
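As background for this question (an illustration, not an authoritative answer from the PR): PySpark records already cross the Python/JVM boundary as pickled bytes, so a "deserialized" storage level buys little there, and `MEMORY_ONLY_SER` with compression trades some CPU for a smaller memory footprint. A standalone Python sketch of the space saving from compressing pickled, repetitive data, using only the standard library:

```python
import pickle
import zlib

# A partition of repetitive records, loosely resembling typical RDD contents.
records = [{"user": "user_%d" % (i % 100), "score": i % 10} for i in range(10_000)]

serialized = pickle.dumps(records)      # roughly what PySpark caches: pickled bytes
compressed = zlib.compress(serialized)  # roughly what spark.rdd.compress adds on top

print("pickled bytes:   ", len(serialized))
print("compressed bytes:", len(compressed))
```

On repetitive data like this, the compressed form is much smaller than the raw pickle, which is the memory-vs-CPU trade-off this PR makes the PySpark default.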