[SPARK-4410][SQL] Add support for external sort #3268
Test build #23381 has started for PR 3268 at commit
Test build #23381 has finished for PR 3268 at commit
Test PASSed.
82b787a to b98799d
Test build #23396 has started for PR 3268 at commit
Test build #23396 has finished for PR 3268 at commit
Test PASSed.
@@ -189,6 +191,7 @@ case class TakeOrdered(limit: Int, sortOrder: Seq[SortOrder], child: SparkPlan)

/**
 * :: DeveloperApi ::
 * Performs a sort on-heap.
Can we document the parameters, e.g. "global", for both Sort and ExternalSort?
LGTM other than the minor comment. One thing I noticed is that we'd want to control the closure size at some point. Right now the entire query plan is being captured by every stage.
@@ -17,6 +17,8 @@

package org.apache.spark.sql.execution

import org.apache.spark.util.collection.ExternalSorter
Import order here.
Test build #23449 has started for PR 3268 at commit
Test build #23449 has finished for PR 3268 at commit
Test PASSed.
Merging in master & branch-1.2. Thanks!
Adds a new operator that uses Spark's `ExternalSorter` class. It is off by default for now, but we might consider making it the default if benchmarks show that it does not regress performance.

Author: Michael Armbrust <michael@databricks.com>

Closes #3268 from marmbrus/externalSort and squashes the following commits:

48b9726 [Michael Armbrust] comments
b98799d [Michael Armbrust] Add test
afd7562 [Michael Armbrust] Add support for external sort.

(cherry picked from commit 64c6b9b)
Signed-off-by: Reynold Xin <rxin@databricks.com>
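
For readers unfamiliar with the term, here is a minimal sketch of the general external sort idea: sort bounded chunks in memory, spill each sorted run to disk, then merge the runs. It is plain Scala and is not the code from this PR or Spark's own sorter; the names `ExternalSortSketch`, `externalSort`, and `maxInMemory` are hypothetical and chosen only for illustration.

```scala
import java.io.{BufferedReader, BufferedWriter, File, FileReader, FileWriter}
import scala.collection.mutable

// Illustrative only: a toy external merge sort over Ints, not Spark's implementation.
object ExternalSortSketch {

  /** Sorts `input` while sorting at most `maxInMemory` records in memory at a time. */
  def externalSort(input: Iterator[Int], maxInMemory: Int): Iterator[Int] = {
    require(maxInMemory > 0, "maxInMemory must be positive")

    // Phase 1: sort each bounded chunk in memory and spill it to a temp file (a "run").
    val runs: Vector[File] = input.grouped(maxInMemory).map { chunk =>
      val file = File.createTempFile("sort-run-", ".txt")
      file.deleteOnExit()
      val out = new BufferedWriter(new FileWriter(file))
      try chunk.sorted.foreach { v => out.write(v.toString); out.newLine() }
      finally out.close()
      file
    }.toVector

    // Phase 2: k-way merge, keeping only one pending record per run in memory.
    val readers = runs.map(f => new BufferedReader(new FileReader(f)))
    // PriorityQueue is a max-heap, so reverse the ordering to dequeue the smallest value first.
    val heap = mutable.PriorityQueue.empty[(Int, Int)](
      Ordering.by[(Int, Int), Int](_._1).reverse)

    // Reads the next record from run `i`, or closes the reader once the run is exhausted.
    def refill(i: Int): Unit = {
      val line = readers(i).readLine()
      if (line == null) readers(i).close() else heap.enqueue((line.toInt, i))
    }
    readers.indices.foreach(refill)

    new Iterator[Int] {
      def hasNext: Boolean = heap.nonEmpty
      def next(): Int = {
        val (value, i) = heap.dequeue()
        refill(i) // replace the consumed record with the next one from the same run
        value
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val data = Seq(5, 3, 9, 1, 7, 2, 8)
    // prints: 1, 2, 3, 5, 7, 8, 9
    println(externalSort(data.iterator, maxInMemory = 3).mkString(", "))
  }
}
```

The extra disk writes and reads in this scheme are one plausible reason an operator like this might start out disabled until benchmarks show no regression against the on-heap sort.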