Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-18639] Build only a single pip package #16072

Closed
wants to merge 1 commit into from

Conversation

rxin
Copy link
Contributor

@rxin rxin commented Nov 30, 2016

What changes were proposed in this pull request?

We current build 5 separate pip binary tar balls, doubling the release script runtime. It'd be better to build one, especially for use cases that are just using Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

How was this patch tested?

N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Copy link
Contributor

@JoshRosen JoshRosen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine to me, especially since the non-whitespace-change diff is really tiny: https://github.com/apache/spark/pull/16072/files?w=0

@@ -150,6 +150,7 @@ if [[ "$1" == "package" ]]; then
NAME=$1
FLAGS=$2
ZINC_PORT=$3
BUILD_PIP_PACKAGE=$4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This style of positional boolean flag won't really work if you ever need to have more than one of these arguments. That said, this is fine for now and if we're really going to add any more complexity here then we should probably just rewrite this whole script in Python.

@SparkQA
Copy link

SparkQA commented Nov 30, 2016

Test build #69371 has finished for PR 16072 at commit 88b53c3.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Copy link
Member

Retest this please

@SparkQA
Copy link

SparkQA commented Nov 30, 2016

Test build #69384 has started for PR 16072 at commit 88b53c3.

@SparkQA
Copy link

SparkQA commented Nov 30, 2016

Test build #3445 has finished for PR 16072 at commit 88b53c3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@rxin
Copy link
Contributor Author

rxin commented Dec 2, 2016

Merging in master/branch-2.1.

asfgit pushed a commit that referenced this pull request Dec 2, 2016
## What changes were proposed in this pull request?
We current build 5 separate pip binary tar balls, doubling the release script runtime. It'd be better to build one, especially for use cases that are just using Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

## How was this patch tested?
N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Author: Reynold Xin <rxin@databricks.com>

Closes #16072 from rxin/SPARK-18639.

(cherry picked from commit 37e52f8)
Signed-off-by: Reynold Xin <rxin@databricks.com>
@asfgit asfgit closed this in 37e52f8 Dec 2, 2016
robert3005 pushed a commit to palantir/spark that referenced this pull request Dec 2, 2016
## What changes were proposed in this pull request?
We current build 5 separate pip binary tar balls, doubling the release script runtime. It'd be better to build one, especially for use cases that are just using Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

## How was this patch tested?
N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Author: Reynold Xin <rxin@databricks.com>

Closes apache#16072 from rxin/SPARK-18639.
uzadude pushed a commit to uzadude/spark that referenced this pull request Jan 27, 2017
## What changes were proposed in this pull request?
We current build 5 separate pip binary tar balls, doubling the release script runtime. It'd be better to build one, especially for use cases that are just using Spark locally. In the long run, it would make more sense to have Hadoop support be pluggable.

## How was this patch tested?
N/A - this is a release build script that doesn't have any automated test coverage. We will know if it goes wrong when we prepare releases.

Author: Reynold Xin <rxin@databricks.com>

Closes apache#16072 from rxin/SPARK-18639.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants