[SPARK-7017][Build][Project Infra]: Refactor dev/run-tests into Python #5694

Closed · wants to merge 57 commits

Commits (all 57):
6126c4f
refactored run-tests into python
Apr 24, 2015
3c53a1a
uncomment the scala tests :)
Apr 24, 2015
639f1e9
updated with pep8 rules, fixed minor bugs, added run-tests file in ba…
Apr 26, 2015
2cb413b
upcased global variables, changes various calling methods from check_…
Apr 27, 2015
ec03bf3
added namedtuple for java version to add readability
Apr 27, 2015
07210a9
minor doc string change for java version with namedtuple update
Apr 27, 2015
26e18e8
removed unnecessary wait()
Apr 27, 2015
c095fa6
removed another wait() call
Apr 27, 2015
83e80ef
attempt at better python output when called from bash
Apr 27, 2015
b0b2604
comment out import to see if build fails and returns properly
Apr 28, 2015
803143a
removed license file for SparkContext
Apr 28, 2015
a5bd445
reverted license, changed test in shuffle to fail
Apr 28, 2015
7613558
updated to return the proper env variable for return codes
Apr 29, 2015
b37328c
fixed typo and added default return is no error block was found in th…
Apr 29, 2015
56d3cb9
changed test back and commented out import to break compile
Apr 29, 2015
e4a96cc
removed the import error and added license error, fixed the way run-t…
Apr 29, 2015
76335fb
reverted rat license issue for sparkconf
Apr 29, 2015
2386785
Merge remote-tracking branch 'upstream/master' into SPARK-7017
Apr 29, 2015
983f2a2
comment out import to fail build test
Apr 29, 2015
f041d8a
added space from commented import to now test build breaking
Apr 29, 2015
d825aa4
revert build break, add mima break
Apr 29, 2015
9a592ec
reverted mima exclude issue, added pyspark test failure
Apr 30, 2015
1dada6b
reverted pyspark test failure
Apr 30, 2015
afeb093
updated to make sparkR test fail
Apr 30, 2015
b1ca593
reverted the sparkR test
May 1, 2015
703f095
fixed merge conflicts
May 11, 2015
f950010
removed building hive-0.12.0 per SPARK-6908
May 11, 2015
6d0a052
incorporated merge conflicts with SPARK-7249
May 19, 2015
f9deba1
python to python2 and removed newline
May 19, 2015
b1248dc
exec python rather than running python and exiting with return code
May 21, 2015
d90ab2d
fixed merge conflicts, ensured that for regular builds both core and …
May 26, 2015
0629de8
updated to refactor and remove various small bugs, removed pep8 compl…
Jun 5, 2015
8afbe93
made error codes a global
Jun 5, 2015
1f607b1
finalizing revisions to modular tests
Jun 9, 2015
2fcdfc0
testing target branch dump on jenkins
Jun 10, 2015
1ecca26
fixed merge conflicts
Jun 10, 2015
db7ae6f
reverted SPARK_HOME from start of command
Jun 10, 2015
eb684b6
fixed sbt_test_goals reference error
Jun 10, 2015
2898717
added a change to streaming test to check if it only runs streaming t…
Jun 10, 2015
7d2f5e2
updated python tests to remove unused variable
Jun 10, 2015
60b3d51
prepend rather than append onto PATH
Jun 10, 2015
705d12e
changed example to comply with pep3113 supporting python3
Jun 10, 2015
03fdd7b
fixed the tuple () wraps around example lambda
Jun 10, 2015
b7c72b9
reverting streaming context
Jun 10, 2015
ec1ae78
minor name changes, bug fixes
Jun 14, 2015
aa03d9e
added documentation builds as a top level test component, altered hig…
Jun 15, 2015
0379833
minor doc addition to print the changed modules
Jun 15, 2015
fb85a41
fixed minor set bug
Jun 15, 2015
c42cf9a
unpack set operations with splat (*)
Jun 15, 2015
767a668
fixed path joining issues, ensured docs actually build on doc changes
Jun 16, 2015
2dff136
fixed pep8 whitespace errors
Jun 16, 2015
22edb78
add check if jekyll isn't installed on the path
Jun 16, 2015
05d435b
added check for jekyll install
Jun 16, 2015
8135518
removed the test check for documentation changes until jenkins can ge…
Jun 16, 2015
f9fbe54
reverted doc test change
Jun 16, 2015
3922a85
removed necessary passed in variable
Jun 16, 2015
154ed73
updated finding java binary if JAVA_HOME not set
Jun 16, 2015
219 changes: 1 addition & 218 deletions dev/run-tests
@@ -17,224 +17,7 @@
# limitations under the License.
#

# Go to the Spark project root directory
FWDIR="$(cd "`dirname $0`"/..; pwd)"
cd "$FWDIR"

# Clean up work directory and caches
rm -rf ./work
rm -rf ~/.ivy2/local/org.apache.spark
rm -rf ~/.ivy2/cache/org.apache.spark

source "$FWDIR/dev/run-tests-codes.sh"

CURRENT_BLOCK=$BLOCK_GENERAL

function handle_error () {
  echo "[error] Got a return code of $? on line $1 of the run-tests script."
  exit $CURRENT_BLOCK
}


# Build against the right version of Hadoop.
{
  if [ -n "$AMPLAB_JENKINS_BUILD_PROFILE" ]; then
    if [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop1.0" ]; then
      export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=1.2.1"
    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.0" ]; then
      export SBT_MAVEN_PROFILES_ARGS="-Phadoop-1 -Dhadoop.version=2.0.0-mr1-cdh4.1.1"
    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.2" ]; then
      export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.2"
    elif [ "$AMPLAB_JENKINS_BUILD_PROFILE" = "hadoop2.3" ]; then
      export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0"
    fi
  fi

  if [ -z "$SBT_MAVEN_PROFILES_ARGS" ]; then
    export SBT_MAVEN_PROFILES_ARGS="-Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0"
  fi
}

export SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Pkinesis-asl"

# Determine Java path and version.
{
  if test -x "$JAVA_HOME/bin/java"; then
    declare java_cmd="$JAVA_HOME/bin/java"
  else
    declare java_cmd=java
  fi

  # We can't use sed -r -e due to OS X / BSD compatibility; hence, all the parentheses.
  JAVA_VERSION=$(
    $java_cmd -version 2>&1 \
      | grep -e "^java version" --max-count=1 \
      | sed "s/java version \"\(.*\)\.\(.*\)\.\(.*\)\"/\1\2/"
  )

  if [ "$JAVA_VERSION" -lt 18 ]; then
    echo "[warn] Java 8 tests will not run because JDK version is < 1.8."
  fi
}

# Only run Hive tests if there are SQL changes.
# Partial solution for SPARK-1455.
if [ -n "$AMPLAB_JENKINS" ]; then
  target_branch="$ghprbTargetBranch"
  git fetch origin "$target_branch":"$target_branch"

  # AMP_JENKINS_PRB indicates if the current build is a pull request build.
  if [ -n "$AMP_JENKINS_PRB" ]; then
    # It is a pull request build.
    sql_diffs=$(
      git diff --name-only "$target_branch" \
        | grep -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
    )

    non_sql_diffs=$(
      git diff --name-only "$target_branch" \
        | grep -v -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
    )

    if [ -n "$sql_diffs" ]; then
      echo "[info] Detected changes in SQL. Will run Hive test suite."
      _RUN_SQL_TESTS=true

      if [ -z "$non_sql_diffs" ]; then
        echo "[info] Detected no changes except in SQL. Will only run SQL tests."
        _SQL_TESTS_ONLY=true
      fi
    fi
  else
    # It is a regular build. We should run SQL tests.
    _RUN_SQL_TESTS=true
  fi
fi

set -o pipefail
trap 'handle_error $LINENO' ERR

echo ""
echo "========================================================================="
echo "Running Apache RAT checks"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_RAT

./dev/check-license

echo ""
echo "========================================================================="
echo "Running Scala style checks"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_SCALA_STYLE

./dev/lint-scala

echo ""
echo "========================================================================="
echo "Running Python style checks"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_PYTHON_STYLE

./dev/lint-python

echo ""
echo "========================================================================="
echo "Building Spark"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_BUILD

{
  HIVE_BUILD_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver"
  echo "[info] Compile with Hive 0.13.1"
  [ -d "lib_managed" ] && rm -rf lib_managed
  echo "[info] Building Spark with these arguments: $HIVE_BUILD_ARGS"

  if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then
    build/mvn $HIVE_BUILD_ARGS clean package -DskipTests
  else
    echo -e "q\n" \
      | build/sbt $HIVE_BUILD_ARGS package assembly/assembly streaming-kafka-assembly/assembly \
      | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including"
  fi
}

echo ""
echo "========================================================================="
echo "Detecting binary incompatibilities with MiMa"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_MIMA

./dev/mima

echo ""
echo "========================================================================="
echo "Running Spark unit tests"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS

{
  # If the Spark SQL tests are enabled, run the tests with the Hive profiles enabled.
  # This must be a single argument, as it is.
  if [ -n "$_RUN_SQL_TESTS" ]; then
    SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver"
  fi

  if [ -n "$_SQL_TESTS_ONLY" ]; then
    # This must be an array of individual arguments. Otherwise, having one long string
    # will be interpreted as a single test, which doesn't work.
    SBT_MAVEN_TEST_ARGS=("catalyst/test" "sql/test" "hive/test" "hive-thriftserver/test" "mllib/test")
  else
    SBT_MAVEN_TEST_ARGS=("test")
  fi

  echo "[info] Running Spark tests with these arguments: $SBT_MAVEN_PROFILES_ARGS ${SBT_MAVEN_TEST_ARGS[@]}"

  if [ "${AMPLAB_JENKINS_BUILD_TOOL}" == "maven" ]; then
    build/mvn test $SBT_MAVEN_PROFILES_ARGS --fail-at-end
  else
    # NOTE: echo "q" is needed because sbt on encountering a build file with failure
    # (either resolution or compilation) prompts the user for input either q, r, etc
    # to quit or retry. This echo is there to make it not block.
    # NOTE: Do not quote $SBT_MAVEN_PROFILES_ARGS or else it will be interpreted as a
    # single argument!
    # "${SBT_MAVEN_TEST_ARGS[@]}" is cool because it's an array.
    # QUESTION: Why doesn't 'yes "q"' work?
    # QUESTION: Why doesn't 'grep -v -e "^\[info\] Resolving"' work?
    echo -e "q\n" \
      | build/sbt $SBT_MAVEN_PROFILES_ARGS "${SBT_MAVEN_TEST_ARGS[@]}" \
      | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including"
  fi
}

echo ""
echo "========================================================================="
echo "Running PySpark tests"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_PYSPARK_UNIT_TESTS

# add path for python 3 in jenkins
export PATH="${PATH}:/home/anaconda/envs/py3k/bin"
./python/run-tests

echo ""
echo "========================================================================="
echo "Running SparkR tests"
echo "========================================================================="

CURRENT_BLOCK=$BLOCK_SPARKR_UNIT_TESTS

if [ $(command -v R) ]; then
  ./R/install-dev.sh
  ./R/run-tests.sh
else
  echo "Ignoring SparkR tests as R was not found in PATH"
fi

exec python -u ./dev/run-tests.py
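
The 218 removed lines above collapse into a single exec of the new Python entry point, run with -u so output stays unbuffered when Jenkins tails the log. As a rough illustration of the pattern such a driver can use to preserve the old behaviour (run each phase and, on failure, exit with that phase's block code, feeding sbt a "q" on stdin so it never blocks on a prompt), here is a minimal sketch; the function and constant names are assumptions for this example, not the actual contents of dev/run-tests.py in this PR.

#!/usr/bin/env python2
# Illustrative sketch only; names here are assumptions, not the real dev/run-tests.py.
from __future__ import print_function

import os
import subprocess
import sys

# Mirror a few of the exit codes from dev/run-tests-codes.sh so Jenkins can
# still tell which phase failed.
BLOCK_GENERAL = 10
BLOCK_RAT = 11
BLOCK_SCALA_STYLE = 12
BLOCK_PYTHON_STYLE = 13

SPARK_HOME = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))


def run_cmd(cmd, block):
    """Run one test phase; on failure, exit with that phase's block code."""
    print("[info] Running: %s" % " ".join(cmd))
    retcode = subprocess.call(cmd, cwd=SPARK_HOME)
    if retcode != 0:
        sys.exit(block)


def exec_sbt(sbt_args, block):
    """Run build/sbt with 'q' piped to stdin so a broken build file does not
    hang waiting for interactive input (the same trick the bash script used
    with echo -e)."""
    proc = subprocess.Popen(["build/sbt"] + sbt_args, cwd=SPARK_HOME,
                            stdin=subprocess.PIPE)
    proc.communicate(b"q\n")
    if proc.returncode != 0:
        sys.exit(block)


if __name__ == "__main__":
    run_cmd(["./dev/check-license"], BLOCK_RAT)
    run_cmd(["./dev/lint-scala"], BLOCK_SCALA_STYLE)
    run_cmd(["./dev/lint-python"], BLOCK_PYTHON_STYLE)
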
11 changes: 6 additions & 5 deletions dev/run-tests-codes.sh
@@ -21,8 +21,9 @@
readonly BLOCK_GENERAL=10
readonly BLOCK_RAT=11
readonly BLOCK_SCALA_STYLE=12
readonly BLOCK_PYTHON_STYLE=13
readonly BLOCK_BUILD=14
readonly BLOCK_MIMA=15
readonly BLOCK_SPARK_UNIT_TESTS=16
readonly BLOCK_PYSPARK_UNIT_TESTS=17
readonly BLOCK_SPARKR_UNIT_TESTS=18
readonly BLOCK_DOCUMENTATION=14
readonly BLOCK_BUILD=15
readonly BLOCK_MIMA=16
readonly BLOCK_SPARK_UNIT_TESTS=17
readonly BLOCK_PYSPARK_UNIT_TESTS=18
readonly BLOCK_SPARKR_UNIT_TESTS=19
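
The hunk above inserts a new BLOCK_DOCUMENTATION code at 14 and shifts the build, MiMa, Spark, PySpark, and SparkR codes up by one: the first group of readonly lines (values 14 through 18) is removed and the second group (values 14 through 19) replaces it. If the Python driver reports failures through the same convention, it needs a matching set of constants; a minimal sketch, assuming plain module-level constants rather than whatever structure dev/run-tests.py actually uses:

# Hypothetical Python mirror of dev/run-tests-codes.sh; the real driver may
# define or import these differently.
BLOCK_GENERAL = 10
BLOCK_RAT = 11
BLOCK_SCALA_STYLE = 12
BLOCK_PYTHON_STYLE = 13
BLOCK_DOCUMENTATION = 14
BLOCK_BUILD = 15
BLOCK_MIMA = 16
BLOCK_SPARK_UNIT_TESTS = 17
BLOCK_PYSPARK_UNIT_TESTS = 18
BLOCK_SPARKR_UNIT_TESTS = 19
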
2 changes: 2 additions & 0 deletions dev/run-tests-jenkins
@@ -210,6 +210,8 @@
done
  failing_test="Scala style tests"
elif [ "$test_result" -eq "$BLOCK_PYTHON_STYLE" ]; then
  failing_test="Python style tests"
elif [ "$test_result" -eq "$BLOCK_DOCUMENTATION" ]; then
  failing_test="to generate documentation"
elif [ "$test_result" -eq "$BLOCK_BUILD" ]; then
  failing_test="to build"
elif [ "$test_result" -eq "$BLOCK_MIMA" ]; then
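
dev/run-tests-jenkins translates the exit code back into a human-readable description of which phase failed; the two added lines handle the new BLOCK_DOCUMENTATION code. The PR keeps this logic in bash, but the same mapping reads naturally as a dictionary in Python. A hedged sketch covering only the branches visible in this hunk, with an assumed fallback string:

# Dict-based equivalent of the elif chain shown above; only the branches
# visible in this hunk are included, and the fallback message is an assumption.
FAILING_TEST_MESSAGES = {
    12: "Scala style tests",          # BLOCK_SCALA_STYLE
    13: "Python style tests",         # BLOCK_PYTHON_STYLE
    14: "to generate documentation",  # BLOCK_DOCUMENTATION
    15: "to build",                   # BLOCK_BUILD
}


def failing_test_message(test_result):
    # Branches not shown in this hunk (MiMa, unit tests, ...) are omitted;
    # unknown codes fall back to a generic description.
    return FAILING_TEST_MESSAGES.get(test_result, "some tests")
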