Repository: spark
Updated Branches:
  refs/heads/branch-1.2 e74ce14e8 -> 7b5ba85df
[SPARK-4914][Build] Cleans lib_managed before compiling with Hive 0.13.1

This PR tries to fix the Hive test failures encountered in PR #3157 by cleaning `lib_managed` before building the assembly jar against Hive 0.13.1 in `dev/run-tests`. Otherwise two sets of datanucleus jars would be left in `lib_managed` and may mess up class paths while executing Hive test suites. Please refer to [this thread][1] for details. A clean build would be even safer, but we only clean `lib_managed` here to save build time.

This PR also takes the chance to clean up some minor typos and formatting issues in the comments.

[1]: https://github.com/apache/spark/pull/3157#issuecomment-67656488

Author: Cheng Lian <l...@databricks.com>

Closes #3756 from liancheng/clean-lib-managed and squashes the following commits:

e2bd21d [Cheng Lian] Adds lib_managed to clean set
c9f2f3e [Cheng Lian] Cleans lib_managed before compiling with Hive 0.13.1

(cherry picked from commit 395b771feed3fc90e5053defbe86dbd673fde582)
Signed-off-by: Josh Rosen <joshro...@databricks.com>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7b5ba85d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7b5ba85d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7b5ba85d

Branch: refs/heads/branch-1.2
Commit: 7b5ba85df342159f4b62f507bf1daa4dafe59059
Parents: e74ce14
Author: Cheng Lian <l...@databricks.com>
Authored: Tue Dec 23 12:54:20 2014 -0800
Committer: Josh Rosen <joshro...@databricks.com>
Committed: Tue Dec 23 12:54:33 2014 -0800

----------------------------------------------------------------------
 dev/run-tests | 26 ++++++++++++++------------
 pom.xml       |  3 +++
 2 files changed, 17 insertions(+), 12 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/7b5ba85d/dev/run-tests
----------------------------------------------------------------------
diff --git a/dev/run-tests b/dev/run-tests
index 328a73b..9192cb7 100755
--- a/dev/run-tests
+++ b/dev/run-tests
@@ -141,20 +141,22 @@ CURRENT_BLOCK=$BLOCK_BUILD
 
 {
   # NOTE: echo "q" is needed because sbt on encountering a build file with failure
-  #+ (either resolution or compilation) prompts the user for input either q, r, etc
-  #+ to quit or retry. This echo is there to make it not block.
+  # (either resolution or compilation) prompts the user for input either q, r, etc
+  # to quit or retry. This echo is there to make it not block.
   # NOTE: Do not quote $BUILD_MVN_PROFILE_ARGS or else it will be interpreted as a
-  #+ single argument!
+  # single argument!
   # QUESTION: Why doesn't 'yes "q"' work?
   # QUESTION: Why doesn't 'grep -v -e "^\[info\] Resolving"' work?
-  # First build with 0.12 to ensure patches do not break the hive 12 build
+  # First build with Hive 0.12.0 to ensure patches do not break the Hive 0.12.0 build
   HIVE_12_BUILD_ARGS="$SBT_MAVEN_PROFILES_ARGS -Phive -Phive-thriftserver -Phive-0.12.0"
-  echo "[info] Compile with hive 0.12"
+  echo "[info] Compile with Hive 0.12.0"
   echo -e "q\n" \
     | sbt/sbt $HIVE_12_BUILD_ARGS clean hive/compile hive-thriftserver/compile \
     | grep -v -e "info.*Resolving" -e "warn.*Merging" -e "info.*Including"
 
-  # Then build with default version(0.13.1) because tests are based on this version
+  # Then build with default Hive version (0.13.1) because tests are based on this version
+  echo "[info] Compile with Hive 0.13.1"
+  rm -rf lib_managed
   echo "[info] Building Spark with these arguments: $SBT_MAVEN_PROFILES_ARGS"\
     " -Phive -Phive-thriftserver"
   echo -e "q\n" \
@@ -178,7 +180,7 @@ CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS
 
   if [ -n "$_SQL_TESTS_ONLY" ]; then
     # This must be an array of individual arguments. Otherwise, having one long string
-    #+ will be interpreted as a single test, which doesn't work.
+    # will be interpreted as a single test, which doesn't work.
     SBT_MAVEN_TEST_ARGS=("catalyst/test" "sql/test" "hive/test" "mllib/test")
   else
     SBT_MAVEN_TEST_ARGS=("test")
@@ -187,11 +189,11 @@ CURRENT_BLOCK=$BLOCK_SPARK_UNIT_TESTS
   echo "[info] Running Spark tests with these arguments: $SBT_MAVEN_PROFILES_ARGS ${SBT_MAVEN_TEST_ARGS[@]}"
 
   # NOTE: echo "q" is needed because sbt on encountering a build file with failure
-  #+ (either resolution or compilation) prompts the user for input either q, r, etc
-  #+ to quit or retry. This echo is there to make it not block.
+  # (either resolution or compilation) prompts the user for input either q, r, etc
+  # to quit or retry. This echo is there to make it not block.
   # NOTE: Do not quote $SBT_MAVEN_PROFILES_ARGS or else it will be interpreted as a
-  #+ single argument!
-  #+ "${SBT_MAVEN_TEST_ARGS[@]}" is cool because it's an array.
+  # single argument!
+  # "${SBT_MAVEN_TEST_ARGS[@]}" is cool because it's an array.
   # QUESTION: Why doesn't 'yes "q"' work?
   # QUESTION: Why doesn't 'grep -v -e "^\[info\] Resolving"' work?
   echo -e "q\n" \
@@ -210,7 +212,7 @@ CURRENT_BLOCK=$BLOCK_PYSPARK_UNIT_TESTS
 
 echo ""
 echo "========================================================================="
-echo "Detecting binary incompatibilites with MiMa"
+echo "Detecting binary incompatibilities with MiMa"
 echo "========================================================================="
 
 CURRENT_BLOCK=$BLOCK_MIMA

http://git-wip-us.apache.org/repos/asf/spark/blob/7b5ba85d/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index a8b7424..a9e7c9d 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1034,6 +1034,9 @@
             <fileset>
               <directory>checkpoint</directory>
             </fileset>
+            <fileset>
+              <directory>lib_managed</directory>
+            </fileset>
           </filesets>
         </configuration>
       </plugin>
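
The commit message above says that stale jars in `lib_managed` can leave two sets of datanucleus jars on the class path. As a rough illustration only, the sketch below lists artifact names that occur more than once under a directory after the trailing version is stripped. The function name `find_duplicate_artifacts` and the version-stripping pattern are assumptions made for this example. Neither is part of `dev/run-tests`, and the pattern only handles simple `name-1.2.3.jar` style file names.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of dev/run-tests: print every artifact name
# that maps to more than one jar file under the given directory.
find_duplicate_artifacts() {
  local dir="${1:-lib_managed}"

  # Nothing to report if the directory is absent (e.g. right after rm -rf).
  [ -d "$dir" ] || return 0

  # 1. Collect jar file names without the directory or the .jar suffix.
  # 2. Strip a trailing "-<version>" (assumed to start with a digit).
  # 3. Print each artifact name that occurs more than once.
  find "$dir" -type f -name '*.jar' -exec basename {} .jar \; \
    | sed -E 's/-[0-9][0-9A-Za-z.]*$//' \
    | sort \
    | uniq -d
}
```

Run from the Spark source root after a build, for example `find_duplicate_artifacts lib_managed`. Any name it prints has at least two jar files under that directory that reduce to the same artifact name. The commit itself avoids the problem more simply by running `rm -rf lib_managed` before the Hive 0.13.1 build.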