Repository: spark
Updated Branches:
  refs/heads/master 8dc3987d0 -> 0368ff30d
[SPARK-13973][PYSPARK] Make pyspark fail noisily if IPYTHON or IPYTHON_OPTS are set

## What changes were proposed in this pull request?

https://issues.apache.org/jira/browse/SPARK-13973

Following discussion with srowen, the IPYTHON and IPYTHON_OPTS variables are removed. If either is set in the user's environment, pyspark will not execute and will print an error message instead. Failing noisily forces users to remove these options and learn the new configuration scheme, which is more sustainable and less confusing. A before-and-after usage sketch follows the diff below.

## How was this patch tested?

Manual testing; set IPYTHON=1 and verified that the error message prints.

Author: pshearer <[email protected]>
Author: shearerp <[email protected]>

Closes #12528 from shearerp/master.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0368ff30
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0368ff30
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0368ff30

Branch: refs/heads/master
Commit: 0368ff30dd55dd2127d4cb196898c7bd437e9d28
Parents: 8dc3987
Author: pshearer <[email protected]>
Authored: Sat Apr 30 10:15:20 2016 +0100
Committer: Sean Owen <[email protected]>
Committed: Sat Apr 30 10:15:20 2016 +0100

----------------------------------------------------------------------
 bin/pyspark               | 32 ++++++++++++--------------------
 docs/programming-guide.md | 11 ++++++-----
 2 files changed, 18 insertions(+), 25 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/0368ff30/bin/pyspark
----------------------------------------------------------------------
diff --git a/bin/pyspark b/bin/pyspark
index a257499..d1fe75a 100755
--- a/bin/pyspark
+++ b/bin/pyspark
@@ -24,17 +24,11 @@ fi
 source "${SPARK_HOME}"/bin/load-spark-env.sh
 export _SPARK_CMD_USAGE="Usage: ./bin/pyspark [options]"
 
-# In Spark <= 1.1, setting IPYTHON=1 would cause the driver to be launched using the `ipython`
-# executable, while the worker would still be launched using PYSPARK_PYTHON.
-#
-# In Spark 1.2, we removed the documentation of the IPYTHON and IPYTHON_OPTS variables and added
-# PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS to allow IPython to be used for the driver.
-# Now, users can simply set PYSPARK_DRIVER_PYTHON=ipython to use IPython and set
-# PYSPARK_DRIVER_PYTHON_OPTS to pass options when starting the Python driver
+# In Spark 2.0, IPYTHON and IPYTHON_OPTS are removed and pyspark fails to launch if either option
+# is set in the user's environment. Instead, users should set PYSPARK_DRIVER_PYTHON=ipython
+# to use IPython and set PYSPARK_DRIVER_PYTHON_OPTS to pass options when starting the Python driver
 # (e.g. PYSPARK_DRIVER_PYTHON_OPTS='notebook'). This supports full customization of the IPython
 # and executor Python executables.
-#
-# For backwards-compatibility, we retain the old IPYTHON and IPYTHON_OPTS variables.
 
 # Determine the Python executable to use if PYSPARK_PYTHON or PYSPARK_DRIVER_PYTHON isn't set:
 if hash python2.7 2>/dev/null; then
@@ -44,17 +38,15 @@ else
   DEFAULT_PYTHON="python"
 fi
 
-# Determine the Python executable to use for the driver:
-if [[ -n "$IPYTHON_OPTS" || "$IPYTHON" == "1" ]]; then
-  # If IPython options are specified, assume user wants to run IPython
-  # (for backwards-compatibility)
-  PYSPARK_DRIVER_PYTHON_OPTS="$PYSPARK_DRIVER_PYTHON_OPTS $IPYTHON_OPTS"
-  if [ -x "$(command -v jupyter)" ]; then
-    PYSPARK_DRIVER_PYTHON="jupyter"
-  else
-    PYSPARK_DRIVER_PYTHON="ipython"
-  fi
-elif [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
+# Fail noisily if removed options are set
+if [[ -n "$IPYTHON" || -n "$IPYTHON_OPTS" ]]; then
+  echo "Error in pyspark startup:"
+  echo "IPYTHON and IPYTHON_OPTS are removed in Spark 2.0+. Remove these from the environment and set PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS instead."
+  exit 1
+fi
+
+# Default to standard python interpreter unless told otherwise
+if [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
   PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"$DEFAULT_PYTHON"}"
 fi


http://git-wip-us.apache.org/repos/asf/spark/blob/0368ff30/docs/programming-guide.md
----------------------------------------------------------------------
diff --git a/docs/programming-guide.md b/docs/programming-guide.md
index 601dd57..cf6f1d8 100644
--- a/docs/programming-guide.md
+++ b/docs/programming-guide.md
@@ -240,16 +240,17 @@ use IPython, set the `PYSPARK_DRIVER_PYTHON` variable to `ipython` when running
 $ PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
 {% endhighlight %}
 
-You can customize the `ipython` command by setting `PYSPARK_DRIVER_PYTHON_OPTS`. For example, to launch
-the [IPython Notebook](http://ipython.org/notebook.html) with PyLab plot support:
+To use the Jupyter notebook (previously known as the IPython notebook),
 
 {% highlight bash %}
-$ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark
+$ PYSPARK_DRIVER_PYTHON=jupyter ./bin/pyspark
 {% endhighlight %}
 
-After the IPython Notebook server is launched, you can create a new "Python 2" notebook from
+You can customize the `ipython` or `jupyter` commands by setting `PYSPARK_DRIVER_PYTHON_OPTS`.
+
+After the Jupyter Notebook server is launched, you can create a new "Python 2" notebook from
 the "Files" tab. Inside the notebook, you can input the command `%pylab inline` as part of
-your notebook before you start to try Spark from the IPython notebook.
+your notebook before you start to try Spark from the Jupyter notebook.
 
 </div>
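
For anyone updating launch scripts, the before-and-after usage implied by this patch is roughly the following (a minimal sketch, not part of the commit; it assumes `ipython` and `jupyter` are installed and on the PATH, and that the worker Python is still controlled by PYSPARK_PYTHON):

    # Old style (Spark <= 1.x) -- now aborts with the error message above:
    $ IPYTHON=1 IPYTHON_OPTS="notebook" ./bin/pyspark

    # New style (Spark 2.0+) -- choose the driver frontend explicitly:
    $ PYSPARK_DRIVER_PYTHON=ipython ./bin/pyspark
    $ PYSPARK_DRIVER_PYTHON=jupyter ./bin/pyspark

    # Driver options still pass through PYSPARK_DRIVER_PYTHON_OPTS,
    # e.g. to start IPython directly in notebook mode:
    $ PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark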
