This is an automated email from the ASF dual-hosted git repository.
dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 22932385df3 [SPARK-46020][INFRA] Add `Python 3.12` to Infra docker image
22932385df3 is described below
commit 22932385df3f5df216dcc510b4cdecad343f7642
Author: Dongjoon Hyun <[email protected]>
AuthorDate: Mon Nov 20 21:41:35 2023 -0800
[SPARK-46020][INFRA] Add `Python 3.12` to Infra docker image
### What changes were proposed in this pull request?
This PR aims to add `Python 3.12` to Infra docker images.
Note that `Python 3.12` has a breaking change in the installation.
- The `distutils` module is removed in Python 3.12 via
[PEP-632](https://peps.python.org/pep-0632) in favor of replacements such as the `packaging` package.
- Apache Spark 4.0.0 is ready for Python 3.12 because SPARK-45390 removed its
`distutils` usages.
- https://github.com/apache/spark/pull/43192
- However, some third-party packages are not ready for Python 3.12 yet, so this
PR skips them.
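As a rough illustration of the `distutils` point above, the following minimal sketch reports whether a given Python version still bundles `distutils` in its standard library. The helper name `stdlib_ships_distutils` is hypothetical and is not part of Spark or this PR. It only looks at the version number, so it does not detect a third-party shim (for example the one bundled with `setuptools`) that may still make `import distutils` succeed.

```python
import sys


def stdlib_ships_distutils(version_info=sys.version_info) -> bool:
    """Return True if this Python version still bundles `distutils` in the stdlib.

    PEP 632 deprecated `distutils` in Python 3.10 and removed it in 3.12.
    Hypothetical helper for illustration only.
    """
    # Compare only (major, minor); patch level does not matter here.
    return tuple(version_info[:2]) < (3, 12)


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print(f"Python {major}.{minor}: stdlib bundles distutils = {stdlib_ships_distutils()}")
```

A package that imports `distutils` unconditionally would be one candidate for the "not ready for Python 3.12" group mentioned above, although this sketch does not inspect any package.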
### Why are the changes needed?
This PR is a preparation to add a daily `Python 3.12` GitHub Action job
later for Apache Spark 4.0.0.
As of today, Apache Spark 4.0.0 has test coverage for Python 3.8 through Python 3.11.
- Python 3.9 (Main)
- https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml
- PyPy3.8, Python 3.10, Python 3.11 (Daily)
- https://github.com/apache/spark/actions/workflows/build_python.yml
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
```
$ docker run -it --rm ghcr.io/dongjoon-hyun/apache-spark-ci-image:master-6939290578 python3.12 --version
Python 3.12.0
$ docker run -it --rm ghcr.io/dongjoon-hyun/apache-spark-ci-image:master-6939290578 python3.12 -m pip freeze
alembic==1.12.1
blinker==1.7.0
certifi==2019.11.28
chardet==3.0.4
charset-normalizer==3.3.2
click==8.1.7
cloudpickle==2.2.1
contourpy==1.2.0
coverage==7.3.2
cycler==0.12.1
databricks-cli==0.18.0
dbus-python==1.2.16
distro-info==0.23+ubuntu1.1
docker==6.1.3
entrypoints==0.4
et-xmlfile==1.1.0
Flask==3.0.0
fonttools==4.45.0
gitdb==4.0.11
GitPython==3.1.40
googleapis-common-protos==1.56.4
greenlet==3.0.1
gunicorn==21.2.0
idna==2.8
importlib-metadata==6.8.0
itsdangerous==2.1.2
Jinja2==3.1.2
joblib==1.3.2
kiwisolver==1.4.5
lxml==4.9.3
Mako==1.3.0
Markdown==3.5.1
MarkupSafe==2.1.3
matplotlib==3.8.2
mlflow==2.8.1
numpy==1.26.2
oauthlib==3.2.2
openpyxl==3.1.2
packaging==23.2
pandas==2.1.3
Pillow==10.1.0
plotly==5.18.0
protobuf==4.25.1
pyarrow==14.0.1
PyGObject==3.36.0
PyJWT==2.8.0
pyparsing==3.1.1
python-apt==2.0.1+ubuntu0.20.4.1
python-dateutil==2.8.2
pytz==2023.3.post1
PyYAML==6.0.1
querystring-parser==1.2.4
requests==2.31.0
requests-unixsocket==0.2.0
scikit-learn==1.3.2
scipy==1.11.4
setuptools==45.2.0
six==1.14.0
smmap==5.0.1
SQLAlchemy==2.0.23
sqlparse==0.4.4
tabulate==0.9.0
tenacity==8.2.3
threadpoolctl==3.2.0
typing_extensions==4.8.0
tzdata==2023.3
unattended-upgrades==0.1
unittest-xml-reporting==3.2.0
urllib3==2.1.0
websocket-client==1.6.4
Werkzeug==3.0.1
wheel==0.34.2
zipp==3.17.0
```
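The `pip freeze` output above can also be checked mechanically against the version constraints used in the new `pip install` lines. The following is a minimal sketch under stated assumptions: the helper names (`version_key`, `parse_freeze`, `check_constraints`) are hypothetical, only a handful of lines from the output are included, and versions are compared as plain dotted integers rather than with full PEP 440 rules.

```python
import operator

# A few lines copied from the `pip freeze` output above.
FREEZE_OUTPUT = """\
googleapis-common-protos==1.56.4
mlflow==2.8.1
pandas==2.1.3
protobuf==4.25.1
pyarrow==14.0.1
scikit-learn==1.3.2
"""

# Constraints taken from the `pip install` lines added to dev/infra/Dockerfile.
CONSTRAINTS = {
    "pyarrow": (">=", "14.0.0"),
    "pandas": ("<=", "2.1.3"),
    "mlflow": (">=", "2.8.1"),
    "scikit-learn": (">=", "1.3.2"),
    "protobuf": ("==", "4.25.1"),
    "googleapis-common-protos": ("==", "1.56.4"),
}

OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def version_key(version: str) -> tuple:
    """Turn '14.0.1' into (14, 0, 1); handles plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def parse_freeze(text: str) -> dict:
    """Map each lower-cased package name to its pinned version string."""
    pins = {}
    for line in text.splitlines():
        name, sep, version = line.strip().partition("==")
        if sep:  # skip lines that are not `name==version` pins
            pins[name.lower()] = version
    return pins


def check_constraints(pins: dict, constraints: dict) -> dict:
    """Return {package: bool}; a missing package counts as a failure."""
    results = {}
    for name, (op, wanted) in constraints.items():
        installed = pins.get(name)
        results[name] = installed is not None and OPS[op](
            version_key(installed), version_key(wanted)
        )
    return results


if __name__ == "__main__":
    report = check_constraints(parse_freeze(FREEZE_OUTPUT), CONSTRAINTS)
    for name, ok in report.items():
        print(f"{name}: {'ok' if ok else 'FAILED'}")
```

The tuple comparison is a deliberate simplification: it would fail on versions such as `2023.3.post1` and would treat `14.0` as lower than `14.0.0`, so a real check would likely use a proper version parser instead.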
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #43922 from dongjoon-hyun/SPARK-46020.
Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
---
dev/infra/Dockerfile | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/dev/infra/Dockerfile b/dev/infra/Dockerfile
index 141c079f393..5cf492ad863 100644
--- a/dev/infra/Dockerfile
+++ b/dev/infra/Dockerfile
@@ -127,3 +127,12 @@ RUN python3.11 -m pip install 'grpcio>=1.48,<1.57' 'grpcio-status>=1.48,<1.57' '
RUN python3.11 -m pip install 'torch<=2.0.1' torchvision --index-url https://download.pytorch.org/whl/cpu
RUN python3.11 -m pip install torcheval
RUN python3.11 -m pip install deepspeed
+
+# Install Python 3.12 at the last stage to avoid breaking the existing Python installations
+RUN add-apt-repository ppa:deadsnakes/ppa
+RUN apt-get update && apt-get install -y \
+ python3.12 python3.12-distutils \
+ && rm -rf /var/lib/apt/lists/*
+RUN curl -sS https://bootstrap.pypa.io/get-pip.py | python3.12
+RUN python3.12 -m pip install numpy 'pyarrow>=14.0.0' 'pandas<=2.1.3' scipy unittest-xml-reporting plotly>=4.8 'mlflow>=2.8.1' coverage matplotlib openpyxl 'scikit-learn>=1.3.2'
+RUN python3.12 -m pip install 'protobuf==4.25.1' 'googleapis-common-protos==1.56.4'