This is an automated email from the ASF dual-hosted git repository.

chengpan pushed a commit to branch branch-0.12
in repository https://gitbox.apache.org/repos/asf/zeppelin.git

The following commit(s) were added to refs/heads/branch-0.12 by this push:
     new 309f174feb [ZEPPELIN-6199] Fix tutorial document file download not working
309f174feb is described below

commit 309f174febf8b442866431280f093c88a1e8753b
Author: Ruin09 <meme...@naver.com>
AuthorDate: Mon Jul 14 03:16:19 2025 +0900

    [ZEPPELIN-6199] Fix tutorial document file download not working

    ### What is this PR for?
    Currently, download link of `bank.zip` file in [tutorial page](https://zeppelin.apache.org/docs/0.12.0/quickstart/tutorial.html) is broken.

    The root cause is a change in the URL for the UCI Machine Learning dataset. The previous link, http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip, is no longer valid. The new dataset for the same ID (222) is now located at https://archive.ics.uci.edu/dataset/222/bank+marketing.

    Additionally, `bank.zip` is no longer offered as a standalone file. It is now nested inside a `bank+marketing.zip` archive.

    This PR updates the tutorial to:
    * Replace the broken link with the new, correct URL for the `bank+marketing.zip` file.
    * Add a clear instruction for users to first unzip the main `bank+marketing.zip` archive to find and use the required `bank.zip` file within it.

    ### What type of PR is it?
    Bug Fix

    ### Todos

    ### What is the Jira issue?
    [ZEPPELIN-6199]

    ### How should this be tested?
    * Run the fixed document locally with docker.
    * Download the file in this fixed page, and run related tutorials with this file in zeppelin notebook.

    ### Screenshots (if appropriate)
    * Run tutorial with new `bank.zip` file.
    <img width="2491" height="1156" alt="Tutorial Test Result" src="https://github.com/user-attachments/assets/1018f2ae-8cd1-475c-9bdd-015b8cd4b362" />

    * Add a new instruction to use the data file.
    <img width="910" height="637" alt="Fixed Tutorial Page" src="https://github.com/user-attachments/assets/40703945-6cde-48b4-8060-c00d83278348" />

    ### Questions:
    * Does the license files need to update? No
    * Is there breaking changes for older versions? No
    * Does this needs documentation? No

    Closes #4966 from shmruin/ZEPPELIN-6199.

    Signed-off-by: Cheng Pan <cheng...@apache.org>
    (cherry picked from commit d8b2f0c9d7ad5ac2e88de33e2d1963da56d47d47)
    Signed-off-by: Cheng Pan <cheng...@apache.org>
---
 docs/quickstart/tutorial.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/quickstart/tutorial.md b/docs/quickstart/tutorial.md
index 218b9aace1..ce19780890 100644
--- a/docs/quickstart/tutorial.md
+++ b/docs/quickstart/tutorial.md
@@ -31,7 +31,9 @@ Current main backend processing engine of Zeppelin is [Apache Spark](https://spa
 
 ### Data Refine
 
-Before you start Zeppelin tutorial, you will need to download [bank.zip](http://archive.ics.uci.edu/ml/machine-learning-databases/00222/bank.zip).
+Before you start Zeppelin tutorial, you will need to download [bank+marketing.zip](https://archive.ics.uci.edu/static/public/222/bank+marketing.zip).
+
+Unzip `bank+marketing.zip` and then use `bank.zip` file found inside.
 
 First, to transform csv format data into RDD of `Bank` objects, run following script. This will also remove header using `filter` function.
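
The download-and-unzip step that this change adds to the tutorial could also be scripted. The following is a minimal sketch in Python, not part of the commit or of Zeppelin: the helper names `download` and `extract_inner_archive` are hypothetical, and the sketch assumes only what the commit states, namely that `bank.zip` is a member of `bank+marketing.zip`. The URL is the one introduced in the diff above.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL taken from the updated tutorial line in this commit.
DATASET_URL = "https://archive.ics.uci.edu/static/public/222/bank+marketing.zip"


def download(url: str, dest: Path) -> Path:
    """Fetch `url` and save it to `dest` (hypothetical helper)."""
    dest.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(dest))
    return dest


def extract_inner_archive(outer_zip: Path,
                          inner_name: str = "bank.zip",
                          dest_dir: Path = Path(".")) -> Path:
    """Copy the nested archive `inner_name` out of `outer_zip` into `dest_dir`.

    Hypothetical helper: assumes the outer archive contains a member whose
    base name equals `inner_name`, as described in the commit message.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(outer_zip) as outer:
        # Match on the base name in case the member sits in a subdirectory.
        matches = [n for n in outer.namelist() if Path(n).name == inner_name]
        if not matches:
            raise FileNotFoundError(f"{inner_name} not found in {outer_zip}")
        target = dest_dir / inner_name
        target.write_bytes(outer.read(matches[0]))
    return target
```

A typical use would be to call `download(DATASET_URL, Path("data/bank+marketing.zip"))` and then `extract_inner_archive(Path("data/bank+marketing.zip"), dest_dir=Path("data"))`, which leaves `data/bank.zip` ready for the tutorial. Matching on the member's base name rather than its full path is a defensive choice, since the commit does not say whether `bank.zip` sits at the top level of the outer archive.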