pan3793 commented on code in PR #608:
URL: https://github.com/apache/spark-website/pull/608#discussion_r2104990450
##########
js/downloads.js:
##########
@@ -14,11 +14,15 @@ function addRelease(version, releaseDate, packages, mirrored) {
var sources = {pretty: "Source Code", tag: "sources"};
var hadoopFree = {pretty: "Pre-built with user-provided Apache Hadoop", tag: "without-hadoop"};
var hadoop3p = {pretty: "Pre-built for Apache Hadoop 3.3 and later", tag: "hadoop3"};
+var hadoop3pSparkConnect = {pretty: "Pre-built for Apache Hadoop 3.3 and later with Spark Connect enabled", tag: "hadoop3-connect"};
Review Comment:
should this be "Apache Hadoop 3.4 and later"? I suppose Hadoop requires the client
version to be <= the server version
##########
releases/_posts/2025-05-23-spark-release-4-0-0.md:
##########
@@ -0,0 +1,694 @@
+---
+layout: post
+title: Spark Release 4.0.0
+categories: []
+tags: []
+status: publish
+type: post
+published: true
+meta:
+ _edit_last: '4'
+ _wpas_done_all: '1'
+---
+
+Apache Spark 4.0.0 marks a significant milestone as the inaugural release in the 4.x series, embodying the collective effort of the vibrant open-source community. This release is a testament to tremendous collaboration, resolving over 5100 tickets with contributions from more than 390 individuals.
+
+Spark Connect continues its rapid advancement, delivering substantial improvements:
+- A new lightweight Python client ([pyspark-client](https://pypi.org/project/pyspark-client)) at just 1.5 MB.
+- Full API compatibility for the Java client.
+- Greatly expanded API coverage.
+- ML on Spark Connect.
+- A new client implementation for [Swift](https://github.com/apache/spark-connect-swift).
+
+Spark SQL is significantly enriched with powerful new features designed to boost expressiveness and versatility for SQL workloads, such as VARIANT data type support, SQL user-defined functions, session variables, pipe syntax, and string collation.
+
+PySpark sees continuous dedication to both its functional breadth and the overall developer experience, bringing a native plotting API, a new Python Data Source API, support for Python UDTFs, and unified profiling for PySpark UDFs, alongside numerous other enhancements.
+
+Structured Streaming evolves with key additions that provide greater control and ease of debugging, notably the introduction of the Arbitrary State API v2 for more flexible state management and the State Data Source for easier debugging.
+
+To download Apache Spark 4.0.0, please visit the [downloads](https://spark.apache.org/downloads.html) page. For [detailed changes](https://issues.apache.org/jira/projects/SPARK/versions/12353359), you can consult JIRA. We have also curated a list of high-level changes here, grouped by major modules.
+
+
+* This will become a table of contents (this text will be scraped).
+{:toc}
+
+
+### Core and Spark SQL Highlights
+
+- [[SPARK-45314]](https://issues.apache.org/jira/browse/SPARK-45314) Drop Scala 2.12 and make Scala 2.13 the default
+- [[SPARK-45315]](https://issues.apache.org/jira/browse/SPARK-45315) Drop JDK 8/11 and make JDK 17 the default
Review Comment:
not sure if Java 21 support should be mentioned here as well
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]