andygrove opened a new pull request, #4093:
URL: https://github.com/apache/datafusion-comet/pull/4093

   ## Which issue does this PR close?
   
   <!-- no tracking issue yet -->
   
   ## Rationale for this change
   
   We currently run the Spark SQL test suites against 3.4.3, 3.5.8, and 4.0.1 
in CI. This PR is a first step toward also running them against Spark 4.1.1 so 
we can catch incompatibilities introduced by that release as early as possible.
   
   ## What changes are included in this PR?
   
   - Add a `spark-4.1` Maven profile to the root `pom.xml` and `spark/pom.xml`. 
The profile targets `spark.version=4.1.1`, Scala 2.13, JDK 17, and (for now) 
reuses the `spark-4.0` shim sources via `shims.majorVerSrc=spark-4.0`. Iceberg 
runtime and Jetty test deps mirror the `spark-4.0` profile.
   - Add `dev/diffs/4.1.1.diff`, generated by applying `dev/diffs/4.0.1.diff` 
to the `v4.1.1` Spark tag with `git apply --reject` and resolving rejects 
manually. Most rejects were import-only differences caused by 
surrounding-context changes in 4.1.1 (for example, `ShuffleExchangeExec` vs 
`ShuffleExchangeLike`, additional comet imports, and the `IgnoreComet` 
annotations). The `pom.xml` portion of the diff sets `spark.version.short=4.1` 
so the patched Spark build pulls the new `comet-spark-spark4.1_2.13` artifact.
   - Add a `{spark-short: '4.1', spark-full: '4.1.1', java: 17, scan-impl: 
'auto'}` entry to the `spark_sql_test.yml` matrix.
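
   For reference, a minimal sketch of what the new profile's key settings might look like. Only the values named in this PR (`spark.version=4.1.1`, `spark.version.short=4.1`, `shims.majorVerSrc=spark-4.0`, Scala 2.13, JDK 17) come from the change itself; the surrounding element names follow the pattern of a typical Maven profile and are illustrative, not verbatim from the repo:

   ```xml
   <!-- Illustrative sketch of the spark-4.1 profile; element names other
        than the values listed in the PR description are assumptions. -->
   <profile>
     <id>spark-4.1</id>
     <properties>
       <spark.version>4.1.1</spark.version>
       <spark.version.short>4.1</spark.version.short>
       <!-- Reuse the 4.0 shim sources until 4.1 needs its own tree -->
       <shims.majorVerSrc>spark-4.0</shims.majorVerSrc>
       <scala.binary.version>2.13</scala.binary.version>
       <java.version>17</java.version>
     </properties>
   </profile>
   ```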
   
   Known follow-ups (not in this PR):
   - Shim sources are reused from `spark-4.0`. If 4.1 has breaking changes in 
shimmed APIs, a separate `spark-4.1` shim tree will be needed.
   - The patched Spark tree has not yet been built or run end-to-end locally. CI will surface the first 
round of incompatibilities.
   - `iceberg-spark-runtime-4.1_2.13:1.10.0` mirrors the 4.0 dependency. If that 
artifact is not yet published, the dependency version will need to be updated.
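
   The diff-regeneration step described above can be sketched roughly as follows. The `/path/to/...` locations are placeholders; only `git apply --reject`, the `v4.1.1` tag, and the `dev/diffs/*.diff` paths come from this PR:

   ```shell
   # Sketch of how dev/diffs/4.1.1.diff was produced (illustrative commands,
   # not a script from the repo).
   cd /path/to/spark                  # a clone of apache/spark
   git checkout v4.1.1

   # Apply the existing 4.0.1 diff; hunks that no longer apply cleanly
   # against the 4.1.1 sources are written out as .rej files.
   git apply --reject /path/to/datafusion-comet/dev/diffs/4.0.1.diff

   # Resolve each .rej by hand (mostly import-only context changes),
   # then capture the combined result as the new 4.1.1 diff.
   git diff > /path/to/datafusion-comet/dev/diffs/4.1.1.diff
   ```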
   
   ## How are these changes tested?
   
   This PR enables the existing Spark SQL CI workflow against Spark 4.1.1, so 
the CI results on this PR serve as the test plan. The full SQL/Hive 
matrix runs as part of `Spark SQL Tests`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
