This is an automated email from the ASF dual-hosted git repository.
yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 44d2c86e71fc [SPARK-45593][BUILD] Building a runnable distribution
from master code running spark-sql raise error
44d2c86e71fc is described below
commit 44d2c86e71fca7044e6d5d9e9222eecff17c360c
Author: yikaifei <[email protected]>
AuthorDate: Thu Jan 18 11:32:01 2024 +0800
[SPARK-45593][BUILD] Building a runnable distribution from master code
running spark-sql raise error
### What changes were proposed in this pull request?
Fix a build issue: when building a runnable distribution from master code and running spark-sql, the following error is raised:
```
Caused by: java.lang.ClassNotFoundException: org.sparkproject.guava.util.concurrent.internal.InternalFutureFailureAccess
    at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
    at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
    at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
    ... 58 more
```
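For context, a minimal sketch of how the failure is presumably reproduced (illustrative only; it assumes a distribution has already been built into `dist/` by `make-distribution.sh`):
```
# Illustrative reproduction: start spark-sql from the freshly built distribution;
# the ClassNotFoundException above is raised once the shaded Guava cache is used.
cd dist
./bin/spark-sql -e 'SELECT 1'
```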
The problem is caused by a Guava dependency in the spark-connect-common POM that **conflicts** with the shade plugin configuration inherited from the parent POM:
- spark-connect-common bundles Guava at version `connect.guava.version`, but it is relocated to `${spark.shade.packageName}.guava`, not `${spark.shade.packageName}.connect.guava`;
- spark-network-common also bundles Guava classes and relocates them to the same `${spark.shade.packageName}.guava` package, but at version `${guava.version}`;
- As a result, two different Guava versions end up on the classpath under the same `org.sparkproject.guava.xx` package.
In addition, after investigation, it seems that the spark-connect-common module itself does not use Guava, so we can remove the Guava dependency from spark-connect-common.
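For reference, a couple of illustrative commands (not part of the patch) for checking where Guava enters the connect modules and which distribution jars end up shipping the relocated classes; the module path and class name follow the description above:
```
# Where does Guava come from in the connect-common module tree? (illustrative)
./build/mvn -pl connector/connect/common -am dependency:tree -Dincludes=com.google.guava

# Which jars in the built distribution ship the relocated failureaccess class
# that the shaded Guava cache needs at runtime?
cd dist/jars
for j in spark-network-common_*.jar spark-connect*.jar; do
  if unzip -l "$j" | grep -q 'org/sparkproject/guava/util/concurrent/internal/InternalFutureFailureAccess'; then
    echo "contains InternalFutureFailureAccess: $j"
  fi
done
```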
### Why are the changes needed?
Without this fix, a distribution built from master code is not actually runnable.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
I manually ran the build command below to produce a runnable distribution package and used it for the tests.
Build command:
```
./dev/make-distribution.sh --name ui --pip --tgz -Phive -Phive-thriftserver -Pyarn -Pconnect
```
Test result:
<img width="1276" alt="image"
src="https://github.com/apache/spark/assets/51110188/aefbc433-ea5c-4287-8ebd-367806043ac8">
I also checked which jars in the `jars` directory contain `org.sparkproject.guava.cache.LocalCache`;
Before:
```
➜ jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-connect_2.13-4.0.0-SNAPSHOT.jar
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
.//spark-connect-common_2.13-4.0.0-SNAPSHOT.jar
```
Now:
```
➜ jars grep -lr 'org.sparkproject.guava.cache.LocalCache' ./
.//spark-network-common_2.13-4.0.0-SNAPSHOT.jar
```
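As an additional check (illustrative, output not captured here), one could also grep for the new `.connect.guava` relocation to see where connect-common's shaded Guava copy lands:
```
grep -lr 'org.sparkproject.connect.guava.cache.LocalCache' ./
```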
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #43436 from Yikf/SPARK-45593.
Authored-by: yikaifei <[email protected]>
Signed-off-by: yangjie01 <[email protected]>
---
assembly/pom.xml | 6 ++++++
connector/connect/client/jvm/pom.xml | 8 +-------
connector/connect/common/pom.xml | 34 ++++++++++++++++++++++++++++++++++
connector/connect/server/pom.xml | 25 -------------------------
4 files changed, 41 insertions(+), 32 deletions(-)
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 77ff87c17f52..cd8c3fca9d23 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -149,6 +149,12 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect_${scala.binary.version}</artifactId>
<version>${project.version}</version>
+ <exclusions>
+ <exclusion>
+ <groupId>org.apache.spark</groupId>
+ <artifactId>spark-connect-common_${scala.binary.version}</artifactId>
+ </exclusion>
+ </exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
diff --git a/connector/connect/client/jvm/pom.xml
b/connector/connect/client/jvm/pom.xml
index 8057a33df178..9bedebf523a7 100644
--- a/connector/connect/client/jvm/pom.xml
+++ b/connector/connect/client/jvm/pom.xml
@@ -51,15 +51,9 @@
<version>${project.version}</version>
</dependency>
<!--
- We need to define guava and protobuf here because we need to change the scope of both from
+ We need to define protobuf here because we need to change the scope of both from
provided to compile. If we don't do this we can't shade these libraries.
-->
- <dependency>
- <groupId>com.google.guava</groupId>
- <artifactId>guava</artifactId>
- <version>${connect.guava.version}</version>
- <scope>compile</scope>
- </dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
diff --git a/connector/connect/common/pom.xml b/connector/connect/common/pom.xml
index a374646f8f29..336d83e04c15 100644
--- a/connector/connect/common/pom.xml
+++ b/connector/connect/common/pom.xml
@@ -47,6 +47,11 @@
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
</dependency>
+ <!--
+ SPARK-45593: spark connect relies on a specific version of Guava, We perform shading
+ of the Guava library within the connect-common module to ensure both connect-server and
+ connect-client modules maintain consistent and accurate Guava dependencies.
+ -->
<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
@@ -145,6 +150,35 @@
</execution>
</executions>
</plugin>
+ <plugin>
+ <groupId>org.apache.maven.plugins</groupId>
+ <artifactId>maven-shade-plugin</artifactId>
+ <configuration>
+ <shadedArtifactAttached>false</shadedArtifactAttached>
+ <artifactSet>
+ <includes>
+ <include>org.spark-project.spark:unused</include>
+ <include>com.google.guava:guava</include>
+ <include>com.google.guava:failureaccess</include>
+ <include>org.apache.tomcat:annotations-api</include>
+ </includes>
+ </artifactSet>
+ <relocations>
+ <relocation>
+ <pattern>com.google.common</pattern>
+ <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
+ </relocation>
+ </relocations>
+ </configuration>
+ <executions>
+ <execution>
+ <phase>package</phase>
+ <goals>
+ <goal>shade</goal>
+ </goals>
+ </execution>
+ </executions>
+ </plugin>
</plugins>
</build>
<profiles>
diff --git a/connector/connect/server/pom.xml b/connector/connect/server/pom.xml
index e9c7bd86e0f7..82127f736ccb 100644
--- a/connector/connect/server/pom.xml
+++ b/connector/connect/server/pom.xml
@@ -51,12 +51,6 @@
<groupId>org.apache.spark</groupId>
<artifactId>spark-connect-common_${scala.binary.version}</artifactId>
<version>${project.version}</version>
- <exclusions>
- <exclusion>
- <groupId>com.google.guava</groupId>
- <artifactId>guava</artifactId>
- </exclusion>
- </exclusions>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
@@ -156,17 +150,6 @@
<groupId>org.scala-lang.modules</groupId>
<artifactId>scala-parallel-collections_${scala.binary.version}</artifactId>
</dependency>
- <dependency>
- <groupId>com.google.guava</groupId>
- <artifactId>guava</artifactId>
- <version>${connect.guava.version}</version>
- <scope>compile</scope>
- </dependency>
- <dependency>
- <groupId>com.google.guava</groupId>
- <artifactId>failureaccess</artifactId>
- <version>${guava.failureaccess.version}</version>
- </dependency>
<dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
@@ -287,7 +270,6 @@
<shadedArtifactAttached>false</shadedArtifactAttached>
<artifactSet>
<includes>
- <include>com.google.guava:*</include>
<include>io.grpc:*:</include>
<include>com.google.protobuf:*</include>
@@ -307,13 +289,6 @@
</includes>
</artifactSet>
<relocations>
- <relocation>
- <pattern>com.google.common</pattern>
- <shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>
- <includes>
- <include>com.google.common.**</include>
- </includes>
- </relocation>
<relocation>
<pattern>com.google.thirdparty</pattern>
<shadedPattern>${spark.shade.packageName}.connect.guava</shadedPattern>