This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch branch-4.0
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.0 by this push:
     new 3c88f32583e9 [SPARK-52942][YARN][BUILD] YARN External Shuffle Service jar should include `scala-library`
3c88f32583e9 is described below

commit 3c88f32583e9d6da3db6e2f55a6236b4b9bf509b
Author: Cheng Pan <cheng...@apache.org>
AuthorDate: Wed Aug 6 12:47:09 2025 +0800

    [SPARK-52942][YARN][BUILD] YARN External Shuffle Service jar should include `scala-library`
    
    ### What changes were proposed in this pull request?
    
    Since SPARK-41400, the `common/network-yarn` module has had a hard dependency on Scala. This now causes the YARN NodeManager to fail to start because `scala-library` is missing from the shuffle service jar.
    
    ```
    2025-07-24 09:55:38,369 INFO util.ApplicationClassLoader: classpath: [file:/opt/spark/yarn/spark-4.1.0-SNAPSHOT-yarn-shuffle.jar]
    2025-07-24 09:55:38,369 INFO util.ApplicationClassLoader: system classes: [java., javax.accessibility., -javax.activation., javax.activity., javax.annotation., javax.annotation.processing., javax.crypto., javax.imageio., javax.jws., javax.lang.model., -javax.management.j2ee., javax.management., javax.naming., javax.net., javax.print., javax.rmi., javax.script., -javax.security.auth.message., javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, hdfs-default.xml, mapred-default.xml, yarn-default.xml]
    2025-07-24 09:55:38,538 INFO yarn.YarnShuffleService: Initializing YARN shuffle service for Spark
    2025-07-24 09:55:38,539 WARN containermanager.AuxServices: The Auxiliary Service named 'spark_shuffle' in the configuration is for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader which has a name of 'org.apache.spark.network.yarn.YarnShuffleService with custom class loader'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
    2025-07-24 09:55:38,808 ERROR nodemanager.NodeManager: Error starting NodeManager
    java.lang.NoClassDefFoundError: scala/Product
            at java.base/java.lang.ClassLoader.defineClass1(Native Method)
            at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
            at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
            at java.base/java.net.URLClassLoader.defineClass(URLClassLoader.java:524)
            at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:427)
            at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:421)
            at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
            at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:420)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:176)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
            at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:330)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:64)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.initAuxService(AuxServices.java:475)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:758)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:110)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:336)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:110)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:501)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:969)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1058)
    Caused by: java.lang.ClassNotFoundException: scala.Product
            at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
            at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
            at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
            ... 25 more
    2025-07-24 09:55:38,815 INFO nodemanager.NodeManager: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NodeManager at hadoop-worker1.orb.local/192.168.97.6
    ************************************************************/
    ```
    
    Note: `spark-<version>-yarn-shuffle.jar` is now ~100 MB, whereas previously it was ~10 MB. In `common/utils`, the Java and Scala code cross-reference each other, so we cannot simply split it into one Java utils module and one Scala utils module. It is therefore not easy to make `spark-<version>-yarn-shuffle.jar` Scala-free as before.
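
To see where the extra size comes from, the jar's entries can be grouped by top-level directory. The following is a minimal sketch, not part of this patch: the helper name `size_by_prefix` is hypothetical, and it assumes the jar is an ordinary zip archive readable with Python's standard `zipfile` module. Because the shaded jar may relocate some packages, the prefixes it reports are whatever paths actually appear in the archive.

```python
import zipfile
from collections import Counter


def size_by_prefix(jar_path, depth=1):
    """Sum compressed entry sizes per leading path prefix of a jar.

    Hypothetical helper for illustration; `depth` is the number of
    leading path components used as the grouping key.
    """
    totals = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue  # directory entries carry no payload
            parts = info.filename.split("/")
            # Files with too few path components are grouped under "(root)".
            prefix = "/".join(parts[:depth]) if len(parts) > depth else "(root)"
            totals[prefix] += info.compress_size
    return totals


if __name__ == "__main__":
    import sys

    # Usage: python jar_sizes.py /path/to/spark-<version>-yarn-shuffle.jar
    for prefix, size in size_by_prefix(sys.argv[1]).most_common(10):
        print(f"{size / 1e6:8.1f} MB  {prefix}")
```

Running it against jars built before and after this change should show which prefixes account for the difference.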
    
    ### Why are the changes needed?
    
    Bug fix: it restores a broken feature.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, it restores a broken feature.
    
    ### How was this patch tested?
    
    Tested on a YARN cluster; the NodeManager starts successfully after patching.
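
As a quick check before deploying, one can confirm that the built jar actually contains the class named in the `NoClassDefFoundError` above. This is a minimal sketch rather than the test procedure used for this patch: the helper name `jar_has_class` is hypothetical, and it assumes the class is bundled under its original path (`scala/Product.class`) rather than a relocated one.

```python
import zipfile


def jar_has_class(jar_path, class_name):
    """Return True if the jar has a .class entry for the given class.

    Hypothetical helper for illustration; `class_name` is a dotted
    name such as "scala.Product".
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    import sys

    # Usage: python check_jar.py /path/to/spark-<version>-yarn-shuffle.jar
    jar_path = sys.argv[1]
    found = jar_has_class(jar_path, "scala.Product")
    print(f"scala.Product present in {jar_path}: {found}")
    sys.exit(0 if found else 1)
```

A jar built with the `scala-library` exclusion still in place would be expected to fail this check, matching the `ClassNotFoundException: scala.Product` in the log.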
    
    Note: Spark 4 requires JDK 17 or later, but JDK 17 is not officially supported as of Hadoop 3.4.1. The Hadoop community has been actively working on JDK 17 support in recent months, and it mostly works in 3.4.2.
    
    Reviewers who want to verify this locally should consider using Hadoop 3.4.2 RC1 [1].
    
    [1] https://lists.apache.org/thread/f66vj3rj6cpk37gb1jfl2ombq3hltsml
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #51650 from pan3793/SPARK-52942.
    
    Authored-by: Cheng Pan <cheng...@apache.org>
    Signed-off-by: yangjie01 <yangji...@baidu.com>
    (cherry picked from commit e964086649f13f2e329948c629d97ba3923ed51b)
    Signed-off-by: yangjie01 <yangji...@baidu.com>
---
 common/network-yarn/pom.xml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index fd47138fd69a..4dce977c0c80 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -99,9 +99,6 @@
             <includes>
               <include>*:*</include>
             </includes>
-            <excludes>
-              <exclude>org.scala-lang:scala-library</exclude>
-            </excludes>
           </artifactSet>
           <filters>
             <filter>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org
