This is an automated email from the ASF dual-hosted git repository.

yangjie01 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new e964086649f1 [SPARK-52942][YARN][BUILD] YARN External Shuffle Service jar should include `scala-library`
e964086649f1 is described below

commit e964086649f13f2e329948c629d97ba3923ed51b
Author: Cheng Pan <cheng...@apache.org>
AuthorDate: Wed Aug 6 12:47:09 2025 +0800

    [SPARK-52942][YARN][BUILD] YARN External Shuffle Service jar should include `scala-library`
    
    ### What changes were proposed in this pull request?
    
    Since SPARK-41400, the `common/network-yarn` module has had a hard dependency on Scala. Because `scala-library` is missing from the shuffle service jar, the YARN NodeManager now fails to start:
    
    ```
    2025-07-24 09:55:38,369 INFO util.ApplicationClassLoader: classpath: [file:/opt/spark/yarn/spark-4.1.0-SNAPSHOT-yarn-shuffle.jar]
    2025-07-24 09:55:38,369 INFO util.ApplicationClassLoader: system classes: [java., javax.accessibility., -javax.activation., javax.activity., javax.annotation., javax.annotation.processing., javax.crypto., javax.imageio., javax.jws., javax.lang.model., -javax.management.j2ee., javax.management., javax.naming., javax.net., javax.print., javax.rmi., javax.script., -javax.security.auth.message., javax.security.auth., javax.security.cert., javax.security.sasl., javax.sound., javax.sql., javax.swing., javax.tools., javax.transaction., -javax.xml.registry., -javax.xml.rpc., javax.xml., org.w3c.dom., org.xml.sax., org.apache.commons.logging., org.apache.log4j., -org.apache.hadoop.hbase., org.apache.hadoop., core-default.xml, hdfs-default.xml, mapred-default.xml, yarn-default.xml]
    2025-07-24 09:55:38,538 INFO yarn.YarnShuffleService: Initializing YARN shuffle service for Spark
    2025-07-24 09:55:38,539 WARN containermanager.AuxServices: The Auxiliary Service named 'spark_shuffle' in the configuration is for class org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader which has a name of 'org.apache.spark.network.yarn.YarnShuffleService with custom class loader'. Because these are not the same tools trying to send ServiceData and read Service Meta Data may have issues unless the refer to the name in the config.
    2025-07-24 09:55:38,808 ERROR nodemanager.NodeManager: Error starting NodeManager
    java.lang.NoClassDefFoundError: scala/Product
            at java.base/java.lang.ClassLoader.defineClass1(Native Method)
            at java.base/java.lang.ClassLoader.defineClass(ClassLoader.java:1017)
            at java.base/java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150)
            at java.base/java.net.URLClassLoader.defineClass(URLClassLoader.java:524)
            at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:427)
            at java.base/java.net.URLClassLoader$1.run(URLClassLoader.java:421)
            at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
            at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:420)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:176)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
            at org.apache.spark.network.yarn.YarnShuffleService.serviceInit(YarnShuffleService.java:330)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxiliaryServiceWithCustomClassLoader.serviceInit(AuxiliaryServiceWithCustomClassLoader.java:64)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.initAuxService(AuxServices.java:475)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.AuxServices.serviceInit(AuxServices.java:758)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:110)
            at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl.serviceInit(ContainerManagerImpl.java:336)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:110)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.serviceInit(NodeManager.java:501)
            at org.apache.hadoop.service.AbstractService.init(AbstractService.java:165)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.initAndStartNodeManager(NodeManager.java:969)
            at org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:1058)
    Caused by: java.lang.ClassNotFoundException: scala.Product
            at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
            at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
            at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:525)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:189)
            at org.apache.hadoop.util.ApplicationClassLoader.loadClass(ApplicationClassLoader.java:157)
            ... 25 more
    2025-07-24 09:55:38,815 INFO nodemanager.NodeManager: SHUTDOWN_MSG:
    /************************************************************
    SHUTDOWN_MSG: Shutting down NodeManager at hadoop-worker1.orb.local/192.168.97.6
    ************************************************************/
    ```
    
    Note: `spark-<version>-yarn-shuffle.jar` is now ~100 MB, up from ~10 MB previously. In `common/utils`, the Java and Scala code cross-reference each other, so we cannot simply split it into one Java utils module and one Scala utils module. It is therefore not easy to make `spark-<version>-yarn-shuffle.jar` Scala-free as before.
    
    ### Why are the changes needed?
    
    Bug fix: this restores a broken feature.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes, it restores a broken feature.
    
    ### How was this patch tested?
    
    Tested on a YARN cluster; the NodeManager starts successfully after patching.
    
    Note: Spark 4 requires JDK 17 or later, but JDK 17 is not officially supported as of Hadoop 3.4.1. The Hadoop community has been actively working on JDK 17 support in recent months, and it mostly works in 3.4.2.
    
    For reviewers who want to verify this locally, consider using 3.4.2 RC1 [1].
    
    [1] https://lists.apache.org/thread/f66vj3rj6cpk37gb1jfl2ombq3hltsml
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No.
    
    Closes #51650 from pan3793/SPARK-52942.
    
    Authored-by: Cheng Pan <cheng...@apache.org>
    Signed-off-by: yangjie01 <yangji...@baidu.com>
---
 common/network-yarn/pom.xml | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/common/network-yarn/pom.xml b/common/network-yarn/pom.xml
index 78289684960e..93998198c9ce 100644
--- a/common/network-yarn/pom.xml
+++ b/common/network-yarn/pom.xml
@@ -99,9 +99,6 @@
             <includes>
               <include>*:*</include>
             </includes>
-            <excludes>
-              <exclude>org.scala-lang:scala-library</exclude>
-            </excludes>
           </artifactSet>
           <filters>
             <filter>
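
The effect of removing the `<excludes>` block above can be checked by looking inside the shaded jar for the class that the NodeManager failed to load (`scala/Product`). The following is a minimal sketch of such a check, not part of the commit: the helper names `jar_contains` and `make_demo_jar` are hypothetical, and the demo builds a tiny in-memory jar rather than reading a real `spark-<version>-yarn-shuffle.jar`.

```python
import io
import zipfile

# Class file corresponding to the NoClassDefFoundError in the log above.
SCALA_MARKER = "scala/Product.class"


def jar_contains(jar, entry=SCALA_MARKER):
    """Return True if the jar (a path or a file-like object) has the entry."""
    with zipfile.ZipFile(jar) as zf:
        return entry in zf.namelist()


def make_demo_jar(entries):
    """Build an in-memory jar of empty files (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in entries:
            zf.writestr(name, b"")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # Stand-in for a shaded jar that bundles scala-library.
    shaded = make_demo_jar([
        "org/apache/spark/network/yarn/YarnShuffleService.class",
        SCALA_MARKER,
    ])
    print(jar_contains(shaded))
```

To check a real build, pass the path of the shaded jar to `jar_contains` in place of the in-memory demo jar.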

