This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 1d575f2ae9ba [SPARK-54067][CORE] Improve `SparkSubmit` to invoke `exitFn` with the root cause instead of `SparkUserAppException`
1d575f2ae9ba is described below

commit 1d575f2ae9baa33a571c74afd9abc19670007f03
Author: Sandy Ryza <[email protected]>
AuthorDate: Thu Oct 30 10:45:28 2025 -0700

    [SPARK-54067][CORE] Improve `SparkSubmit` to invoke `exitFn` with the root cause instead of `SparkUserAppException`
    
    ### What changes were proposed in this pull request?
    
    Hides the `SparkUserAppException` and stack trace when a pipeline run fails.
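    Illustrative sketch (not part of the change): a plain `RuntimeException` stands in for `SparkUserAppException`, which is assumed here to be thrown without an attached cause, as in the `PythonRunner` path shown below. It shows why passing `Option(e.getCause)` instead of `Some(e)` to `exitFn` suppresses the wrapper's stack trace.
    
    ```scala
    // Hypothetical stand-in for SparkUserAppException(1); assumed to carry no cause.
    val wrapper: Throwable = new RuntimeException("User application exited with 1")
    
    // Before: exitFn received the wrapper itself, so its stack trace was printed.
    val before: Option[Throwable] = Some(wrapper)
    
    // After: only the root cause is forwarded; Option(null) collapses to None,
    // so nothing extra is printed when there is no underlying cause.
    val after: Option[Throwable] = Option(wrapper.getCause)
    
    println(before) // Some(java.lang.RuntimeException: User application exited with 1)
    println(after)  // None
    ```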
    
    ### Why are the changes needed?
    
    I hit this when I ran a pipeline that had no flows:
    ```
    org.apache.spark.SparkUserAppException: User application exited with 1
    at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:127)
    at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:569)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1028)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:226)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:95)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1166)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1175)
    at org.apache.spark.deploy.SparkPipelines$.main(SparkPipelines.scala:42)
    at org.apache.spark.deploy.SparkPipelines.main(SparkPipelines.scala)
    ```
    
    This is not information that's relevant to the user.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Not for anything that's been released.
    
    ### How was this patch tested?
    
    Ran the CLI and observed that this error was gone and the rest of the output remained the same:
    
    ```
    > spark-pipelines run --conf spark.sql.catalogImplementation=hive
    WARNING: Using incubator modules: jdk.incubator.vector
    2025-10-28 13:22:49: Loading pipeline spec from /Users/sandy.ryza/sdp-test/demo2/pipeline.yml...
    2025-10-28 13:22:49: Creating Spark session...
    WARNING: Using incubator modules: jdk.incubator.vector
    Using Spark's default log4j profile: org/apache/spark/log4j2-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    25/10/28 13:22:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    /Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/conf.py:64: UserWarning: Failed to set spark.sql.catalogImplementation to Some(hive) due to [CANNOT_MODIFY_STATIC_CONFIG] Cannot modify the value of the static Spark config: "spark.sql.catalogImplementation". SQLSTATE: 46110
    2025-10-28 13:22:53: Creating dataflow graph...
    2025-10-28 13:22:53: Registering graph elements...
    2025-10-28 13:22:53: Loading definitions. Root directory: '/Users/sandy.ryza/sdp-test/demo2'.
    2025-10-28 13:22:53: Found 2 files matching glob 'transformations/**/*'
    2025-10-28 13:22:53: Importing /Users/sandy.ryza/sdp-test/demo2/transformations/example_python_materialized_view.py...
    2025-10-28 13:22:53: Registering SQL file /Users/sandy.ryza/sdp-test/demo2/transformations/example_sql_materialized_view.sql...
    2025-10-28 13:22:53: Starting run...
    25/10/28 13:22:55 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
    25/10/28 13:22:55 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore sandy.ryza10.15.139.54
    Traceback (most recent call last):
      File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 413, 
in <module>
        run(
      File "/Users/sandy.ryza/oss/python/pyspark/pipelines/cli.py", line 340, 
in run
        handle_pipeline_events(result_iter)
      File 
"/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/pipelines/spark_connect_pipeline.py",
 line 53, in handle_pipeline_events
        for result in iter:
      File 
"/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py",
 line 1186, in execute_command_as_iterator
      File 
"/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py",
 line 1619, in _execute_and_fetch_as_iterator
      File 
"/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py",
 line 1893, in _handle_error
      File 
"/Users/sandy.ryza/oss/python/lib/pyspark.zip/pyspark/sql/connect/client/core.py",
 line 1966, in _handle_rpc_error
    pyspark.errors.exceptions.connect.AnalysisException: 
[PIPELINE_DATASET_WITHOUT_FLOW] Pipeline dataset 
`spark_catalog`.`default`.`abc` does not have any defined flows. Please attach 
a query with the dataset's definition, or explicitly define at least one flow 
that writes to the dataset. SQLSTATE: 0A000
    25/10/28 13:22:57 INFO ShutdownHookManager: Shutdown hook called
    25/10/28 13:22:57 INFO ShutdownHookManager: Deleting directory /private/var/folders/1v/dqhbgmt10vl6v3tdlwvvx90r0000gp/T/spark-1214d042-270d-407f-8324-0dfcdf72c38c
    ```
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Closes #52770 from sryza/user-app-exited-error.
    
    Authored-by: Sandy Ryza <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
index 07d764966f9e..b5d026e39a90 100644
--- a/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala
@@ -1166,7 +1166,7 @@ object SparkSubmit extends CommandLineUtils with Logging {
           super.doSubmit(args)
         } catch {
           case e: SparkUserAppException =>
-            exitFn(e.exitCode, Some(e))
+            exitFn(e.exitCode, Option(e.getCause))
         }
       }
 


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
