puchengy commented on issue #12792: URL: https://github.com/apache/iceberg/issues/12792#issuecomment-2847360867
Thanks for filing the issue @lkindere. @singhpk234 thank you for your proposed fix; I have encountered a similar issue and believe it is relevant. I am using Spark 3.2 + Iceberg 1.3.0:

```
[2025-03-27 20:46:15,164+00:00] [163] {log_processors.py:183} INFO - diagnostics: User class threw exception: org.apache.spark.SparkException: Writing job aborted
    at org.apache.spark.sql.errors.QueryExecutionErrors$.writingJobAbortedError(QueryExecutionErrors.scala:613)
    at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:386)
    at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2$(WriteToDataSourceV2Exec.scala:330)
    at org.apache.spark.sql.execution.datasources.v2.ReplaceDataExec.writeWithV2(ReplaceDataExec.scala:29)
    at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run(WriteToDataSourceV2Exec.scala:309)
    at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run$(WriteToDataSourceV2Exec.scala:308)
    at org.apache.spark.sql.execution.datasources.v2.ReplaceDataExec.run(ReplaceDataExec.scala:29)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:139)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:111)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:172)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:97)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:776)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:70)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:139)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:135)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:457)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:135)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:122)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:120)
    at org.apache.spark.sql.Dataset.<init>(Dataset.scala:219)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:776)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:619)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:776)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:614)
    at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:68)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:389)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:517)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:511)
    at scala.collection.Iterator.foreach(Iterator.scala:941)
    at scala.collection.Iterator.foreach$(Iterator.scala:941)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1429)
    at scala.collection.IterableLike.foreach(IterableLike.scala:74)
    at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
    at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:511)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:336)
    at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:474)
    at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:490)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:213)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:737)
Caused by: org.apache.iceberg.exceptions.RESTException: Error occurred while processing POST request
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:304)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:219)
    at org.apache.iceberg.rest.HTTPClient.post(HTTPClient.java:330)
    at org.apache.iceberg.rest.RESTClient.post(RESTClient.java:112)
    at org.apache.iceberg.rest.RESTTableOperations.commit(RESTTableOperations.java:144)
    at org.apache.iceberg.SnapshotProducer.lambda$commit$2(SnapshotProducer.java:394)
    at org.apache.iceberg.util.Tasks$Builder.runTaskWithRetry(Tasks.java:413)
    at org.apache.iceberg.util.Tasks$Builder.runSingleThreaded(Tasks.java:219)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:203)
    at org.apache.iceberg.util.Tasks$Builder.run(Tasks.java:196)
    at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:368)
    at org.apache.iceberg.BaseOverwriteFiles.commit(BaseOverwriteFiles.java:31)
    at org.apache.iceberg.spark.source.SparkWrite.commitOperation(SparkWrite.java:210)
    at org.apache.iceberg.spark.source.SparkWrite.access$1300(SparkWrite.java:83)
    at org.apache.iceberg.spark.source.SparkWrite$CopyOnWriteOperation.commitWithSerializableIsolation(SparkWrite.java:421)
    at org.apache.iceberg.spark.source.SparkWrite$CopyOnWriteOperation.commit(SparkWrite.java:397)
    at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:369)
    ... 57 more
Caused by: org.apache.hc.core5.http.NoHttpResponseException: 127.0.0.1:19193 failed to respond
    at org.apache.hc.core5.http.impl.io.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:301)
    at org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:175)
    at org.apache.hc.core5.http.impl.io.HttpRequestExecutor.execute(HttpRequestExecutor.java:218)
    at org.apache.hc.client5.http.impl.io.PoolingHttpClientConnectionManager$InternalConnectionEndpoint.execute(PoolingHttpClientConnectionManager.java:712)
    at org.apache.hc.client5.http.impl.classic.InternalExecRuntime.execute(InternalExecRuntime.java:216)
    at org.apache.hc.client5.http.impl.classic.MainClientExec.execute(MainClientExec.java:116)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.ConnectExec.execute(ConnectExec.java:188)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.ProtocolExec.execute(ProtocolExec.java:192)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.HttpRequestRetryExec.execute(HttpRequestRetryExec.java:96)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.ContentCompressionExec.execute(ContentCompressionExec.java:152)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.RedirectExec.execute(RedirectExec.java:115)
    at org.apache.hc.client5.http.impl.classic.ExecChainElement.execute(ExecChainElement.java:51)
    at org.apache.hc.client5.http.impl.classic.InternalHttpClient.doExecute(InternalHttpClient.java:170)
    at org.apache.hc.client5.http.impl.classic.CloseableHttpClient.execute(CloseableHttpClient.java:123)
    at org.apache.iceberg.rest.HTTPClient.execute(HTTPClient.java:267)
    ... 73 more
```
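For context on what the trace shows: the `ReplaceDataExec`, `SparkWrite$CopyOnWriteOperation`, and `BaseOverwriteFiles` frames indicate the data files had already been written and the job died in the final metadata commit, where `RESTTableOperations.commit` POSTs the table update to the REST catalog and the server fails to respond. A minimal sketch of the class of statement that exercises this path (the catalog, table, and column names below are hypothetical, not from my job):

```sql
-- Hypothetical names: any copy-on-write DELETE/UPDATE/MERGE against a
-- REST-catalog Iceberg table goes through SparkWrite$CopyOnWriteOperation
-- and finishes with an OverwriteFiles commit, i.e. an HTTP POST to the
-- catalog — the step that raised NoHttpResponseException above.
DELETE FROM rest_catalog.db.events WHERE event_date = '2025-03-27';
```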