vladislav-sidorovich opened a new issue, #13359: URL: https://github.com/apache/iceberg/issues/13359
### Apache Iceberg version

1.9.0

### Query engine

Spark

### Please describe the bug 🐞

Hello, I have the following (simplified) Spark code that reads a Delta Lake table and writes it as an Apache Iceberg table:

```
Dataset<Row> df = spark.read()
    .format("delta")
    .load(deltaSource);

df.write()
    .format("iceberg")
    .mode(SaveMode.Overwrite)
    .saveAsTable(icebergTableName);
```

It throws an exception: `Cannot work with a non-pinned table snapshot of the TahoeFileIndex`

The SQL equivalent throws the same exception:

```
CREATE TABLE iceberg_cat.analytics.sales_data_iceberg
USING iceberg
AS SELECT * FROM delta_cat.staging.sales_data;
```

Note: this may be a Delta Lake issue, but other formats work. For example, saving `df` to `csv` or `parquet` succeeds:

```
df.write()
    .format("csv") // parquet works as well
    .mode(SaveMode.Overwrite)
    .save(icebergTableName);
```

Stacktrace:

```
[11:47:17.880] ERROR Utils - Aborting task
java.lang.IllegalArgumentException: requirement failed: Cannot work with a non-pinned table snapshot of the TahoeFileIndex
    at scala.Predef$.require(Predef.scala:337) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.delta.ScanWithDeletionVectors$.dvEnabledScanFor(PreprocessTableWithDVs.scala:79) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.ScanWithDeletionVectors$.unapply(PreprocessTableWithDVs.scala:66) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVs$$anonfun$preprocessTablesWithDVs$1.applyOrElse(PreprocessTableWithDVs.scala:56) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVs$$anonfun$preprocessTablesWithDVs$1.applyOrElse(PreprocessTableWithDVs.scala:55) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) ~[spark-sql-api_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:405) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.delta.SubqueryTransformerHelper.transformWithSubqueries(SubqueryTransformerHelper.scala:43) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.SubqueryTransformerHelper.transformWithSubqueries$(SubqueryTransformerHelper.scala:40) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVsStrategy.transformWithSubqueries(PreprocessTableWithDVsStrategy.scala:33) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVs.preprocessTablesWithDVs(PreprocessTableWithDVs.scala:55) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVs.preprocessTablesWithDVs$(PreprocessTableWithDVs.scala:54) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVsStrategy.preprocessTablesWithDVs(PreprocessTableWithDVsStrategy.scala:33) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.delta.PreprocessTableWithDVsStrategy.apply(PreprocessTableWithDVsStrategy.scala:39) ~[delta-spark_2.13-3.2.0.jar:3.2.0]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$1(QueryPlanner.scala:63) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:594) ~[scala-library-2.13.15.jar:?]
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:608) ~[scala-library-2.13.15.jar:?]
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:601) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:70) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:727) ~[scala-library-2.13.15.jar:?]
    at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:721) ~[scala-library-2.13.15.jar:?]
    at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1303) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:594) ~[scala-library-2.13.15.jar:?]
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:608) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:70) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$3(QueryPlanner.scala:78) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at scala.collection.IterableOnceOps.foldLeft(IterableOnce.scala:727) ~[scala-library-2.13.15.jar:?]
    at scala.collection.IterableOnceOps.foldLeft$(IterableOnce.scala:721) ~[scala-library-2.13.15.jar:?]
    at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1303) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.$anonfun$plan$2(QueryPlanner.scala:75) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at scala.collection.Iterator$$anon$10.nextCur(Iterator.scala:594) ~[scala-library-2.13.15.jar:?]
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:608) ~[scala-library-2.13.15.jar:?]
    at org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SparkStrategies.plan(SparkStrategies.scala:70) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$.createSparkPlan(QueryExecution.scala:496) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$sparkPlan$1(QueryExecution.scala:171) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:219) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:546) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:219) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:218) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:171) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:164) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executedPlan$1(QueryExecution.scala:186) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:138) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:219) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:546) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:219) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:218) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:186) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:179) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:120) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) ~[spark-sql-api_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.$anonfun$writeToTable$1(WriteToDataSourceV2Exec.scala:577) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1397) ~[spark-core_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable(WriteToDataSourceV2Exec.scala:573) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.V2CreateTableAsSelectBaseExec.writeToTable$(WriteToDataSourceV2Exec.scala:567) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.writeToTable(WriteToDataSourceV2Exec.scala:183) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.AtomicReplaceTableAsSelectExec.run(WriteToDataSourceV2Exec.scala:216) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49) ~[spark-sql_2.13-3.5.1.jar:?]
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98) ~[spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461) ~[spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76) [spark-sql-api_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437) [spark-catalyst_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:142) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:859) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:634) [spark-sql_2.13-3.5.1.jar:3.5.1]
    at org.apache.spark.sql.DataFrameWriter.saveAsTable(DataFrameWriter.scala:564) [spark-sql_2.13-3.5.1.jar:3.5.1]
```

### Willingness to contribute

- [ ] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
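### Possible workaround (untested)

The failing `require` is in Delta's deletion-vector preprocessing (`ScanWithDeletionVectors$.dvEnabledScanFor`), which insists on a pinned table snapshot. One sketch of a workaround, which I have not verified, would be to pin the source snapshot explicitly using Delta's `versionAsOf` time-travel read option; the version number below is a placeholder that would need to be looked up first (e.g. via `DESCRIBE HISTORY`):

```java
// Sketch only: pin the Delta read to a concrete snapshot before the Iceberg
// write, so the planner never sees an unpinned TahoeFileIndex. The version
// number 12 is a placeholder, not from my actual table.
Dataset<Row> df = spark.read()
    .format("delta")
    .option("versionAsOf", 12)  // hypothetical version; pins the snapshot
    .load(deltaSource);

df.write()
    .format("iceberg")
    .mode(SaveMode.Overwrite)
    .saveAsTable(icebergTableName);
```

If pinning the snapshot does avoid the error, that would also suggest the bug is in how the CTAS plan interacts with Delta's unpinned `TahoeFileIndex` rather than in the Iceberg writer itself.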