rjayapalan opened a new issue, #9689: URL: https://github.com/apache/iceberg/issues/9689
### Apache Iceberg version

1.4.2

### Query engine

Spark

### Please describe the bug 🐞

I am aware of a similar issue that was addressed as part of the Iceberg 1.4.1 release (https://github.com/apache/iceberg/pull/8834), but this one seems different to me. The error comes up after performing a DDL change on an existing Iceberg table (adding a new column) and then running the `rewrite_manifests` maintenance operation using the Spark stored procedure. I was able to reproduce the issue following this pattern (`ALTER TABLE ...` -> `rewrite_manifests`). I am not sure what is causing this, or whether it is a bug in the first place.

Error stack trace:

```
An error was encountered: An error occurred while calling o431.parquet.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 432.0 failed 4 times, most recent failure: Lost task 3.3 in stage 432.0 (TID 23496) ([2600:1f18:610f:a400:cea3:e3ca:45b8:9398] executor 409): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to s3://cs-dataeng-staging/rjayapalan/tmp/crm_dev_unload.
	at org.apache.spark.sql.errors.QueryExecutionErrors$.taskFailedWhileWritingRowsError(QueryExecutionErrors.scala:789)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:421)
	at org.apache.spark.sql.execution.datasources.WriteFilesExec.$anonfun$doExecuteWrite$1(WriteFiles.scala:100)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:888)
	at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:888)
	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:364)
	at org.apache.spark.rdd.RDD.iterator(RDD.scala:328)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
	at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
	at org.apache.spark.scheduler.Task.run(Task.scala:141)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:563)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1541)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:566)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.IllegalArgumentException: requirement failed: length (-6235972) cannot be smaller than -1
	at scala.Predef$.require(Predef.scala:281)
	at org.apache.spark.rdd.InputFileBlockHolder$.set(InputFileBlockHolder.scala:79)
	at org.apache.spark.rdd.InputFileBlockHolder.set(InputFileBlockHolder.scala)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:93)
	at org.apache.iceberg.spark.source.RowDataReader.open(RowDataReader.java:43)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:141)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.execution.datasources.FileFormatDataWriter.writeWithIterator(FileFormatDataWriter.scala:91)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:404)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1575)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:411)
	... 15 more
```

Environment: EMR 6.15 | Spark 3.4 | Iceberg 1.4.2

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
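[Editor's note] The `ALTER TABLE ... -> rewrite_manifests` pattern described in the report can be sketched roughly as follows. The catalog, schema, table, and column names here are hypothetical placeholders, not taken from the original report:

```sql
-- DDL change on an existing Iceberg table: add a new column
ALTER TABLE my_catalog.db.events ADD COLUMN new_col STRING;

-- Manifest maintenance via the Spark stored procedure
CALL my_catalog.system.rewrite_manifests(table => 'db.events');

-- A subsequent read of the table (e.g. unloading it as Parquet)
-- then fails with the TASK_WRITE_FAILED stack trace shown in the report.
SELECT * FROM my_catalog.db.events;
```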
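[Editor's note] The `Caused by` frame points at a precondition in Spark's `InputFileBlockHolder.set`: a block length of `-1` is the sentinel for "unknown", and anything smaller is rejected, which is exactly what happens here with `length = -6235972`. A minimal Python sketch of that invariant, for illustration only (this is not Spark's actual code):

```python
def set_input_file_block(file_path: str, start: int, length: int) -> dict:
    """Illustrative stand-in for the length precondition that fails in
    Spark's InputFileBlockHolder.set.

    length == -1 means "unknown"; any smaller value (such as the
    -6235972 in the stack trace, suggesting a corrupted or stale split
    length in the scan task) fails the requirement check.
    """
    if length < -1:
        raise ValueError(
            f"requirement failed: length ({length}) cannot be smaller than -1"
        )
    return {"path": file_path, "start": start, "length": length}


# A valid block: length is known and non-negative.
block = set_input_file_block("s3://bucket/file.parquet", 0, 1024)

# The value from the stack trace trips the same check.
try:
    set_input_file_block("s3://bucket/file.parquet", 0, -6235972)
except ValueError as e:
    print(e)  # requirement failed: length (-6235972) cannot be smaller than -1
```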