amogh-jahagirdar opened a new pull request, #6480: URL: https://github.com/apache/iceberg/pull/6480
Fixing error handling for https://github.com/apache/iceberg/issues/6388. Based on the stack trace, the following sequence of events seems plausible:

1.) [The snapshot ID for the current offset no longer exists (my hunch is due to expiration)](https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java#L210), so table.snapshot(currentOffset.snapshotId()) returns null.
2.) Planning then throws an unclear [NPE](https://github.com/apache/iceberg/blob/master/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/source/SparkMicroBatchStream.java#L229) when trying to get the operation associated with the snapshot.

In this approach, planning fails altogether, since streaming requires a known chain of snapshots. That said, I would appreciate feedback from folks more familiar with Spark: @RussellSpitzer @aokolnychyi @singhpk234 @rajarshisarkar. I'm not sure we can safely just skip the snapshot, since a consumer of the stream would technically no longer be consuming the original state of the table.

-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
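The failure sequence described in the pull request text above can be illustrated with a small, self-contained sketch. This is not the code from the PR: `Table`, `Snapshot`, `operationUnchecked`, `operationChecked`, and the error message are hypothetical stand-ins written for this example, and the only behavior taken from the description is that looking up a snapshot ID that no longer exists returns null. The sketch contrasts dereferencing that null directly (an unclear NPE) with checking for it and failing with an explicit message.

```java
import java.util.HashMap;
import java.util.Map;

public class MissingSnapshotSketch {

    // Hypothetical stand-in for a table snapshot; only what the example needs.
    static final class Snapshot {
        final long id;
        final String operation;

        Snapshot(long id, String operation) {
            this.id = id;
            this.operation = operation;
        }
    }

    // Hypothetical stand-in for a table. As in the description, looking up
    // a snapshot ID that no longer exists returns null.
    static final class Table {
        private final Map<Long, Snapshot> snapshots = new HashMap<>();

        void add(Snapshot snapshot) {
            snapshots.put(snapshot.id, snapshot);
        }

        void expire(long snapshotId) {
            snapshots.remove(snapshotId);
        }

        Snapshot snapshot(long snapshotId) {
            return snapshots.get(snapshotId);
        }
    }

    // Mirrors the failing pattern: dereference the lookup result directly.
    static String operationUnchecked(Table table, long snapshotId) {
        return table.snapshot(snapshotId).operation; // NPE if the snapshot is gone
    }

    // Assumed shape of a fix: fail planning with an explicit message instead.
    static String operationChecked(Table table, long snapshotId) {
        Snapshot snapshot = table.snapshot(snapshotId);
        if (snapshot == null) {
            throw new IllegalStateException(
                "Cannot plan micro-batch: snapshot " + snapshotId
                    + " no longer exists (possibly expired)");
        }
        return snapshot.operation;
    }

    public static void main(String[] args) {
        Table table = new Table();
        table.add(new Snapshot(1L, "append"));
        table.expire(1L); // simulate snapshot expiration between micro-batches

        try {
            operationUnchecked(table, 1L);
        } catch (NullPointerException e) {
            System.out.println("Unchecked lookup failed with a bare NPE");
        }

        try {
            operationChecked(table, 1L);
        } catch (IllegalStateException e) {
            System.out.println("Checked lookup failed: " + e.getMessage());
        }
    }
}
```

Failing fast here matches the reasoning in the description: skipping the missing snapshot would hide the fact that the consumer is no longer reading an unbroken chain of snapshots.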