zhongyujiang commented on PR #10567: URL: https://github.com/apache/iceberg/pull/10567#issuecomment-2243074141
> Edit: Would it be possible to create an e2e like unit test to simulate the issue? It might be easier to understand the issue, or debug.

Unfortunately, I am not sure how to create an e2e test, because I do not know how to control the timing of the checkpoint. We hit this issue in a very specific scenario: the job started, ran for a while, and then restarted (for unrelated reasons). After the restart, the job could not recover correctly from the checkpoint because of this issue.

If you want to debug it, could you review the unit test I have provided? Although it is not an e2e test, it reproduces the situation I described. Please set a breakpoint at the `updateCurrentIterator` call inside the `seek` method. You will see that when `updateCurrentIterator` returns, `currentIterator` points to the second FileScanTask in the unit test, so `fileOffset` should be 1. Without this PR, however, `fileOffset` is assigned 0.
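For readers without a debugger at hand, the mismatch described above can be sketched with a small stand-alone model. This is not Iceberg's `DataIterator`; `MultiFileIterator`, `overwriteOffsetAfterSeek`, and `fileOffsetAfterSeek` are hypothetical names, and the sketch assumes the iterator lands on the second task because the first file yields no records.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

/** Simplified, hypothetical model of a multi-file iterator; NOT Iceberg's actual DataIterator. */
public class SeekOffsetSketch {

  static class MultiFileIterator {
    private final Iterator<List<String>> tasks;
    // Hypothetical switch: true models "seek overwrites fileOffset at the end".
    private final boolean overwriteOffsetAfterSeek;
    private Iterator<String> currentIterator = Collections.emptyIterator();
    private int fileOffset = -1;
    private long recordOffset = 0L;

    MultiFileIterator(List<List<String>> files, boolean overwriteOffsetAfterSeek) {
      this.tasks = files.iterator();
      this.overwriteOffsetAfterSeek = overwriteOffsetAfterSeek;
    }

    void seek(int startingFileOffset, long startingRecordOffset) {
      // Skip whole files that precede the starting file.
      for (int i = 0; i < startingFileOffset; i++) {
        tasks.next();
      }
      fileOffset = startingFileOffset - 1;

      // May advance past files that yield no records, bumping fileOffset each time.
      updateCurrentIterator();

      // Skip records inside the file that is now open.
      for (long i = 0; i < startingRecordOffset; i++) {
        if (!currentIterator.hasNext()) {
          throw new IllegalStateException("Not enough records to skip");
        }
        currentIterator.next();
      }
      recordOffset = startingRecordOffset;

      if (overwriteOffsetAfterSeek) {
        // Discards any advance made by updateCurrentIterator().
        fileOffset = startingFileOffset;
      }
    }

    private void updateCurrentIterator() {
      while (!currentIterator.hasNext() && tasks.hasNext()) {
        currentIterator = tasks.next().iterator();
        fileOffset += 1;
        recordOffset = 0L;
      }
    }

    int fileOffset() {
      return fileOffset;
    }
  }

  /** Seeks to (0, 0) over [empty file, file with two records] and reports fileOffset. */
  static int fileOffsetAfterSeek(boolean overwriteOffsetAfterSeek) {
    List<List<String>> files =
        Arrays.asList(Collections.<String>emptyList(), Arrays.asList("a", "b"));
    MultiFileIterator iterator = new MultiFileIterator(files, overwriteOffsetAfterSeek);
    iterator.seek(0, 0L);
    return iterator.fileOffset();
  }

  public static void main(String[] args) {
    System.out.println("overwrite after seek: fileOffset = " + fileOffsetAfterSeek(true));
    System.out.println("keep tracked offset:  fileOffset = " + fileOffsetAfterSeek(false));
  }
}
```

In this model the open iterator belongs to the file at index 1 in both runs, but only the run that keeps the offset tracked by `updateCurrentIterator` reports 1. The run that overwrites it reports 0, so any position later written to a checkpoint would name the wrong file.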