namrathamyske opened a new issue, #7885: URL: https://github.com/apache/iceberg/issues/7885
### Apache Iceberg version

main (development)

### Query engine

Spark

### Please describe the bug 🐞

This concerns the rate-limiting PR for structured streaming: https://github.com/apache/iceberg/pull/4479.

According to the unit test at https://github.com/apache/iceberg/pull/4479/files#diff-26782bf5c27f69e5cc9cd4a9363f601a97d1c9f97fe0c1a7fb927da7c60c014fR169, the stream gets stuck if `SparkReadOptions.STREAMING_MAX_ROWS_PER_MICRO_BATCH` is not respected. This is a major blocker to adopting the feature. Once the stream is stuck, it never advances, even when new snapshots come in.

E.g., with STREAMING_MAX_ROWS_PER_MICRO_BATCH = 2:

- Snapshot1 (2 records, 1 file): read fully in Microbatch-1
- Snapshot2 (3 records, 1 file): can never be read, because 3 records > STREAMING_MAX_ROWS_PER_MICRO_BATCH (stuck forever)
- Snapshot3 (3 records)

Please let me know whether this is the intended behavior or whether it is expected to change.

@singhpk234 @jackye1995 @RussellSpitzer @rdblue

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
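
The scenario described in the report can be reproduced with a small toy model. This is only a sketch and not Iceberg's actual planner: `DataFile`, `plan_micro_batch`, and `run_stream` are hypothetical names, and the admission rule (a whole file is added only while the running row count stays within the limit) is an assumption inferred from the example in the report.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file added by a snapshot."""
    snapshot: str
    rows: int


def plan_micro_batch(
    files: List[DataFile], start: int, max_rows: int
) -> Tuple[int, List[DataFile]]:
    """Plan one micro-batch starting at index `start`.

    Assumed rule: files are never split, and a file is admitted only if
    the running total stays <= max_rows. Returns the new offset and the
    files planned for this batch.
    """
    planned: List[DataFile] = []
    total = 0
    pos = start
    while pos < len(files) and total + files[pos].rows <= max_rows:
        planned.append(files[pos])
        total += files[pos].rows
        pos += 1
    return pos, planned


def run_stream(
    files: List[DataFile], max_rows: int, max_batches: int = 5
) -> Tuple[int, List[List[str]]]:
    """Run a fixed number of micro-batches and record what each one read."""
    offset = 0
    history: List[List[str]] = []
    for _ in range(max_batches):
        offset, planned = plan_micro_batch(files, offset, max_rows)
        history.append([f.snapshot for f in planned])
    return offset, history


# The three snapshots from the report, one file each.
files = [
    DataFile("Snapshot1", 2),
    DataFile("Snapshot2", 3),
    DataFile("Snapshot3", 3),
]

final_offset, history = run_stream(files, max_rows=2)
print(final_offset)  # offset stays at Snapshot2's file
print(history)       # only the first batch contains data
```

Under these assumptions, the first micro-batch reads Snapshot1 and every later batch is empty: the 3-row file from Snapshot2 never fits under a limit of 2, so the offset cannot move past it and Snapshot3 is never reached.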
