wypoon commented on code in PR #12217:
URL: https://github.com/apache/iceberg/pull/12217#discussion_r1962371221

##########
docs/docs/spark-configuration.md:
##########
@@ -165,6 +165,8 @@ spark.read
 | vectorization-enabled | As per table property | Overrides this table's read.parquet.vectorization.enabled |
 | batch-size | As per table property | Overrides this table's read.parquet.vectorization.batch-size |
 | stream-from-timestamp | (none) | A timestamp in milliseconds to stream from; if before the oldest known ancestor snapshot, the oldest will be used |
+| streaming-max-files-per-micro-batch | INT_MAX | Maximum number of files per microbatch |
+| streaming-max-rows-per-micro-batch | INT_MAX | Maximum number of rows per microbatch. This number should be greater than the number of records in any data file in the table. The smallest unit that will be streamed is a single file, so if a data file contains more records than this limit, the stream will get stuck at this file.|

Review Comment:
Yes, it makes sense to add the caveat as a Note or Warning.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org