huaxingao commented on code in PR #10943:
URL: https://github.com/apache/iceberg/pull/10943#discussion_r1720099440

##########
parquet/src/main/java/org/apache/iceberg/parquet/VectorizedParquetReader.java:
##########
@@ -141,8 +148,15 @@ public T next() {
       advance();
     }

+    long remainingValues = nextRowGroupStart - valuesRead;
+    int remainingLimit = (int) (pushedLimit - valuesRead);

Review Comment:
   Typically, limit will not be a large number, but I changed `remainingLimit` to a long to make it consistent with `remainingValues`.


##########
parquet/src/main/java/org/apache/iceberg/parquet/VectorizedParquetReader.java:
##########
@@ -141,8 +148,15 @@ public T next() {
       advance();
     }

+    long remainingValues = nextRowGroupStart - valuesRead;
+    int remainingLimit = (int) (pushedLimit - valuesRead);
     // batchSize is an integer, so casting to integer is safe
-    int numValuesToRead = (int) Math.min(nextRowGroupStart - valuesRead, batchSize);
+    int numValuesToRead =
+        (int)
+            Math.min(
+                remainingValues,
+                (remainingLimit > 0 ? Math.min(batchSize, remainingLimit) : batchSize));

Review Comment:
   I do agree using if/else is clearer, so I have changed the code as you suggested.


--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: issues-h...@iceberg.apache.org
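
For reference, the if/else form mentioned in the second review comment could look roughly like the sketch below. This is not the code that was committed to the PR: the class `BatchSizeSketch` and the static helper `numValuesToRead` are hypothetical names used only for illustration, and the sketch assumes, as the ternary in the diff does, that a non-positive `remainingLimit` means the limit should not constrain the batch. `remainingLimit` is kept as a `long`, in line with the first review comment.

```java
public class BatchSizeSketch {

  // Hypothetical helper: the parameters mirror the fields used in the diff
  // (nextRowGroupStart, valuesRead, pushedLimit, batchSize).
  static int numValuesToRead(
      long nextRowGroupStart, long valuesRead, long pushedLimit, int batchSize) {
    long remainingValues = nextRowGroupStart - valuesRead;
    long remainingLimit = pushedLimit - valuesRead;

    long cap;
    if (remainingLimit > 0) {
      // A limit applies and has not been reached: never read past it.
      cap = Math.min(batchSize, remainingLimit);
    } else {
      // Assumption: no usable limit, so fall back to the plain batch size.
      cap = batchSize;
    }

    // cap is at most batchSize (an int), so the cast to int is safe.
    return (int) Math.min(remainingValues, cap);
  }

  public static void main(String[] args) {
    // Limit of 10 with a batch size of 100 and 1000 rows left in the row group.
    System.out.println(numValuesToRead(1000L, 0L, 10L, 100));
  }
}
```

Using a `long` for `remainingLimit` avoids the narrowing cast in the original hunk, and the single cast at the end stays safe because the result is bounded by `batchSize`.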