koenvo commented on code in PR #1995:
URL: https://github.com/apache/iceberg-python/pull/1995#discussion_r2122696555


##########
pyiceberg/io/pyarrow.py:
##########

@@ -1643,8 +1646,20 @@ def to_record_batches(self, tasks: Iterable[FileScanTask]) -> Iterator[pa.RecordBatch]:
             ResolveError: When a required field cannot be found in the file
             ValueError: When a field type in the file cannot be projected to the schema type
         """
+        from concurrent.futures import ThreadPoolExecutor
+
         deletes_per_file = _read_all_delete_files(self._io, tasks)
-        return self._record_batches_from_scan_tasks_and_deletes(tasks, deletes_per_file)
+
+        if concurrent_tasks is not None:
+            with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool:

Review Comment:
   Ah, I already had this changed but forgot to push. I only need to make sure I get a pool with the correct `max_workers` set; I can't just use the regular `get_or_create`, as that might have been created with an incorrect number of workers.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
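The reviewer's concern about `get_or_create` can be sketched with a minimal example. The helper below is a hypothetical stand-in for a shared-executor cache (not pyiceberg's actual implementation): once the singleton pool exists, later `max_workers` arguments are silently ignored, which is why a call that needs a specific worker count must build its own `ThreadPoolExecutor`, as the diff does.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a cached shared-executor helper: the pool is
# created once, so the max_workers of the first caller wins.
_shared_pool = None

def get_or_create(max_workers=None):
    global _shared_pool
    if _shared_pool is None:
        _shared_pool = ThreadPoolExecutor(max_workers=max_workers)
    return _shared_pool  # later max_workers values are silently ignored

def run_with_dedicated_pool(items, concurrent_tasks):
    # A dedicated pool honors the requested worker count on every call,
    # at the cost of creating and tearing down threads per invocation.
    with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool:
        return list(pool.map(lambda x: x * 2, items))
```

The trade-off: the cached pool avoids repeated thread start-up, while the dedicated pool guarantees the caller-specified concurrency, which is what this PR needs for `concurrent_tasks`.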
########## pyiceberg/io/pyarrow.py: ########## @@ -1643,8 +1646,20 @@ def to_record_batches(self, tasks: Iterable[FileScanTask]) -> Iterator[pa.Record ResolveError: When a required field cannot be found in the file ValueError: When a field type in the file cannot be projected to the schema type """ + from concurrent.futures import ThreadPoolExecutor + deletes_per_file = _read_all_delete_files(self._io, tasks) - return self._record_batches_from_scan_tasks_and_deletes(tasks, deletes_per_file) + + if concurrent_tasks is not None: + with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool: Review Comment: Ah, already had this changed but forgot to push. Only need to make sure I get a pool with the correct max_workers set. Can't just use the regular `get_or_create` as that might have an incorrect number of workers. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org