DieHertz commented on issue #1229: URL: https://github.com/apache/iceberg-python/issues/1229#issuecomment-2429473805
I extracted the code that returns the `list[dict]` of entries for each `Manifest` and ran it inside the `ThreadPoolExecutor` provided by `pyiceberg.utils.concurrent.ExecutorFactory`: https://github.com/DieHertz/iceberg-python/commit/34c28457191ca9225417828e4bdafee22d1e088b

Regardless of the `max_workers` value (1, 4, or unlimited), processing takes the same amount of time.

**py-spy** struggles to sample multiple threads at any usable rate on my machine, so I'm not providing a flame graph for now :-(
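For anyone who wants to reproduce this kind of measurement in isolation, here is a minimal sketch using only the standard library. It does not use pyiceberg: `read_entries` is a hypothetical stand-in for the per-manifest work, `timed_run` is an illustrative helper, and mapping "unlimited" to `max_workers=None` (Python's default pool size) is my assumption. If the real per-manifest work is CPU-bound pure Python, which the comment above does not confirm, flat timings across worker counts would be expected on a standard CPython build because of the GIL.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Optional


def read_entries(manifest_id: int) -> list[dict]:
    """Hypothetical stand-in for the per-manifest work (not pyiceberg code)."""
    # Pure-Python, CPU-bound loop that builds one dict per "entry".
    return [{"manifest": manifest_id, "entry": i} for i in range(1_000)]


def timed_run(manifests: list[int], max_workers: Optional[int]) -> tuple[float, list[list[dict]]]:
    """Run read_entries over all manifests in a thread pool and time it."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves input order in its results.
        results = list(executor.map(read_entries, manifests))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    manifests = list(range(200))
    # None lets ThreadPoolExecutor pick its default worker count.
    for workers in (1, 4, None):
        elapsed, results = timed_run(manifests, workers)
        print(f"max_workers={workers}: {elapsed:.3f}s for {len(results)} manifests")
```

Swapping `read_entries` for the extracted function from the commit above should show whether the real workload behaves the same way as this stand-in.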