potatochipcoconut opened a new issue, #1984: URL: https://github.com/apache/iceberg-python/issues/1984
### Question

Hello pyicebergers,

I am new to Iceberg/S3 Tables and am experimenting with using it as part of an IDP pipeline where we would store OCR data in S3 Tables. The pipeline needs to support high throughput and concurrency. I have a PoC pipeline set up that works, but writes are too slow with the basic implementation.

Does anyone know what could be done to improve write performance? I've been reading through various articles, issues, and PRs, but I'm not sure which approach to try. Thank you.

https://github.com/apache/iceberg-python/issues/1751 (could this be useful?)

Additionally, I read that setting `PYICEBERG_MAX_WORKERS` could help with concurrency, but I could not find any reference to it in the pyiceberg code. How does that setting get consumed/used? https://github.com/apache/iceberg-python/pull/444
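On the `PYICEBERG_MAX_WORKERS` part of the question: one plausible reason a literal search finds nothing is that the variable name is never written out in the code and is instead derived from a generic prefix plus a config key such as `max-workers`. The sketch below only illustrates that kind of mapping using the standard library. The helper names (`env_to_config_key`, `read_max_workers`, `make_executor`) are hypothetical and are not pyiceberg's API, and the exact translation rules are an assumption that should be checked against the pyiceberg configuration docs.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Mapping, Optional

# Prefix assumed from the variable name mentioned in the question.
PREFIX = "PYICEBERG_"


def env_to_config_key(name: str) -> Optional[str]:
    """Hypothetical helper: map 'PYICEBERG_MAX_WORKERS' to 'max-workers'."""
    if not name.startswith(PREFIX):
        return None
    # Drop the prefix, lowercase, and turn underscores into dashes.
    return name[len(PREFIX):].lower().replace("_", "-")


def read_max_workers(environ: Mapping[str, str]) -> Optional[int]:
    """Return the configured worker count, or None if it is not set."""
    for name, value in environ.items():
        if env_to_config_key(name) == "max-workers":
            return int(value)
    return None


def make_executor(environ: Mapping[str, str] = os.environ) -> ThreadPoolExecutor:
    """Build a thread pool sized from the environment.

    max_workers=None lets ThreadPoolExecutor choose its own default.
    """
    return ThreadPoolExecutor(max_workers=read_max_workers(environ))
```

With a scheme like this, searching the code base for `max-workers` rather than `PYICEBERG_MAX_WORKERS` would be the way to find where the value is consumed.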