findepi commented on PR #10691: URL: https://github.com/apache/iceberg/pull/10691#issuecomment-2225908686
@stevenzwu thanks for your comments!

> Curious if you have done any performance testing. echo to another comment. wondering if the default queue size of 10K would affect the throughput for very large tables with regular manifest file sizes (like a few to dozens of MBs) in happy path.

No, I haven't. (For large manifests, fixing OOM failures seemed more important than anything else.)

Also, per https://github.com/apache/iceberg/pull/10691#issuecomment-2225641596, the perf testing will very much depend on how the parallel iterator is consumed.

If we're concerned about impact (and there is potential impact, since we resume tasks only after the queue is emptied by the consumer, which is pre-existing logic), we can change this logic. We can resume tasks even before the queue is fully empty. Would that help alleviate concerns?
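To make the "resume before the queue is fully empty" idea concrete, here is a minimal single-threaded sketch. It is not the `ParallelIterable` code from this PR: the class `ResumeThresholdSketch`, the `resumeThreshold` parameter, and the `emptyStalls` counter are all hypothetical names introduced for illustration. Producers are modelled as a plain source iterator that is allowed to refill the queue whenever its size is at or below `resumeThreshold`. A threshold of 0 mimics the current behaviour (resume only once the queue is empty), and `emptyStalls` is only a rough proxy for the moments when a real consumer would have to wait for producers.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Queue;
import java.util.stream.IntStream;

/** Hypothetical sketch of a low-watermark resume policy; not Iceberg code. */
final class ResumeThresholdSketch implements Iterator<Integer> {
  private final Queue<Integer> queue = new ArrayDeque<>();
  private final Iterator<Integer> source;   // stands in for the yielded producer tasks
  private final int maxQueueSize;           // producers stop once the queue holds this many items
  private final int resumeThreshold;        // producers resume when size <= this value (assumed knob)
  private int emptyStalls = 0;              // times the consumer found the queue empty with data pending

  ResumeThresholdSketch(Iterator<Integer> source, int maxQueueSize, int resumeThreshold) {
    if (maxQueueSize < 1 || resumeThreshold < 0 || resumeThreshold >= maxQueueSize) {
      throw new IllegalArgumentException("need 0 <= resumeThreshold < maxQueueSize");
    }
    this.source = source;
    this.maxQueueSize = maxQueueSize;
    this.resumeThreshold = resumeThreshold;
  }

  @Override
  public boolean hasNext() {
    // The consumer arrived at an empty queue although more data exists.
    if (queue.isEmpty() && source.hasNext()) {
      emptyStalls++;
    }
    // Resume "producers" once the queue has drained to the threshold.
    if (queue.size() <= resumeThreshold) {
      while (queue.size() < maxQueueSize && source.hasNext()) {
        queue.add(source.next());
      }
    }
    return !queue.isEmpty();
  }

  @Override
  public Integer next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    return queue.remove();
  }

  int emptyStalls() {
    return emptyStalls;
  }

  /** Drains {@code items} elements and returns {consumedCount, emptyStalls}. */
  static int[] simulate(int items, int maxQueueSize, int resumeThreshold) {
    ResumeThresholdSketch it =
        new ResumeThresholdSketch(IntStream.range(0, items).iterator(), maxQueueSize, resumeThreshold);
    int consumed = 0;
    while (it.hasNext()) {
      it.next();
      consumed++;
    }
    return new int[] {consumed, it.emptyStalls()};
  }
}
```

With 10 items and a queue limit of 4, `simulate(10, 4, 0)` reports 3 empty-queue stalls, while `simulate(10, 4, 2)` reports only the initial one. Whether this translates into a measurable throughput difference in the real, multi-threaded iterator would still need to be benchmarked.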