RussellSpitzer commented on PR #8001: URL: https://github.com/apache/iceberg/pull/8001#issuecomment-1653695333
Yes, this is why we made it concurrent. Previously, very large rewrites would hit timeouts and perform poorly: even though all the groups had completed, we had to wait for the commits to go through one at a time.

On Jul 27, 2023, at 7:03 AM, Xianyang Liu wrote:

> > If you are using inherit snapshot id the retry just rewrites the metadata.json.
>
> I see that snapshot ID inheritance is only used when adding manifest files:
> https://github.com/apache/iceberg/blob/5cad65ad1fbbf214675de6a19f068e5672cb25a7/core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java#L284
>
> > So if writing manifests is slow and writing metadata is fast, process two takes much much longer
>
> I see your point: performance could degrade if writing the new manifests is slow. Previously, I implemented this by adding the failed file groups to a failed queue and retrying them with the committerService at the end.
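For readers following the thread, the "failed queue, retry at the end" idea mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR or from Iceberg: `CommitConflict`, `commit_with_deferred_retry`, and `max_retries` are hypothetical names, and `commit` stands in for whatever the committer service actually does.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple


class CommitConflict(Exception):
    """Hypothetical error: a commit lost a race with another writer."""


def commit_with_deferred_retry(
    groups: List[str],
    commit: Callable[[str], None],
    max_retries: int = 3,
) -> Tuple[List[str], List[str]]:
    """Commit each group once, then retry the failures at the end.

    Returns (committed, abandoned). `commit` is a caller-supplied
    stand-in for the real commit operation (an assumption here).
    """
    committed: List[str] = []
    abandoned: List[str] = []
    # Each queue entry is (group, retries already used).
    failed: Deque[Tuple[str, int]] = deque()

    # First pass: do not block on a failure, just park the group.
    for group in groups:
        try:
            commit(group)
            committed.append(group)
        except CommitConflict:
            failed.append((group, 0))

    # Second phase: drain the failed queue, bounded by max_retries.
    while failed:
        group, retries_used = failed.popleft()
        try:
            commit(group)
            committed.append(group)
        except CommitConflict:
            if retries_used + 1 < max_retries:
                failed.append((group, retries_used + 1))
            else:
                abandoned.append(group)

    return committed, abandoned
```

The trade-off discussed in the thread still applies to a sketch like this: if each retry has to rewrite manifests rather than only the metadata.json, the deferred retries can dominate the total runtime.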
