amogh-jahagirdar commented on issue #8953: URL: https://github.com/apache/iceberg/issues/8953#issuecomment-1846628756
OK, I actually looked at the history of these changes now. https://github.com/apache/iceberg/pull/5214 was never merged, but it was followed by https://github.com/apache/iceberg/pull/6569/files, which actually applied the change and would have been released in 1.2.0. The goal of including the query ID looks to be identifying which Spark job actually performed the write. Previously there would have been a new UUID per write, which would have kept files from stepping on each other. Let me try to get a reproducible example (we would want one anyway to verify that whatever fix we make actually works). Ideally we can get the best of both worlds: I think some combination of the query ID + the hostname + the thread ID would be truly unique and enable better debugging (at the cost of a really long filename :) ).
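
For illustration only, here is a minimal Java sketch of the query ID + hostname + thread ID combination suggested above. The class `WriteFileNames` and its methods are hypothetical names, not part of Iceberg or Spark, and the query ID is assumed to be passed in as a plain string by the caller. Whether this combination is unique in every deployment is not verified here; thread IDs are only unique within one JVM, so several executors on the same host could still produce the same value.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; this is NOT an Iceberg or Spark API.
final class WriteFileNames {
  private WriteFileNames() {}

  // Joins the three parts with '-' so each one stays readable when debugging.
  static String operationId(String queryId, String hostname, long threadId) {
    return queryId + "-" + sanitize(hostname) + "-" + threadId;
  }

  // Convenience overload that reads the hostname and thread ID from the current JVM.
  static String operationId(String queryId) {
    // Thread.getId() is deprecated on newer JDKs in favor of threadId(), but still works.
    return operationId(queryId, localHostname(), Thread.currentThread().getId());
  }

  // Falls back to a fixed placeholder when the local hostname cannot be resolved.
  static String localHostname() {
    try {
      return InetAddress.getLocalHost().getHostName();
    } catch (UnknownHostException e) {
      return "unknown-host";
    }
  }

  // Replaces every character outside [A-Za-z0-9_.] (including '-') with '_'.
  static String sanitize(String value) {
    return value.replaceAll("[^A-Za-z0-9_.]", "_");
  }
}
```

Calling `WriteFileNames.operationId("q1", "host-a", 7L)` yields `q1-host_a-7`. The hyphen in the hostname is replaced so that it cannot be confused with the separator between the hostname and the thread ID.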