lirui-apache opened a new issue, #11866: URL: https://github.com/apache/iceberg/issues/11866
### Apache Iceberg version

1.4.3

### Query engine

Spark

### Please describe the bug 🐞

We are using `NoLock` for committing, and we recently hit an issue where `HiveTableOperations` treated a successful commit as a concurrent modification and cleaned up the metadata JSON file that had already been committed, leaving the table in an unusable state.

We configured HMS HA with 3 instances. Examining the logs of these HMS instances, we found that they were under high load at the time of the issue, and that alter table requests from the committing job had reached 2 of the instances. So I believe the issue happened like this:

1. The 1st alter table request succeeded, but the HMS instance failed to deliver a successful response.
2. We failed over to another instance, and since the metadata location had already been changed, that HMS instance returned an exception containing a message like `"The table has been modified. ..."`
3. `HiveTableOperations` checked the exception message, determined this should be a `CommitFailedException`, and deleted the metadata JSON file it had created.

There might be two ways to fix the issue:

1. We don't configure HMS HA, or don't use `RetryingMetaStoreClient`, for committing, so that concurrent modification exceptions from HMS are more reliable. But then we may need to retry on Thrift exceptions ourselves.
2. We do a `checkCommitStatus` for the concurrent modification exceptions, to make sure the commit really failed. This is simpler, but I believe it brings in an extra refresh from HMS.

### Willingness to contribute

- [X] I can contribute a fix for this bug independently
- [X] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time
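
The failure sequence and the second proposed fix can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not Iceberg or Hive code: `FakeMetastore`, `TransportError`, `ConcurrentModificationError`, `alterWithRetry`, and `commit` are hypothetical names invented for illustration, the HA instances are modeled as a single shared state (assuming they share one backing database), and the status check is reduced to comparing the current metadata location with the one being committed.

```java
import java.util.HashSet;
import java.util.Set;

public class CommitStatusSketch {

    /** Hypothetical stand-in for a Thrift/transport failure after the server did its work. */
    static class TransportError extends RuntimeException {
        TransportError(String msg) { super(msg); }
    }

    /** Hypothetical stand-in for the metastore's "table has been modified" error. */
    static class ConcurrentModificationError extends RuntimeException {
        ConcurrentModificationError(String msg) { super(msg); }
    }

    /** Toy metastore: one table, one metadata location, shared by all HA instances. */
    static class FakeMetastore {
        String currentLocation;
        boolean loseNextResponse; // simulate an overloaded instance dropping its reply

        FakeMetastore(String initialLocation) { this.currentLocation = initialLocation; }

        void alterTable(String expectedLocation, String newLocation) {
            if (!currentLocation.equals(expectedLocation)) {
                throw new ConcurrentModificationError("The table has been modified. ...");
            }
            currentLocation = newLocation; // the change is applied ...
            if (loseNextResponse) {
                loseNextResponse = false;
                throw new TransportError("response lost"); // ... but the caller never hears about it
            }
        }
    }

    /** Models a retrying client: on a transport failure, resend the same request once. */
    static void alterWithRetry(FakeMetastore hms, String expected, String updated) {
        try {
            hms.alterTable(expected, updated);
        } catch (TransportError e) {
            hms.alterTable(expected, updated); // failover retry against the shared state
        }
    }

    /** Commits newLocation on top of baseLocation; returns true if the commit took effect. */
    static boolean commit(FakeMetastore hms, Set<String> metadataFiles,
                          String baseLocation, String newLocation) {
        metadataFiles.add(newLocation); // the new metadata JSON is written before the alter call
        try {
            alterWithRetry(hms, baseLocation, newLocation);
            return true;
        } catch (ConcurrentModificationError e) {
            // Extra refresh: only clean up if the table does not point at our file.
            if (newLocation.equals(hms.currentLocation)) {
                return true; // our own earlier request succeeded; keep the file
            }
            metadataFiles.remove(newLocation); // genuine conflict: safe to clean up
            return false;
        }
    }

    public static void main(String[] args) {
        // Scenario from the report: first request succeeds, its response is lost.
        Set<String> files = new HashSet<>();
        files.add("v1.json");
        FakeMetastore hms = new FakeMetastore("v1.json");
        hms.loseNextResponse = true;
        System.out.println(commit(hms, files, "v1.json", "v2.json")); // prints "true"
        System.out.println(files.contains("v2.json"));                // prints "true"

        // Genuine conflict: another writer already moved the table forward.
        FakeMetastore other = new FakeMetastore("v2-other.json");
        System.out.println(commit(other, files, "v1.json", "v3.json")); // prints "false"
        System.out.println(files.contains("v3.json"));                  // prints "false"
    }
}
```

Without the location comparison in the `catch` block, the first scenario would delete `v2.json` while the table still points at it, which matches the unusable state described above. The cost is the extra read of the current location on every concurrent modification error.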