szehon-ho commented on code in PR #9852:
URL: https://github.com/apache/iceberg/pull/9852#discussion_r1538017237
##########
core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java:
##########
@@ -309,65 +304,20 @@ protected enum CommitStatus {
* @return Commit Status of Success, Failure or Unknown
*/
protected CommitStatus checkCommitStatus(String newMetadataLocation,
TableMetadata config) {
- int maxAttempts =
- PropertyUtil.propertyAsInt(
- config.properties(), COMMIT_NUM_STATUS_CHECKS,
COMMIT_NUM_STATUS_CHECKS_DEFAULT);
- long minWaitMs =
- PropertyUtil.propertyAsLong(
- config.properties(),
- COMMIT_STATUS_CHECKS_MIN_WAIT_MS,
- COMMIT_STATUS_CHECKS_MIN_WAIT_MS_DEFAULT);
- long maxWaitMs =
- PropertyUtil.propertyAsLong(
- config.properties(),
- COMMIT_STATUS_CHECKS_MAX_WAIT_MS,
- COMMIT_STATUS_CHECKS_MAX_WAIT_MS_DEFAULT);
- long totalRetryMs =
- PropertyUtil.propertyAsLong(
- config.properties(),
- COMMIT_STATUS_CHECKS_TOTAL_WAIT_MS,
- COMMIT_STATUS_CHECKS_TOTAL_WAIT_MS_DEFAULT);
-
- AtomicReference<CommitStatus> status = new
AtomicReference<>(CommitStatus.UNKNOWN);
-
- Tasks.foreach(newMetadataLocation)
- .retry(maxAttempts)
- .suppressFailureWhenFinished()
- .exponentialBackoff(minWaitMs, maxWaitMs, totalRetryMs, 2.0)
- .onFailure(
- (location, checkException) ->
- LOG.error("Cannot check if commit to {} exists.", tableName(),
checkException))
- .run(
- location -> {
- TableMetadata metadata = refresh();
- String currentMetadataFileLocation =
metadata.metadataFileLocation();
- boolean commitSuccess =
- currentMetadataFileLocation.equals(newMetadataLocation)
- || metadata.previousFiles().stream()
- .anyMatch(log ->
log.file().equals(newMetadataLocation));
- if (commitSuccess) {
- LOG.info(
- "Commit status check: Commit to {} of {} succeeded",
- tableName(),
- newMetadataLocation);
- status.set(CommitStatus.SUCCESS);
- } else {
- LOG.warn(
- "Commit status check: Commit to {} of {} unknown, new
metadata location is not current "
- + "or in history",
- tableName(),
- newMetadataLocation);
- }
- });
-
- if (status.get() == CommitStatus.UNKNOWN) {
- LOG.error(
- "Cannot determine commit state to {}. Failed during checking {}
times. "
- + "Treating commit state as unknown.",
- tableName(),
- maxAttempts);
- }
- return status.get();
+ return checkCommitStatus(
+ tableName(), newMetadataLocation, config.properties(),
this::loadMetadataLocations);
+ }
+
+ protected List<String> loadMetadataLocations() {
+ TableMetadata metadata = refresh();
+ ImmutableList.Builder<String> builder = ImmutableList.builder();
+ return builder
+ .add(metadata.metadataFileLocation())
+ .addAll(
+ metadata.previousFiles().stream()
Review Comment:
HI, I checked with @RussellSpitzer who authored this originally for Hive.
It's a real issue for Hive tables, because frequent commits lead to situations
where we lose contact with HMS or HMS goes down but the commit goes through.
Then it takes awhile to reconnect, and in the meantime there's other commits.
It seems it was frequent for Hive tables, especially where we commit quite
often (streaming for example).
As this mechanism is not in ViewMetadata, let's forget it then, given that
we dont anticipate too many updates of View definition, especially concurrent
updates, as Tables had.
let's revert your latest change then to check using version. It is not
correct, version is not supposed to be incremented for every update for view in
my understanding, just when the view representation changes?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]