guykhazma commented on code in PR #12228:
URL: https://github.com/apache/iceberg/pull/12228#discussion_r2074165557
##########
core/src/main/java/org/apache/iceberg/BaseMetastoreCatalog.java:
##########
@@ -71,23 +70,35 @@ public Table loadTable(TableIdentifier identifier) {
   }

   @Override
-  public Table registerTable(TableIdentifier identifier, String metadataFileLocation) {
+  public Table registerTable(
+      TableIdentifier identifier, String metadataFileLocation, boolean overwrite) {
     Preconditions.checkArgument(
         identifier != null && isValidIdentifier(identifier), "Invalid identifier: %s", identifier);
     Preconditions.checkArgument(
         metadataFileLocation != null && !metadataFileLocation.isEmpty(),
         "Cannot register an empty metadata file location as a table");

-    // Throw an exception if this table already exists in the catalog.
-    if (tableExists(identifier)) {
+    // If the table already exists and overwriting is disabled, throw an exception.
+    if (tableExists(identifier) && !overwrite) {
       throw new AlreadyExistsException("Table already exists: %s", identifier);
     }

     TableOperations ops = newTableOps(identifier);
-    InputFile metadataFile = ops.io().newInputFile(metadataFileLocation);
-    TableMetadata metadata = TableMetadataParser.read(ops.io(), metadataFile);
-    ops.commit(null, metadata);
-
+    TableMetadata newMetadata =
+        TableMetadataParser.read(ops.io(), ops.io().newInputFile(metadataFileLocation));
+
+    TableMetadata existing = ops.current();
+    if (existing != null && overwrite) {
+      if (existing.metadataFileLocation().equals(metadataFileLocation)) {
+        LOG.info(
+            "The requested metadata matches the existing metadata. No changes will be committed.");
+        return new BaseTable(ops, fullTableName(name(), identifier), metricsReporter());
+      }
+      dropTable(identifier, false /* Keep all data and metadata files */);

Review Comment:
   @dramaticlly I see your point. It seems to me the core question here is whether the responsibility for maintaining the lineage of a table identifier lies with the catalog or the table itself.
   From my perspective, it makes more sense for the catalog to handle this, especially since the overwrite operation doesn't alter the physical state of the table. Ideally, this reference change should be atomic, but the implementation details can be left to individual catalogs.
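
To make the "atomic reference change" idea concrete, here is a rough sketch of what a catalog-side pointer swap could look like. This is not Iceberg code: `PointerCatalogSketch`, its `register` and `current` methods, and the string-keyed map are all hypothetical, and an in-memory `ConcurrentMap` merely stands in for whatever atomic primitive a real catalog backend offers. The point is only that the identifier always resolves to either the old or the new metadata location, with no window in which the table is missing (unlike a drop followed by a commit).

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory catalog used only for illustration; NOT Iceberg's Catalog API.
final class PointerCatalogSketch {
  // Maps a table identifier to the metadata file location it currently points to.
  private final ConcurrentMap<String, String> pointers = new ConcurrentHashMap<>();

  /**
   * Registers a metadata location under an identifier.
   * Returns true if the pointer changed, false if it already pointed at that location.
   */
  boolean register(String identifier, String metadataLocation, boolean overwrite) {
    if (!overwrite) {
      // putIfAbsent is atomic: exactly one concurrent caller can create the entry.
      String previous = pointers.putIfAbsent(identifier, metadataLocation);
      if (previous != null) {
        throw new IllegalStateException("Table already exists: " + identifier);
      }
      return true;
    }
    // put swaps the reference in a single step; readers never observe a missing entry.
    String previous = pointers.put(identifier, metadataLocation);
    return !metadataLocation.equals(previous);
  }

  /** Returns the current metadata location, or null if the identifier is unknown. */
  String current(String identifier) {
    return pointers.get(identifier);
  }
}
```

With a design like this, the table's data and metadata files are never touched during an overwrite; only the catalog's reference moves, which is why it seems reasonable to leave the lineage bookkeeping to the catalog.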