jerqi commented on code in PR #14457:
URL: https://github.com/apache/iceberg/pull/14457#discussion_r2488546498
##########
spark/v4.0/spark/src/main/java/org/apache/iceberg/spark/SparkCatalog.java:
##########
@@ -234,6 +234,15 @@ public Table loadTable(Identifier ident, long timestamp)
throws NoSuchTableExcep
}
}
+ @Override
+ public boolean tableExists(Identifier ident) {
+ if (isPathIdentifier(ident)) {
+ return tables.exists(((PathIdentifier) ident).location());
Review Comment:
It depends on the existence semantics. Do we need to consider the snapshot,
change log tables?
There are two solution ways:
1. For snapshot, change log tables, we return false directly. Current
implementation follows this way.
2. For snapshot, change log tables, we parsed them and verify the existence.
I have a previous commit
```
public boolean tableExists(Identifier ident) {
try {
if (isPathIdentifier(ident)) {
loadFromPathIdentifier((PathIdentifier) ident);
return true;
} else {
boolean isExists =
icebergCatalog.tableExists(buildIdentifier(ident));
if (isExists) {
return true;
}
if (ident.namespace().length == 0) {
return false;
}
// if the original load didn't work, try using the namespace as an
identifier because
// the original identifier may include a snapshot selector or may
point to the changelog
TableIdentifier namespaceAsIdent =
buildIdentifier(namespaceToIdentifier(ident.namespace()));
Matcher tag = TAG.matcher(ident.name());
if (tag.matches()) {
org.apache.iceberg.Table table =
icebergCatalog.loadTable(namespaceAsIdent);
Snapshot tagSnapshot = table.snapshot(tag.group(1));
return tagSnapshot != null;
}
if (icebergCatalog.tableExists(namespaceAsIdent)) {
if (ident.name().equalsIgnoreCase(SparkChangelogTable.TABLE_NAME))
{
return true;
}
Matcher at = AT_TIMESTAMP.matcher(ident.name());
if (at.matches()) {
return true;
}
Matcher id = SNAPSHOT_ID.matcher(ident.name());
if (id.matches()) {
return true;
}
Matcher branch = BRANCH.matcher(ident.name());
if (branch.matches()) {
return true;
}
if (ident.name().equalsIgnoreCase(REWRITE)) {
return true;
}
}
return false;
}
} catch (org.apache.iceberg.exceptions.NoSuchTableException e) {
return false;
}
}
```
The way 1 is more clear.
They way 2 follows the origin code semantics.
Do u have any suggestion?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]