gaborkaszab commented on code in PR #14868:
URL: https://github.com/apache/iceberg/pull/14868#discussion_r2631033431
##########
core/src/test/java/org/apache/iceberg/view/ViewCatalogTests.java:
##########
@@ -1968,4 +1970,87 @@ public void dropNonEmptyNamespace() {
.as("Namespace should not exist")
.isFalse();
}
+
+ @Test
+ public void registerView() {
+ C catalog = catalog();
+
+ // Register view is not yet supported for REST catalog
+ assumeThat(catalog instanceof RESTCatalog).isFalse();
+
+ TableIdentifier identifier = TableIdentifier.of("ns", "view");
+
+ if (requiresNamespaceCreate()) {
+ catalog.createNamespace(identifier.namespace());
+ }
+
+ View originalView =
+ catalog()
+ .buildView(identifier)
+ .withSchema(SCHEMA)
+ .withDefaultNamespace(identifier.namespace())
+ .withDefaultCatalog(catalog().name())
+ .withQuery("spark", "select * from ns.tbl")
+ .withProperty(GC_ENABLED, "false")
+ .create();
+
+ ViewOperations ops = ((BaseView) originalView).operations();
+ String metadataLocation = ops.current().metadataFileLocation();
+
+ assertThat(catalog.dropView(identifier)).isTrue();
+ assertThat(catalog.viewExists(identifier)).as("View must not
exist").isFalse();
+
+ // view metadata should still exist after dropping the view as gc is
disabled
+ assertThat(((BaseViewOperations)
ops).io().newInputFile(metadataLocation).exists()).isTrue();
+
+ View registeredView = catalog.registerView(identifier, metadataLocation);
+
+ assertThat(registeredView).isNotNull();
+ assertThat(catalog.viewExists(identifier)).as("View must exist").isTrue();
+ assertThat(registeredView.schema().asStruct())
+ .as("Schema must match")
+ .isEqualTo(originalView.schema().asStruct());
+ assertThat(registeredView.currentVersion())
+ .as("Current version must match")
+ .isEqualTo(originalView.currentVersion());
+ assertThat(registeredView.versions())
Review Comment:
In general the purpose of default catalog seems more general that Spark
(even if potentially Spark uses it only for now). If the engine uses
`catalog.namespace.table` naming and the view contains `SELECT * FROM tbl;`
then the engine should know which catalog and namespace to use. Such an engine
will know that `defaultCatalog.defaultNamespace.tbl` is the one used in the
query.
However, when we move the table to a different catalog, won't it be an issue
that the view metadata contains the name of the source catalog and not the new
one?
Let's say you configure your engine like this:
1) One catalog has the name`catalog1` and it has namespace `ns` and a table
`ns.tbl`
2) Other catalog has the name `catalog2` and it also has a namespace `ns`
and a table `ns.tbl`
3) The content of `catalog1.ns.tbl` and `catalog2.ns.tbl` is NOT the same
Now let's say you create a view in `catalog1` as `SELECT * FROM tbl`. The
default catalog here would be `catalog1`.
When you register this into `catalog2`, the default catalog will remain
`catalog1`, so when querying `catalog2.ns.registered_view` it would still query
`catalog1.ns.tbl`. However, I'd think the intention would be different: the
view in `catalog2` should be querying `catalog2.ns.tbl`.
I'm not sure how much this is intended by design, but would be nice to hear
some thoughts before moving on. I think this defaultCatalog design makes sense
for resolving the full table name (when the engine has a catalog layer in
naming), however, could cause unintended side-effects when registering views
across-catalogs.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]