[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #6376: Docs: Add register table Spark procedure documentation

GitBox Wed, 07 Dec 2022 07:17:02 -0800


RussellSpitzer commented on code in PR #6376:
URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042333590



##########
docs/spark-procedures.md:
##########
@@ -493,6 +493,37 @@ CALL spark_catalog.system.add_files(
 )
 ```
 
+### `register_table`
+
+Creates a catalog entry for a metadata.json file which already exists but does 
not have a corresponding catalog identifier.
+
+#### Usage
+
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|-------------|
+| `table`       | ✔️  | string | Table which is to be registered |
+| `metadata_file`| ✔️  | string | Metadata file which is to be registered as a 
new catalog identifier |
+
+Warning: If we register tables which exist in another catalog to the current 
catalog, then the tables would exist in both the catalogs. And using same table 
from multiple catalogs is not recommended as it fails to keep the table 
metadata up to date. 

Review Comment:
   I think this has to be much stronger. Having a table registered in two 
catalogs is essentially a split-brain issue. It won't just have problems with 
metadata being kept up to date; it will also potentially lose data.
   
   I would say something along these lines:
   
   "Warning: Having the same metadata.json registered in more than one catalog 
can lead to missing updates, loss of data, and table corruption. Only use this 
procedure when the table is no longer registered in an existing catalog, or you 
are moving a table between catalogs."
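
   To make the split-brain concern concrete, here is a rough toy model. It is 
not Iceberg code: `ToyCatalog`, its methods, the `storage` dict, and the file 
names are all hypothetical, and it only assumes that each catalog keeps its own 
pointer to the current metadata file and swaps that pointer on commit.

```python
# Toy model only, not Iceberg code. Every name here is hypothetical.
class ToyCatalog:
    """Maps a table name to the metadata file this catalog treats as current."""

    def __init__(self, name):
        self.name = name
        self.pointers = {}

    def register_table(self, table, metadata_file):
        # Point this catalog at a metadata file that already exists.
        self.pointers[table] = metadata_file

    def commit(self, table, storage, new_rows):
        # Read the state this catalog believes is current, append rows,
        # write a new metadata file, and swap only this catalog's pointer.
        current = self.pointers[table]
        rows = storage[current] + new_rows
        new_file = f"{self.name}-v{len(storage) + 1}.metadata.json"
        storage[new_file] = rows
        self.pointers[table] = new_file
        return new_file


# Shared storage: metadata file name -> rows visible at that version.
storage = {"v1.metadata.json": ["row-0"]}

cat_a = ToyCatalog("a")
cat_b = ToyCatalog("b")

# The same metadata file is registered in both catalogs.
cat_a.register_table("db.tbl", "v1.metadata.json")
cat_b.register_table("db.tbl", "v1.metadata.json")

# Each catalog commits without knowing about the other.
cat_a.commit("db.tbl", storage, ["row-from-a"])
cat_b.commit("db.tbl", storage, ["row-from-b"])

view_a = storage[cat_a.pointers["db.tbl"]]
view_b = storage[cat_b.pointers["db.tbl"]]
print(view_a)
print(view_b)
```

   In this sketch the two catalogs end up on divergent histories, and neither 
sees the other's write. It models only the missing-updates part of the warning; 
data loss and corruption in a real table are not modeled here.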



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


