RussellSpitzer commented on code in PR #6376:
URL: https://github.com/apache/iceberg/pull/6376#discussion_r1042333590

##########
docs/spark-procedures.md:
##########
@@ -493,6 +493,37 @@ CALL spark_catalog.system.add_files(
 )
 ```
 
+### `register_table`
+
+Creates a catalog entry for a metadata.json file which already exists but does not have a corresponding catalog identifier.
+
+#### Usage
+
+| Argument Name | Required? | Type | Description |
+|---------------|-----------|------|-------------|
+| `table` | ✔️ | string | Table which is to be registered |
+| `metadata_file`| ✔️ | string | Metadata file which is to be registered as a new catalog identifier |
+
+Warning: If we register tables which exist in another catalog to the current catalog, then the tables would exist in both the catalogs. And using same table from multiple catalogs is not recommended as it fails to keep the table metadata up to date.

Review Comment:
   I think this has to be much stronger. Having a table registered in two catalogs is essentially a split-brain issue. It won't just have problems with metadata being kept up to date; it can also lose data. I would say something along the lines of:

   "Warning: Having the same metadata.json registered in more than one catalog can lead to missing updates, loss of data, and table corruption. Only use this procedure when the table is no longer registered in an existing catalog, or you are moving a table between catalogs."

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.