pdames opened a new pull request, #150:
URL: https://github.com/apache/iceberg-python/pull/150

   **Related Issue**: https://github.com/apache/iceberg-python/issues/123
   
   **Tests Run**: 
   `make test`
   `make test-integration`
   
   **Description**:
   This PR fixes a bug where `Catalog` APIs returning a table (e.g. 
`catalog.load_table()`, `catalog.create_table()`, etc.) return a `Table` whose 
identifier has the catalog name prepended, but providing this identifier to 
catalog APIs to locate this table raises a `NoSuchTable` error due to the 
prepended catalog name.
   
   This implementation simply removes the prepended catalog name from the 
identifier whenever it is composed of at least 3 elements and the catalog name 
matches the current catalog implementation in use. 
   
   It currently only removes the catalog name in cases where a table identifier 
refers to the location of an existing table, such as in `catalog.load_table()`, 
the `from_identifier` of `catalog.rename_table()`, `catalog.drop_table()`, and 
`catalog.purge_table()`. 
   
   One side-effect of this implementation decision is illustrated in the 
following code, which fails to load a table with a 2-part namespace whose first 
element is identical to the catalog name:
   ```python
   catalog = RestCatalog("rest", uri=TEST_URI, token=TEST_TOKEN)
   table_identifier = ("rest", "pdames", "table")
   table = catalog.create_table(identifier=table_identifier, ...)  # 
table.identifier string is "rest.rest.pdames.table"
   loaded_table = catalog.load_table(table.identifier)  # loads the created 
table successfully
   catalog.load_table(table_identifier)  # NoSuchTable error due to removing 
the namespace that matches the catalog name 
   ```
   In this case, we could fallback to again trying to load the table with the 
catalog name included after catching the `NoSuchTable` error, but this may  
result in us loading a different table than intended since we don't know if the 
first identifier element is meant to refer to a namespace or a catalog name. 
   
   I'm curious to hear reviewers opinions about how concerning these types of 
side-effects are, potential fixes, and if this calls for further refactoring to 
add the missing context in the form of identifier metadata, type-differentiated 
identifier elements, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to