[I] Using tables created in Hive Catalog in Hadoop catalog [iceberg]

via GitHub Tue, 28 Jan 2025 13:08:01 -0800


guykhazma opened a new issue, #12125:
URL: https://github.com/apache/iceberg/issues/12125


   ### Feature Request / Improvement
   
   We have a scenario where tables are created with Hive Catalog, and a 
separate application needs to define a Hadoop Catalog that shares the same 
warehouse location so that it can read those tables without having to set up 
a metastore.
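
To make the scenario concrete, here is a minimal sketch of the two Spark catalog configurations involved, written as plain Python dictionaries of Spark properties. The catalog names (`hive_cat`, `hadoop_cat`), the metastore URI, and the warehouse path are hypothetical placeholders, not values from this issue. The property keys follow the usual Iceberg Spark catalog convention (`spark.sql.catalog.<name>` plus `type`, `uri`, and `warehouse`).

```python
# Hypothetical values for illustration only; replace with your own.
WAREHOUSE = "hdfs://namenode:8020/warehouse"      # assumed shared warehouse path
METASTORE_URI = "thrift://metastore-host:9083"    # assumed Hive metastore URI


def hive_catalog_conf(name: str = "hive_cat") -> dict:
    """Spark properties for the writer application, which uses a Hive catalog."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hive",
        f"{prefix}.uri": METASTORE_URI,
        f"{prefix}.warehouse": WAREHOUSE,
    }


def hadoop_catalog_conf(name: str = "hadoop_cat") -> dict:
    """Spark properties for the reader application, which has no metastore."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hadoop",
        f"{prefix}.warehouse": WAREHOUSE,
    }
```

Both configurations point at the same warehouse location; only the reader omits the metastore URI. These dictionaries could be passed to a Spark session builder one key at a time.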
   
   This currently fails because Hadoop Catalog and Hive Catalog use two 
different code paths to name metadata files.
   Hadoop Catalog uses its [own 
function](https://github.com/apache/iceberg/blob/8e456aeeabd0a40b23864edadd622b45cb44572c/core/src/main/java/org/apache/iceberg/hadoop/HadoopTableOperations.java#L258)
 to build the metadata path. It concatenates the metadata version with the 
file extension, resulting in a file name like this:
   ```
   v1.metadata.json
   ```
   
   Hive Catalog, on the other hand, uses a [different 
function](https://github.com/apache/iceberg/blob/8e456aeeabd0a40b23864edadd622b45cb44572c/core/src/main/java/org/apache/iceberg/BaseMetastoreTableOperations.java#L321)
 that concatenates the version with a UUID and then the file extension, 
resulting in a file name like this:
   ```
   00000-f34d4846-3c7a-4967-a6fa-30e8cce6eeac.metadata.json
   ```
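
The difference between the two conventions can be sketched in a few lines of Python. This is an illustration based only on the two example file names above, not a port of the Iceberg Java code: the helper names are hypothetical, the five-digit zero padding is inferred from the `00000-...` example, and only the uncompressed `.metadata.json` extension is handled.

```python
import re
import uuid
from typing import Optional

METADATA_EXT = ".metadata.json"

# Patterns inferred from the two example file names in this issue.
_HADOOP_STYLE = re.compile(r"^v(\d+)\.metadata\.json$")
_HIVE_STYLE = re.compile(r"^(\d+)-([0-9a-f-]{36})\.metadata\.json$")


def hadoop_style_name(version: int) -> str:
    """Hadoop-catalog style: 'v' + version + extension."""
    return f"v{version}{METADATA_EXT}"


def hive_style_name(version: int, file_uuid: Optional[uuid.UUID] = None) -> str:
    """Hive-catalog style: zero-padded version + '-' + UUID + extension."""
    if file_uuid is None:
        file_uuid = uuid.uuid4()
    return f"{version:05d}-{file_uuid}{METADATA_EXT}"


def parse_version(file_name: str) -> Optional[int]:
    """Return the metadata version from either style, or None if unrecognized."""
    for pattern in (_HADOOP_STYLE, _HIVE_STYLE):
        match = pattern.match(file_name)
        if match:
            return int(match.group(1))
    return None
```

A reader that only looks for the first pattern will not recognize files written with the second, which matches the failure described here. A parser that accepts both, as in `parse_version`, is one possible direction for a fix, though whether that is appropriate depends on why the conventions differ.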
   
   I can submit a PR with a potential fix for this, but I am not sure whether 
there was a reason for the two catalogs to use different naming conventions.
   
   ### Query engine
   
   Spark
   
   ### Willingness to contribute
   
   - [ ] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


