mothukur opened a new issue, #11776:
URL: https://github.com/apache/iceberg/issues/11776

   ### Apache Iceberg version
   
   1.7.1 (latest release)
   
   ### Query engine
   
   None
   
   ### Please describe the bug 🐞
   
   The current `GlueCatalog` implementation does not allow a `FileIO` object to 
be reused across tables, so the manifest cache implemented in the 
`ManifestFiles` class is used inefficiently.
   
   **Problematic Code**
   Each `GlueTableOperations` instance lazily creates its own `FileIO` object 
instead of sharing one supplied by the catalog:
   
https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/aws/src/main/java/org/apache/iceberg/aws/glue/GlueTableOperations.java#L113
   ```
   public FileIO io() {
     if (fileIO == null) {
       fileIO = initializeFileIO(this.tableCatalogProperties, this.hadoopConf);
     }
     return fileIO;
   }
   ```
   
   Because `ManifestFiles` looks up its content cache by `FileIO`, a separate `FileIO` per `GlueTableOperations` instance prevents the cache from being reused:
   
https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/core/src/main/java/org/apache/iceberg/ManifestFiles.java#L75
   ```
   static ContentCache contentCache(FileIO io) {
     return CONTENT_CACHES.get(
         io,
         fileIO ->
             new ContentCache(
                 cacheDurationMs(fileIO), cacheTotalBytes(fileIO), cacheMaxContentLength(fileIO)));
   }
   ```
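
To illustrate the effect described above, here is a minimal, self-contained sketch that does not use Iceberg at all. `FakeFileIO`, `FakeContentCache`, and `countContentCaches` are hypothetical stand-ins, and the sketch assumes the cache is keyed by `FileIO` instance identity (modeled here with an `IdentityHashMap`). It only shows why one `FileIO` per table operations object yields one cache per table, while a shared `FileIO` yields a single cache.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class FileIoCacheSketch {
  /** Hypothetical stand-in for org.apache.iceberg.io.FileIO. */
  static final class FakeFileIO {}

  /** Hypothetical stand-in for the per-FileIO content cache. */
  static final class FakeContentCache {}

  /**
   * Simulates loading {@code tableLoads} tables and returns how many distinct
   * content caches end up being created.
   */
  static int countContentCaches(int tableLoads, boolean shareFileIO) {
    // Assumption: caches are keyed by FileIO instance identity, not equals().
    Map<FakeFileIO, FakeContentCache> contentCaches = new IdentityHashMap<>();
    FakeFileIO shared = new FakeFileIO();
    for (int i = 0; i < tableLoads; i++) {
      // Without sharing, each simulated table operations object builds its own FileIO.
      FakeFileIO io = shareFileIO ? shared : new FakeFileIO();
      contentCaches.computeIfAbsent(io, key -> new FakeContentCache());
    }
    return contentCaches.size();
  }

  public static void main(String[] args) {
    System.out.println("per-instance FileIO: " + countContentCaches(3, false)); // 3
    System.out.println("shared FileIO: " + countContentCaches(3, true)); // 1
  }
}
```

In this model, a cache created for one table is never visible to another table unless both go through the same `FileIO` instance.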
   **Proposed Solution**
   Add a constructor or method to the `GlueCatalog` class that accepts a 
`FileIO` object or a function that builds a `FileIO` object, similar to 
`JdbcCatalog`:
   
https://github.com/apache/iceberg/blob/apache-iceberg-1.7.1/core/src/main/java/org/apache/iceberg/jdbc/JdbcCatalog.java#L99
   ```
     public JdbcCatalog(
         Function<Map<String, String>, FileIO> ioBuilder,
         Function<Map<String, String>, JdbcClientPool> clientPoolBuilder,
         boolean initializeCatalogTables) {
       this.ioBuilder = ioBuilder;
       this.clientPoolBuilder = clientPoolBuilder;
       this.initializeCatalogTables = initializeCatalogTables;
     }
   ```
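
As a rough sketch of the proposal, the following self-contained example mirrors the `ioBuilder` constructor shape shown above. It is not Iceberg code: `SketchCatalog`, `SketchTableOperations`, `FakeFileIO`, and the `warehouse` property value are hypothetical names used only for illustration. The assumed design is that the catalog builds one `FileIO` during initialization and hands the same instance to every table operations object it creates.

```java
import java.util.Map;
import java.util.function.Function;

public class SharedIoCatalogSketch {
  /** Hypothetical stand-in for org.apache.iceberg.io.FileIO. */
  static final class FakeFileIO {
    final Map<String, String> properties;

    FakeFileIO(Map<String, String> properties) {
      this.properties = properties;
    }
  }

  /** Hypothetical table operations that receive a FileIO instead of creating one. */
  static final class SketchTableOperations {
    private final String tableName;
    private final FakeFileIO io;

    SketchTableOperations(String tableName, FakeFileIO io) {
      this.tableName = tableName;
      this.io = io;
    }

    FakeFileIO io() {
      return io;
    }
  }

  /** Hypothetical catalog that accepts a FileIO builder, as the issue proposes. */
  static final class SketchCatalog {
    private final Function<Map<String, String>, FakeFileIO> ioBuilder;
    private FakeFileIO io;

    SketchCatalog(Function<Map<String, String>, FakeFileIO> ioBuilder) {
      this.ioBuilder = ioBuilder;
    }

    void initialize(Map<String, String> properties) {
      // Build the FileIO once; every table created afterwards shares it.
      this.io = ioBuilder.apply(properties);
    }

    SketchTableOperations newTableOps(String tableName) {
      if (io == null) {
        throw new IllegalStateException("initialize() must be called first");
      }
      return new SketchTableOperations(tableName, io);
    }
  }

  /** Returns true when two tables from the same catalog see the same FileIO instance. */
  static boolean tablesShareIo() {
    SketchCatalog catalog = new SketchCatalog(FakeFileIO::new);
    catalog.initialize(Map.of("warehouse", "s3://example-bucket/warehouse"));
    return catalog.newTableOps("db.a").io() == catalog.newTableOps("db.b").io();
  }

  public static void main(String[] args) {
    System.out.println("tables share FileIO: " + tablesShareIo());
  }
}
```

A caller could also pass a builder that returns an existing `FileIO`, which is one way to share an instance across several catalogs; whether that suits `GlueCatalog` is for the maintainers to decide.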
   
   
   
   
   ### Willingness to contribute
   
   - [X] I can contribute a fix for this bug independently
   - [X] I would be willing to contribute a fix for this bug with guidance from 
the Iceberg community
   - [ ] I cannot contribute a fix for this bug at this time

