qqchang2nd commented on issue #12251: URL: https://github.com/apache/iceberg/issues/12251#issuecomment-2665307798
In Impala, the catalogd daemon maintains metadata caching. Within catalogd, the classes IcebergHadoopCatalog, IcebergHiveCatalog, and IcebergHadoopTables are wrappers around Iceberg's internal HadoopCatalog, HiveCatalog, and HadoopTables implementations respectively. As referenced in [IMPALA-11658](https://issues.apache.org/jira/browse/IMPALA-11658), manifest caching properties are passed to IcebergHadoopCatalog and IcebergHiveCatalog through Catalog.initialize(). Building on this approach, I modified IcebergHadoopTables to pass manifest caching properties through HadoopTables.load() to achieve the same caching benefits. So yes, we have a long-running process (Impala's catalogd) that uses HadoopTables, and enabling manifest caching in this context provides significant performance improvements for our users' queries.
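
For illustration only, here is a minimal sketch of the kind of property map that would be handed to Catalog.initialize(). It deliberately avoids any Iceberg dependency so it can run standalone. The key strings are the manifest-cache keys as I understand them from Iceberg's CatalogProperties and should be verified against the Iceberg version in use. The class name `ManifestCacheProps`, the `build` helper, and the numeric values are hypothetical and are not taken from the Impala patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not Impala or Iceberg code: assembles manifest-cache settings as a plain map.
final class ManifestCacheProps {
    // Assumed key names (believed to match Iceberg's CatalogProperties; verify before relying on them).
    static final String ENABLED = "io.manifest.cache-enabled";
    static final String EXPIRATION_MS = "io.manifest.cache.expiration-interval-ms";
    static final String MAX_TOTAL_BYTES = "io.manifest.cache.max-total-bytes";
    static final String MAX_CONTENT_LENGTH = "io.manifest.cache.max-content-length";

    // Builds the map; all values are stored as strings because catalog properties are Map<String, String>.
    static Map<String, String> build(long expirationMs, long maxTotalBytes, long maxContentLength) {
        Map<String, String> props = new HashMap<>();
        props.put(ENABLED, "true");
        props.put(EXPIRATION_MS, Long.toString(expirationMs));
        props.put(MAX_TOTAL_BYTES, Long.toString(maxTotalBytes));
        props.put(MAX_CONTENT_LENGTH, Long.toString(maxContentLength));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only: 60 s expiration, 100 MiB total cache, 8 MiB per manifest.
        Map<String, String> props = build(60_000L, 100L * 1024 * 1024, 8L * 1024 * 1024);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
        // With Iceberg on the classpath, a map like this would be passed as the second
        // argument of Catalog.initialize(name, props).
    }
}
```

The HadoopTables.load() path described above would receive the same kind of map; its exact signature depends on the change discussed in this issue, so it is not shown here.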