qqchang2nd commented on issue #12251: URL: https://github.com/apache/iceberg/issues/12251#issuecomment-2658960764
You're right that HadoopTables isn't a catalog; it's a lower-level implementation for managing Iceberg tables directly on HDFS without a catalog.

Let me explain our use case. We have our own internal metadata management platform, and during our initial evaluation of Iceberg we chose HadoopTables for table creation and management without fully investigating the differences between HadoopTables and catalogs.

After running in production for a while, we noticed that some customer environments were experiencing slow performance during sql-analyze operations. Investigation revealed that this was because tables created via HadoopTables don't support manifest caching. To address this, I modified the Iceberg source code to allow HadoopTables.load() to accept properties. This enables manifest caching, which significantly improves sql-analyze performance.
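For reference, here is a rough sketch of the kind of properties map involved. The keys are the manifest-cache settings as I recall them from Iceberg's CatalogProperties, so please verify them against your Iceberg version. The class name, method name, and values are hypothetical and only for illustration. How the map reaches the table (the modified HadoopTables.load()) is deliberately not shown, since that part lives in the patched source.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds the manifest-cache properties discussed above.
public class ManifestCacheProps {
    // Keys as recalled from Iceberg's CatalogProperties (assumption: check your version).
    static final String CACHE_ENABLED = "io.manifest.cache-enabled";
    static final String CACHE_EXPIRATION_MS = "io.manifest.cache.expiration-interval-ms";
    static final String CACHE_MAX_TOTAL_BYTES = "io.manifest.cache.max-total-bytes";

    // Returns a properties map with manifest caching switched on.
    static Map<String, String> build(long expirationMs, long maxTotalBytes) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(CACHE_ENABLED, "true");
        props.put(CACHE_EXPIRATION_MS, Long.toString(expirationMs));
        props.put(CACHE_MAX_TOTAL_BYTES, Long.toString(maxTotalBytes));
        return props;
    }

    public static void main(String[] args) {
        // Illustrative values only: 60 s expiration, 100 MiB total cache size.
        build(60_000L, 100L * 1024 * 1024)
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With a regular catalog, a map like this would normally be supplied as catalog properties at initialization; with HadoopTables there is no such entry point, which is the gap the change above works around.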