laserninja commented on PR #10671:
URL: https://github.com/apache/gravitino/pull/10671#issuecomment-4276709387
Thanks for checking, @roryqi!

**1. Backend considerations & how other Iceberg REST catalogs handle pagination:**

The Iceberg Catalog Java API (`listNamespaces()`, `listTables()`, `listViews()`) returns complete `List` results with no built-in pagination support; this is true regardless of the backend (Hive, JDBC, etc.). The Hive metastore APIs (`getAllDatabases`, `getAllTables`) also return all results at once without native pagination. So in-memory pagination at the REST layer is the only practical approach here.

This is consistent with how other Iceberg REST catalogs handle it:

- **Polaris** went through an extensive pagination effort (PRs [#1528](https://github.com/apache/polaris/pull/1528), [#1938](https://github.com/apache/polaris/pull/1938)). Their initial implementation also did in-memory pagination at the REST/service layer. They later worked on pushing pagination down to the persistence layer, but that was for Polaris's own entity management; the Iceberg catalog operations themselves (via `CatalogHandlers.listNamespaces`) still use in-memory listing. They also noted this is a limitation when federating to external catalogs.
- **Lakekeeper** and **UnityCatalog** similarly handle `pageToken`/`pageSize` at the service layer.

Since Gravitino's `IcebergCatalogWrapper` wraps the standard Iceberg `Catalog` interface, in-memory pagination is the appropriate approach at this layer. If backend-native pagination is desired in the future, it would require changes to the Iceberg Catalog interface itself.

**2.** Will resolve the conflicts.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
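
For illustration only, the in-memory pagination described in point 1 could look roughly like the sketch below. It is not the code in this PR, nor Iceberg's or Polaris's implementation. The names `InMemoryPager`, `Page`, and `paginate` are hypothetical, and encoding the page token as a plain numeric offset into the full list is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: slices a fully materialized list into pages.
final class InMemoryPager {

  /** One page of results plus the token for the next page (null when exhausted). */
  static final class Page<T> {
    final List<T> items;
    final String nextPageToken;

    Page(List<T> items, String nextPageToken) {
      this.items = items;
      this.nextPageToken = nextPageToken;
    }
  }

  /**
   * Returns the slice of {@code all} that starts at the offset encoded in
   * {@code pageToken} (assumed to be a decimal offset; null or empty means 0).
   */
  static <T> Page<T> paginate(List<T> all, String pageToken, int pageSize) {
    if (pageSize <= 0) {
      throw new IllegalArgumentException("pageSize must be positive");
    }
    int start;
    try {
      start = (pageToken == null || pageToken.isEmpty()) ? 0 : Integer.parseInt(pageToken);
    } catch (NumberFormatException e) {
      throw new IllegalArgumentException("invalid pageToken: " + pageToken, e);
    }
    if (start < 0 || start > all.size()) {
      throw new IllegalArgumentException("invalid pageToken: " + pageToken);
    }
    // Use long arithmetic so a very large pageSize cannot overflow.
    int end = (int) Math.min((long) start + pageSize, (long) all.size());
    List<T> items = new ArrayList<>(all.subList(start, end));
    // Only hand out a token when more results remain.
    String next = end < all.size() ? Integer.toString(end) : null;
    return new Page<>(items, next);
  }
}
```

A caller would pass the complete list returned by the catalog (for example, the result of `listNamespaces()`) together with the request's `pageToken` and `pageSize`. Because the full list is re-fetched on every request, an offset-style token like this can skip or repeat entries if the underlying listing changes or is not returned in a stable order between calls.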
