FANNG1 commented on PR #10475:
URL: https://github.com/apache/gravitino/pull/10475#issuecomment-4219491112

   Thanks for working on this issue! The root cause analysis is accurate — the 
missing `CatalogEnvironment` / `catalogLoader` prevents Hive partition metadata 
from being updated.
   
   However, I have some concerns about the current dual-write approach 
(overriding `createTable`/`alterTable`/`dropTable` to write to both Gravitino 
and Paimon):
   
   1. **Consistency risk**: There is no transaction guarantee spanning the two 
writes. If the second write fails, the two metadata stores diverge and there is 
no rollback mechanism. For example, in `dropTable`, Gravitino metadata is deleted 
first — if the subsequent Paimon drop fails, the Paimon table becomes an orphan 
that Gravitino no longer knows about.
   
   2. **Architecture concern**: Gravitino is designed as the single source of 
truth for metadata. DDL operations go through the Gravitino REST API, and 
Gravitino server internally syncs to the underlying catalog (Paimon). 
Dual-writing from the Flink connector bypasses this design.
   
   3. **PR description vs. code mismatch**: The description states the fix is 
for `getTable()`/`tableExists()`, but the actual code overrides 
`createTable`/`alterTable`/`dropTable` instead.
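   To make the consistency risk in point 1 concrete, here is a minimal, self-contained simulation (not actual Gravitino/Paimon code — the two `Set`s merely stand in for the two metadata stores) of the non-atomic dual-write `dropTable` ordering described above:

```java
import java.util.HashSet;
import java.util.Set;

// Simulation of the dual-write hazard: write #1 (Gravitino) succeeds,
// write #2 (Paimon) fails, and nothing rolls write #1 back.
class DualWriteHazard {
    static Set<String> gravitinoTables = new HashSet<>(Set.of("db.t1"));
    static Set<String> paimonTables = new HashSet<>(Set.of("db.t1"));

    // Hypothetical dual-write dropTable with no transaction across stores.
    static void dropTable(String name, boolean paimonDropFails) {
        gravitinoTables.remove(name);                 // write #1 succeeds
        if (paimonDropFails) {
            throw new RuntimeException("Paimon drop failed"); // write #2 fails
        }
        paimonTables.remove(name);
    }

    public static void main(String[] args) {
        try {
            dropTable("db.t1", true);
        } catch (RuntimeException e) {
            // Gravitino has already forgotten the table, but the Paimon
            // table still exists: an orphan with no rollback path.
        }
        System.out.println(!gravitinoTables.contains("db.t1")
                && paimonTables.contains("db.t1"));
    }
}
```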
   
   **Suggested direction**: The problem is that `BaseCatalog.getTable()` → 
`toFlinkTable()` returns a plain `CatalogTable` without Paimon-specific 
context. A more targeted fix would be to override `toFlinkTable()` in 
`GravitinoPaimonCatalog` to return a Paimon `DataCatalogTable` with a proper 
`CatalogEnvironment`. This keeps Gravitino as the metadata source of truth 
while providing the context Paimon needs for partition handling.
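   Roughly the shape I have in mind, sketched with stub types — the real Flink `CatalogTable` and Paimon `DataCatalogTable`/`CatalogEnvironment` have different constructors and carry more state (identifier, catalog loader, etc.), so treat this purely as an illustration of where the override sits:

```java
import java.util.Map;

// Stand-ins for the real classes; signatures here are illustrative only.
class CatalogTable {                      // stub for Flink's CatalogTable
    final Map<String, String> options;
    CatalogTable(Map<String, String> options) { this.options = options; }
}

class CatalogEnvironment { }              // stub for Paimon's CatalogEnvironment

class DataCatalogTable extends CatalogTable {  // stub for Paimon's DataCatalogTable
    final CatalogEnvironment env;
    DataCatalogTable(Map<String, String> options, CatalogEnvironment env) {
        super(options);
        this.env = env;
    }
}

class BaseCatalog {
    // Today: returns a plain CatalogTable with no Paimon-specific context.
    protected CatalogTable toFlinkTable(Map<String, String> options) {
        return new CatalogTable(options);
    }
}

class GravitinoPaimonCatalog extends BaseCatalog {
    // Suggested fix: attach a CatalogEnvironment so Paimon's runtime can
    // update partition metadata, while DDL still flows through Gravitino.
    @Override
    protected CatalogTable toFlinkTable(Map<String, String> options) {
        return new DataCatalogTable(options, new CatalogEnvironment());
    }
}
```

   In the real connector the `CatalogEnvironment` would be built from the table identifier and a catalog loader, but the key point is that only the read path (`toFlinkTable`) changes — no write path is duplicated.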


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
