mchades opened a new issue, #10763:
URL: https://github.com/apache/gravitino/issues/10763

   **Version:** main branch
   
   ## Describe what's wrong
   
   PR #10676 introduced a breaking behavioral change in 
`SchemaOperationDispatcher.createSchema()`.
   
   **Before #10676:**
   When `store.put()` failed after the schema had been successfully created in 
the underlying catalog, the method logged the error and still **returned the 
created `Schema` object** to the caller. This was intentional soft-degraded 
behavior: the schema exists in the catalog even if Gravitino's store 
temporarily fails.
   
   **After #10676:**
   When `store.put()` fails, the method now:
   1. Attempts to roll back by calling `dropSchema(ident, true /* cascade */)` 
on the underlying catalog
   2. **Throws `GravitinoRuntimeException`** regardless of rollback outcome
   
   This is a breaking change for two reasons:
   1. **API contract broken**: Any caller of `createSchema()` that previously 
received a successful response during transient store failures will now receive 
an exception.
   2. **Data loss risk**: The rollback uses `cascade=true`, which means all 
data/objects inside the schema in the underlying catalog will also be deleted — 
even if the store failure was transient (e.g., a momentary network blip). This 
can cause irreversible data loss.
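
The two behaviors can be contrasted with a minimal, self-contained sketch. It is not the actual `SchemaOperationDispatcher` code: `FakeCatalog`, `FakeStore`, and both `createSchema...` helper methods are hypothetical stand-ins, and a plain `RuntimeException` stands in for `GravitinoRuntimeException`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class CreateSchemaSketch {

    // Hypothetical stand-in for the underlying catalog: schema name -> objects inside it.
    static final class FakeCatalog {
        final Map<String, Set<String>> schemas = new HashMap<>();

        void createSchema(String ident) {
            schemas.put(ident, new HashSet<>());
        }

        void dropSchema(String ident, boolean cascade) {
            Set<String> objects = schemas.get(ident);
            if (objects == null) {
                return;
            }
            if (!objects.isEmpty() && !cascade) {
                throw new IllegalStateException("Schema not empty: " + ident);
            }
            // With cascade=true, everything inside the schema is removed with it.
            schemas.remove(ident);
        }
    }

    // Hypothetical stand-in for Gravitino's entity store; can simulate an outage.
    static final class FakeStore {
        boolean available = true;
        final Set<String> entities = new HashSet<>();

        void put(String ident) {
            if (!available) {
                throw new RuntimeException("store unavailable");
            }
            entities.add(ident);
        }
    }

    // Behavior described as "Before #10676": log the store failure, still return the schema.
    static String createSchemaSoftDegraded(FakeCatalog catalog, FakeStore store, String ident) {
        catalog.createSchema(ident);
        try {
            store.put(ident);
        } catch (RuntimeException e) {
            System.err.println("Failed to persist schema " + ident + ": " + e.getMessage());
        }
        return ident;
    }

    // Behavior described as "After #10676": cascade-drop the schema, then always throw.
    static String createSchemaWithRollback(FakeCatalog catalog, FakeStore store, String ident) {
        catalog.createSchema(ident);
        try {
            store.put(ident);
        } catch (RuntimeException e) {
            try {
                catalog.dropSchema(ident, true /* cascade */);
            } catch (RuntimeException rollbackFailure) {
                e.addSuppressed(rollbackFailure);
            }
            // Thrown regardless of whether the rollback succeeded.
            throw new RuntimeException("Failed to persist schema metadata for: " + ident, e);
        }
        return ident;
    }

    public static void main(String[] args) {
        FakeCatalog before = new FakeCatalog();
        FakeStore downStore = new FakeStore();
        downStore.available = false;

        String created = createSchemaSoftDegraded(before, downStore, "sales");
        System.out.println("soft-degraded returned: " + created
            + ", schema in catalog: " + before.schemas.containsKey("sales"));

        FakeCatalog after = new FakeCatalog();
        try {
            createSchemaWithRollback(after, downStore, "sales");
        } catch (RuntimeException e) {
            System.out.println("rollback variant threw: " + e.getMessage()
                + ", schema in catalog: " + after.schemas.containsKey("sales"));
        }
    }
}
```

In this model, the same simulated store outage leaves the schema in place under the first variant and removes it under the second, which is the difference a caller of `createSchema()` would observe.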
   
   ## Error message and/or stacktrace
   
   ```
   GravitinoRuntimeException: Failed to persist schema metadata to Gravitino 
store for: <ident>.
     Schema creation in underlying catalog has been rolled back.
   ```
   
   ## How to reproduce
   
   1. Create a schema via `GravitinoClient` while the Gravitino entity store is 
temporarily unavailable
   2. Before #10676: schema creation returns successfully (schema exists in 
underlying catalog)
   3. After #10676: `GravitinoRuntimeException` is thrown AND the schema is 
dropped (with cascade) from the underlying catalog
   
   ## Additional context
   
   The original behavior (returning the schema even on store failure) was the 
correct design: it prioritizes availability of the underlying catalog over 
consistency with Gravitino's metadata store. The rollback strategy with 
`cascade=true` is too destructive for a transient failure scenario.
   
   Revert commit: `2d8683aef6df0c120ca16f15b53ab9a7ae2beb28`
   Related PR: #10676


