mchades opened a new issue, #10763:
URL: https://github.com/apache/gravitino/issues/10763
**Version:** main branch
## Describe what's wrong
PR #10676 introduced a breaking behavioral change in
`SchemaOperationDispatcher.createSchema()`.
**Before #10676:**
When `store.put()` fails after the schema is successfully created in the
underlying catalog, the method logs the error and still **returns the created
`Schema` object** to the caller. This is intentional soft-degraded behavior —
the schema exists in the catalog even if Gravitino's store temporarily fails.
**After #10676:**
When `store.put()` fails, the method now:
1. Attempts to roll back by calling `dropSchema(ident, true /* cascade */)`
on the underlying catalog
2. **Throws `GravitinoRuntimeException`** regardless of rollback outcome
This is a breaking change for two reasons:
1. **API contract broken**: Any caller of `createSchema()` that previously
received a successful response during transient store failures will now receive
an exception.
2. **Data loss risk**: The rollback uses `cascade=true`, which means all
data/objects inside the schema in the underlying catalog will also be deleted —
even if the store failure was transient (e.g., a momentary network blip). This
can cause irreversible data loss.
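To make the difference concrete, here is a minimal, self-contained sketch of the two control flows described above. It is not the actual `SchemaOperationDispatcher` code: `FakeCatalog`, `FakeStore`, `createSchemaBefore`, and `createSchemaAfter` are hypothetical stand-ins invented for illustration, and a plain `RuntimeException` stands in for `GravitinoRuntimeException`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for illustration; none of these are Gravitino classes.
class CreateSchemaSketch {

  /** Simplified underlying catalog: schema name -> names of objects inside it. */
  static class FakeCatalog {
    final Map<String, Set<String>> schemas = new HashMap<>();

    void createSchema(String name) {
      schemas.put(name, new HashSet<>());
    }

    void dropSchema(String name, boolean cascade) {
      Set<String> objects = schemas.get(name);
      if (objects == null) {
        return; // nothing to drop
      }
      if (!objects.isEmpty() && !cascade) {
        throw new IllegalStateException("schema not empty: " + name);
      }
      schemas.remove(name); // with cascade=true, contained objects are removed too
    }
  }

  /** Simplified entity store that can be switched to fail on put(). */
  static class FakeStore {
    boolean available = true;
    final Set<String> entities = new HashSet<>();

    void put(String name) {
      if (!available) {
        throw new RuntimeException("store unavailable");
      }
      entities.add(name);
    }
  }

  /** Flow described under "Before #10676": log the store failure, still return. */
  static String createSchemaBefore(FakeCatalog catalog, FakeStore store, String name) {
    catalog.createSchema(name);
    try {
      store.put(name);
    } catch (RuntimeException e) {
      System.err.println("Failed to persist " + name + ": " + e.getMessage());
    }
    return name;
  }

  /** Flow described under "After #10676": cascade rollback, then always throw. */
  static String createSchemaAfter(FakeCatalog catalog, FakeStore store, String name) {
    catalog.createSchema(name);
    try {
      store.put(name);
    } catch (RuntimeException e) {
      try {
        catalog.dropSchema(name, true /* cascade */);
      } catch (RuntimeException rollbackFailure) {
        System.err.println("Rollback failed: " + rollbackFailure.getMessage());
      }
      // Thrown regardless of whether the rollback succeeded.
      throw new RuntimeException("Failed to persist schema metadata for: " + name, e);
    }
    return name;
  }

  public static void main(String[] args) {
    FakeCatalog catalog = new FakeCatalog();
    FakeStore store = new FakeStore();
    store.available = false; // simulate a transient store outage

    createSchemaBefore(catalog, store, "s1");
    System.out.println("before: schema in catalog = " + catalog.schemas.containsKey("s1"));

    try {
      createSchemaAfter(catalog, store, "s2");
    } catch (RuntimeException e) {
      System.out.println("after: caller sees exception, schema in catalog = "
          + catalog.schemas.containsKey("s2"));
    }
  }
}
```

In this simplified model, the same store outage leaves the schema in the catalog under the old flow, while the new flow removes it and surfaces an exception to the caller.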
## Error message and/or stacktrace
```
GravitinoRuntimeException: Failed to persist schema metadata to Gravitino
store for: <ident>.
Schema creation in underlying catalog has been rolled back.
```
## How to reproduce
1. Create a schema via `GravitinoClient` while the Gravitino entity store is
temporarily unavailable
2. Before #10676: schema creation returns successfully (schema exists in
underlying catalog)
3. After #10676: `GravitinoRuntimeException` is thrown AND the schema is
dropped (with cascade) from the underlying catalog
## Additional context
The original behavior (returning the schema even on store failure) was the
correct design: it prioritizes the availability of the underlying catalog over
the consistency of Gravitino's metadata store. Rolling back with
`cascade=true` is too destructive a response to a transient failure.
Revert commit: `2d8683aef6df0c120ca16f15b53ab9a7ae2beb28`
Related PR: #10676