diqiu50 opened a new issue, #10739:
URL: https://github.com/apache/gravitino/issues/10739

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   Several cache invalidation paths invalidate cache entries before the backend 
write is finished.
   
   This creates a race window under concurrent access:
   
   1. a write operation starts, such as `alterCatalog`, `dropCatalog`, 
`update`, `delete`, `insertRelation`, or `updateEntityRelations`
   2. the cache entry is invalidated first
   3. another thread reads the same object while the backend mutation has not 
completed yet
   4. that read misses the cache, fetches old data from the backend, and writes 
the stale result back into cache
   
   After that, the cache may continue serving outdated metadata even though the 
write has already succeeded.
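
The interleaving above can be replayed deterministically in a single thread. The following is only a minimal sketch: the two maps are hypothetical stand-ins for the backend store and the entity cache, and `StaleCacheDemo`, `invalidateBeforeWrite`, and the key `catalog-1` are illustrative names, not Gravitino code.

```java
import java.util.HashMap;
import java.util.Map;

class StaleCacheDemo {
    /**
     * Replays the problematic ordering step by step and returns the value
     * the cache serves once the write has finished.
     */
    static String invalidateBeforeWrite() {
        // Hypothetical stand-ins for the backend store and the cache.
        Map<String, String> backend = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        backend.put("catalog-1", "old_name");
        cache.put("catalog-1", "old_name");

        // Writer: the cache entry is invalidated first (step 2).
        cache.remove("catalog-1");

        // Reader: misses the cache, loads the old value from the backend,
        // and writes it back into the cache (steps 3 and 4).
        cache.computeIfAbsent("catalog-1", backend::get);

        // Writer: the backend mutation completes only now.
        backend.put("catalog-1", "new_name");

        // The backend holds "new_name", but the cache still holds the old value.
        return cache.get("catalog-1");
    }

    public static void main(String[] args) {
        System.out.println(invalidateBeforeWrite()); // prints "old_name"
    }
}
```

In this replay nothing invalidates the entry again after the write, so the stale value stays in the cache until some later invalidation or eviction.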
   
   This can cause incorrect behavior such as:
   - loading a catalog by its old name after rename
   - catalog existence checks still returning true after drop
   - stale relation results after relation insert or update
   
   ### Error message and/or stacktrace
   
   This is a race condition, so it does not reliably produce a 
stack trace.
   
   Typical symptoms are stale reads after successful writes, for example:
   - the old catalog identifier is still loadable after rename
   - a dropped catalog is still reported as existing
   - relation queries return outdated results after relation updates
   
   ### How to reproduce
   
   1. Use the `main` branch.
   2. Trigger a write operation that updates cached metadata, such as catalog 
rename, catalog drop, relation insert, or relation update.
   3. At the same time, trigger another thread to read the same catalog or 
relation data.
   4. Observe that the read can repopulate the cache with pre-update data 
before the backend mutation finishes.
   5. After the write completes, stale data may still be returned from cache.
   
   ### Additional context
   
   The issue is caused by invalidating the cache before the backend mutation 
in several catalog and relational entity store code paths.
   
   A fix should move cache invalidation to after the backend mutation succeeds, 
so concurrent reads either see the still-valid old cache before commit or 
reload fresh data after commit.
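
A minimal sketch of the proposed ordering, under the same assumptions as the description above: the maps are hypothetical stand-ins for the backend and the cache, and `InvalidateAfterWriteDemo` and `invalidateAfterWrite` are illustrative names rather than Gravitino APIs. It replays only the interleaving described in this issue and is not a demonstration that every possible interleaving is safe.

```java
import java.util.HashMap;
import java.util.Map;

class InvalidateAfterWriteDemo {
    /**
     * Replays the same concurrent read, but invalidates the cache only after
     * the backend mutation has succeeded. Returns what a later read sees.
     */
    static String invalidateAfterWrite() {
        // Hypothetical stand-ins for the backend store and the cache.
        Map<String, String> backend = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        backend.put("catalog-1", "old_name");
        cache.put("catalog-1", "old_name");

        // Reader, before commit: hits the still-valid old cache entry,
        // so nothing is reloaded from the backend.
        cache.computeIfAbsent("catalog-1", backend::get);

        // Writer: the backend mutation runs first.
        backend.put("catalog-1", "new_name");

        // Writer: the cache entry is invalidated only after the mutation succeeded.
        cache.remove("catalog-1");

        // Reader, after commit: misses the cache and reloads fresh data.
        return cache.computeIfAbsent("catalog-1", backend::get);
    }

    public static void main(String[] args) {
        System.out.println(invalidateAfterWrite()); // prints "new_name"
    }
}
```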

