diqiu50 opened a new issue, #10741:
URL: https://github.com/apache/gravitino/issues/10741

   ### Version
   
   main branch
   
   ### Describe what's wrong
   
   When multiple threads concurrently call `FetchFileUtils.fetchFileFromUri()` 
with a `file:` URI pointing to the same destination path, a 
`FileAlreadyExistsException` is thrown.
   
   The root cause is a TOCTOU (time-of-check to time-of-use) race condition in 
the `file:` scheme handler. It calls `Files.createSymbolicLink()` directly, 
with no handling for a destination that already exists. Under concurrent 
access, such as when the search event listener triggers cascade metadata sync 
after catalog creation, multiple `HiveClientPool` instances with different user 
identities share the same keytab destination path. Only the first symlink 
creation succeeds; every later attempt fails with `FileAlreadyExistsException`.
   
   A related issue exists in `KerberosClient.saveKeyTabFileFromUri()`, which 
uses a non-atomic `exists() + delete()` pattern before calling 
`fetchFileFromUri()`.
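
The collision described above can be reproduced outside Gravitino with plain `java.nio`. The following is a minimal sketch, not the project's code: the class `SymlinkRaceDemo`, the method `countCollisions`, and the file names are hypothetical, and it assumes a filesystem that supports symbolic links. Several threads create the same symlink at once, and the sketch counts how many hit `FileAlreadyExistsException`.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; it only mimics the unguarded symlink creation.
public class SymlinkRaceDemo {

  /** Returns how many threads failed with FileAlreadyExistsException. */
  static int countCollisions(int threads) throws Exception {
    Path dir = Files.createTempDirectory("symlink-race");
    Path source = Files.createFile(dir.resolve("source.keytab"));
    Path dest = dir.resolve("dest.keytab"); // shared destination path

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    AtomicInteger collisions = new AtomicInteger();

    for (int i = 0; i < threads; i++) {
      pool.submit(() -> {
        try {
          start.await(); // release all threads at the same moment
          Files.createSymbolicLink(dest, source); // no guard, as in the report
        } catch (FileAlreadyExistsException e) {
          collisions.incrementAndGet();
        } catch (IOException | InterruptedException e) {
          throw new IllegalStateException(e);
        }
      });
    }

    start.countDown();
    pool.shutdown();
    if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("timed out waiting for workers");
    }
    return collisions.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("collisions: " + countCollisions(8));
  }
}
```

Because nothing removes the link between attempts, exactly one thread creates it and the others all collide, so the failure does not depend on lucky timing once the callers share a destination.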
   
   ### Error message and/or stacktrace
   
   ```text
   java.nio.file.FileAlreadyExistsException: <keytab-dest-path>
       at java.base/sun.nio.fs.UnixFileSystemProvider.createSymbolicLink(UnixFileSystemProvider.java:yyy)
       at java.base/java.nio.file.Files.createSymbolicLink(Files.java:yyy)
       at org.apache.gravitino.hive.kerberos.FetchFileUtils.fetchFileFromUri(FetchFileUtils.java:xx)
   ```
   
   ### How to reproduce
   
   1. Use the `main` branch.
   2. Create a Hive catalog with Kerberos authentication enabled.
   3. Enable the search event listener in `ASYNC_ISOLATED` mode.
   4. Trigger a `createCatalog` operation.
   5. Observe that concurrent cascade sync threads sharing the same keytab path 
may throw `FileAlreadyExistsException`.
   
   ### Additional context
   
   The fix should:
   - synchronize symlink creation on the destination path to avoid concurrent 
races
   - replace the non-atomic `exists() + delete()` in `KerberosClient` with 
`Files.deleteIfExists()`
   - consolidate tests into `hive-metastore-common` and add 
concurrent/idempotent coverage
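
One possible shape for the first two points is sketched below. This is an illustration under stated assumptions, not the actual patch: `IdempotentSymlink`, `linkIdempotently`, and `failuresUnderContention` are hypothetical names, the lock is per-JVM only (it does not protect against other processes), and the lock map is never pruned. It serializes creation per destination path and uses `Files.deleteIfExists()` in place of a separate existence check and delete.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; names and structure are assumptions for illustration.
public class IdempotentSymlink {

  // One lock object per normalized destination path (in-JVM only).
  private static final ConcurrentHashMap<Path, Object> LOCKS = new ConcurrentHashMap<>();

  static void linkIdempotently(Path source, Path dest) throws IOException {
    Path key = dest.toAbsolutePath().normalize();
    Object lock = LOCKS.computeIfAbsent(key, k -> new Object());
    synchronized (lock) {
      // Already linked to the requested source: nothing to do.
      if (Files.isSymbolicLink(dest) && Files.readSymbolicLink(dest).equals(source)) {
        return;
      }
      // deleteIfExists() does not throw when the path is already gone.
      Files.deleteIfExists(dest);
      Files.createSymbolicLink(dest, source);
    }
  }

  /** Runs concurrent callers against one destination and counts failures. */
  static int failuresUnderContention(int threads) throws Exception {
    Path dir = Files.createTempDirectory("symlink-fix");
    Path source = Files.createFile(dir.resolve("source.keytab"));
    Path dest = dir.resolve("dest.keytab");

    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    AtomicInteger failures = new AtomicInteger();

    for (int i = 0; i < threads; i++) {
      pool.submit(() -> {
        try {
          start.await();
          linkIdempotently(source, dest);
        } catch (IOException | InterruptedException e) {
          failures.incrementAndGet();
        }
      });
    }

    start.countDown();
    pool.shutdown();
    if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
      throw new IllegalStateException("timed out waiting for workers");
    }
    // A missing link at the end also counts as a failure.
    if (!Files.isSymbolicLink(dest)) {
      failures.incrementAndGet();
    }
    return failures.get();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("failures: " + failuresUnderContention(8));
  }
}
```

The early return makes repeated calls with the same source idempotent, which is the property the proposed concurrent/idempotent tests would exercise.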


