diqiu50 opened a new issue, #10741:
URL: https://github.com/apache/gravitino/issues/10741
### Version
main branch
### Describe what's wrong
When multiple threads concurrently call `FetchFileUtils.fetchFileFromUri()`
with a `file:` URI pointing to the same destination path, a
`FileAlreadyExistsException` is thrown.
The root cause is a TOCTOU (time-of-check to time-of-use) race condition in the
`file:` scheme handler, which calls `Files.createSymbolicLink()` directly without
first checking whether the destination already exists. Under concurrent access,
for example when the search event listener triggers a cascade metadata sync
after catalog creation, multiple `HiveClientPool` instances with different user
identities share the same keytab destination path, so their concurrent symlink
creation attempts collide and fail with `FileAlreadyExistsException`.
A related issue exists in `KerberosClient.saveKeyTabFileFromUri()`, which
uses a non-atomic `exists() + delete()` pattern before calling
`fetchFileFromUri()`.
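
The collision described above can be reproduced outside Gravitino with plain `java.nio`. The following is a minimal sketch, not the project's code: the class `SymlinkRaceDemo`, the method `countCollisions`, and the temporary file names are hypothetical, and it assumes a platform where the JVM is allowed to create symbolic links.

```java
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-alone demo; it does not call any Gravitino class.
class SymlinkRaceDemo {

  /** Returns how many callers hit FileAlreadyExistsException on a shared destination. */
  static int countCollisions(int callers) throws Exception {
    Path dir = Files.createTempDirectory("symlink-race");
    Path source = Files.createFile(dir.resolve("source.keytab"));
    Path dest = dir.resolve("dest.keytab"); // shared destination path

    ExecutorService pool = Executors.newFixedThreadPool(callers);
    try {
      List<Callable<Boolean>> tasks = new ArrayList<>();
      for (int i = 0; i < callers; i++) {
        tasks.add(() -> {
          try {
            // Unguarded creation, as the issue describes for the file: handler.
            Files.createSymbolicLink(dest, source);
            return false;
          } catch (FileAlreadyExistsException e) {
            return true; // another caller created the link first
          }
        });
      }
      int collisions = 0;
      for (Future<Boolean> result : pool.invokeAll(tasks)) {
        if (result.get()) {
          collisions++;
        }
      }
      return collisions;
    } finally {
      pool.shutdown();
    }
  }
}
```

Because symlink creation either succeeds or reports that the path exists, only one caller can create the link; every other caller that targets the same fresh destination sees the exception.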
### Error message and/or stacktrace
```text
java.nio.file.FileAlreadyExistsException: <keytab-dest-path>
at java.base/sun.nio.fs.UnixFileSystemProvider.createSymbolicLink(UnixFileSystemProvider.java:yyy)
at java.base/java.nio.file.Files.createSymbolicLink(Files.java:yyy)
at org.apache.gravitino.hive.kerberos.FetchFileUtils.fetchFileFromUri(FetchFileUtils.java:xx)
```
### How to reproduce
1. Use the `main` branch.
2. Create a Hive catalog with Kerberos authentication enabled.
3. Enable the search event listener in `ASYNC_ISOLATED` mode.
4. Trigger a `createCatalog` operation.
5. Observe that concurrent cascade sync threads sharing the same keytab path
may throw `FileAlreadyExistsException`.
### Additional context
The fix should:
- synchronize symlink creation on the destination path to avoid concurrent
races
- replace the non-atomic `exists() + delete()` in `KerberosClient` with
`Files.deleteIfExists()`
- consolidate tests into `hive-metastore-common` and add
concurrent/idempotent coverage
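
One possible shape for the first two points is sketched below. This is an illustration under stated assumptions, not the actual patch: `SafeSymlinkSketch`, `link`, `countFailures`, and the per-path lock map are hypothetical names, and the lock only serializes callers inside a single JVM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch of the proposed fix; names are not taken from Gravitino.
class SafeSymlinkSketch {

  // One lock object per normalized destination path (assumed design, grows with distinct paths).
  private static final ConcurrentHashMap<Path, Object> LOCKS = new ConcurrentHashMap<>();

  /** Replaces any existing entry at dest with a symlink to source, one caller at a time. */
  static void link(Path dest, Path source) throws IOException {
    Object lock = LOCKS.computeIfAbsent(dest.toAbsolutePath().normalize(), p -> new Object());
    synchronized (lock) {
      // deleteIfExists removes the link itself (not its target) and does not fail if absent.
      Files.deleteIfExists(dest);
      Files.createSymbolicLink(dest, source);
    }
  }

  /** Runs link() from several threads against one destination and counts IOExceptions. */
  static int countFailures(int callers) throws Exception {
    Path dir = Files.createTempDirectory("safe-symlink");
    Path source = Files.createFile(dir.resolve("source.keytab"));
    Path dest = dir.resolve("dest.keytab");

    ExecutorService pool = Executors.newFixedThreadPool(callers);
    try {
      List<Callable<Boolean>> tasks = new ArrayList<>();
      for (int i = 0; i < callers; i++) {
        tasks.add(() -> {
          try {
            link(dest, source);
            return false;
          } catch (IOException e) {
            return true;
          }
        });
      }
      int failures = 0;
      for (Future<Boolean> result : pool.invokeAll(tasks)) {
        if (result.get()) {
          failures++;
        }
      }
      return failures;
    } finally {
      pool.shutdown();
    }
  }
}
```

Holding the lock across both the delete and the create is what makes the call idempotent for concurrent callers; a lock around `createSymbolicLink()` alone would still let a second caller find the link already in place.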