Copilot commented on code in PR #10742:
URL: https://github.com/apache/gravitino/pull/10742#discussion_r3063985395


##########
catalogs/hive-metastore-common/src/main/java/org/apache/gravitino/hive/kerberos/FetchFileUtils.java:
##########
@@ -45,7 +47,12 @@ public static void fetchFileFromUri(
           break;
 
         case "file":
-          Files.createSymbolicLink(destFile.toPath(), new 
File(uri.getPath()).toPath());
+          // Synchronize on the canonical dest path to prevent concurrent 
threads from
+          // racing between deleteIfExists and createSymbolicLink for the same 
keytab file.
+          synchronized (destFile.getAbsolutePath().intern()) {
+            Files.deleteIfExists(destFile.toPath());
+            Files.createSymbolicLink(destFile.toPath(), new 
File(uri.getPath()).toPath());
+          }

Review Comment:
   The current implementation always deletes destFile before creating the 
symlink. This creates a brief window where the keytab path does not exist; if 
another thread/process tries to read the keytab concurrently (e.g., during 
UserGroupInformation.loginUserFromKeytab), it can intermittently fail with 
FileNotFound. Consider making the operation idempotent by first checking 
whether an existing symlink already points to the desired target, and only 
replacing when necessary; when replacing, prefer an atomic replace strategy 
(e.g., create a temp symlink and move/rename with REPLACE_EXISTING) to avoid a 
missing-file window.



##########
catalogs/hive-metastore-common/src/test/java/org/apache/gravitino/hive/kerberos/TestFetchFileUtils.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.hive.kerberos;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CyclicBarrier;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import org.apache.hadoop.conf.Configuration;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Disabled;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class TestFetchFileUtils {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TestFetchFileUtils.class);
+  private static final int MAX_RETRIES = 8;
+  private static final long INITIAL_RETRY_DELAY_MS = 1000;
+
+  @TempDir File tempDir;
+
+  @Test
+  public void testLinkLocalFile() throws Exception {
+    File srcFile = new File(tempDir, "source");
+    Assertions.assertTrue(srcFile.createNewFile());
+    File destFile = new File(tempDir, "dest");
+
+    FetchFileUtils.fetchFileFromUri(srcFile.toURI().toString(), destFile, 10, 
new Configuration());
+    Assertions.assertTrue(Files.isSymbolicLink(destFile.toPath()));
+    Assertions.assertEquals(srcFile.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+  }
+
+  @Test
+  public void testConcurrentSymlinkCreation() throws Exception {
+    File srcFile = new File(tempDir, "source_concurrent");
+    Assertions.assertTrue(srcFile.createNewFile());
+    File destFile = new File(tempDir, "dest_concurrent");
+
+    int threadCount = 10;
+    CyclicBarrier barrier = new CyclicBarrier(threadCount);
+    ExecutorService executor = Executors.newFixedThreadPool(threadCount);
+    List<Future<?>> futures = new ArrayList<>();

Review Comment:
   If any task throws before the loop over futures completes, the 
ExecutorService may never be shut down, potentially leaking threads and making 
the test suite flaky. Wrap executor creation/usage in a try/finally that always 
calls shutdownNow (or shutdown + awaitTermination), and consider also handling 
interrupted status when awaiting the barrier/future results.



##########
catalogs/hive-metastore-common/src/main/java/org/apache/gravitino/hive/kerberos/FetchFileUtils.java:
##########
@@ -45,7 +47,12 @@ public static void fetchFileFromUri(
           break;
 
         case "file":
-          Files.createSymbolicLink(destFile.toPath(), new 
File(uri.getPath()).toPath());
+          // Synchronize on the canonical dest path to prevent concurrent 
threads from
+          // racing between deleteIfExists and createSymbolicLink for the same 
keytab file.
+          synchronized (destFile.getAbsolutePath().intern()) {
+            Files.deleteIfExists(destFile.toPath());
+            Files.createSymbolicLink(destFile.toPath(), new 
File(uri.getPath()).toPath());
+          }

Review Comment:
   The lock comment says "canonical" path, but the code synchronizes on 
destFile.getAbsolutePath(). That means the same file can be referenced via 
different absolute spellings (e.g., symlinked parent dirs), bypassing the lock. 
Also, using String.intern() for path-based locking can retain arbitrary strings 
for the lifetime of the JVM; prefer a dedicated lock map keyed by a normalized 
Path (e.g., toRealPath()/normalize) instead of interning.



##########
catalogs/hive-metastore-common/src/test/java/org/apache/gravitino/hive/kerberos/TestFetchFileUtils.java:
##########
@@ -0,0 +1,157 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.hive.kerberos;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.CyclicBarrier;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
+import org.apache.hadoop.conf.Configuration;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Disabled;
+import org.junit.jupiter.api.Test;
+import org.junit.jupiter.api.io.TempDir;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class TestFetchFileUtils {
+
+  private static final Logger LOG = 
LoggerFactory.getLogger(TestFetchFileUtils.class);
+  private static final int MAX_RETRIES = 8;
+  private static final long INITIAL_RETRY_DELAY_MS = 1000;
+
+  @TempDir File tempDir;
+
+  @Test
+  public void testLinkLocalFile() throws Exception {
+    File srcFile = new File(tempDir, "source");
+    Assertions.assertTrue(srcFile.createNewFile());
+    File destFile = new File(tempDir, "dest");
+
+    FetchFileUtils.fetchFileFromUri(srcFile.toURI().toString(), destFile, 10, 
new Configuration());
+    Assertions.assertTrue(Files.isSymbolicLink(destFile.toPath()));
+    Assertions.assertEquals(srcFile.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+  }
+
+  @Test
+  public void testConcurrentSymlinkCreation() throws Exception {
+    File srcFile = new File(tempDir, "source_concurrent");
+    Assertions.assertTrue(srcFile.createNewFile());
+    File destFile = new File(tempDir, "dest_concurrent");
+
+    int threadCount = 10;
+    CyclicBarrier barrier = new CyclicBarrier(threadCount);
+    ExecutorService executor = Executors.newFixedThreadPool(threadCount);
+    List<Future<?>> futures = new ArrayList<>();
+
+    for (int i = 0; i < threadCount; i++) {
+      futures.add(
+          executor.submit(
+              () -> {
+                try {
+                  barrier.await();
+                  FetchFileUtils.fetchFileFromUri(
+                      srcFile.toURI().toString(), destFile, 10, new 
Configuration());
+                } catch (Exception e) {
+                  throw new RuntimeException(e);
+                }
+              }));
+    }
+
+    for (Future<?> future : futures) {
+      future.get();
+    }
+    executor.shutdown();
+
+    Assertions.assertTrue(Files.isSymbolicLink(destFile.toPath()));
+    Assertions.assertEquals(srcFile.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+  }
+
+  @Test
+  public void testIdempotentSymlinkCreation() throws Exception {
+    File srcFile = new File(tempDir, "source_idempotent");
+    Assertions.assertTrue(srcFile.createNewFile());
+    File destFile = new File(tempDir, "dest_idempotent");
+
+    Configuration conf = new Configuration();
+    String uri = srcFile.toURI().toString();
+
+    FetchFileUtils.fetchFileFromUri(uri, destFile, 10, conf);
+    Assertions.assertTrue(Files.isSymbolicLink(destFile.toPath()));
+
+    // Second call to the same dest should succeed without error
+    FetchFileUtils.fetchFileFromUri(uri, destFile, 10, conf);
+    Assertions.assertTrue(Files.isSymbolicLink(destFile.toPath()));
+    Assertions.assertEquals(srcFile.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+  }
+
+  @Test
+  public void testSymlinkReplacedWithDifferentTarget() throws Exception {
+    File srcFileA = new File(tempDir, "source_a");
+    File srcFileB = new File(tempDir, "source_b");
+    Assertions.assertTrue(srcFileA.createNewFile());
+    Assertions.assertTrue(srcFileB.createNewFile());
+    File destFile = new File(tempDir, "dest_replace");
+
+    Configuration conf = new Configuration();
+
+    FetchFileUtils.fetchFileFromUri(srcFileA.toURI().toString(), destFile, 10, 
conf);
+    Assertions.assertEquals(srcFileA.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+
+    FetchFileUtils.fetchFileFromUri(srcFileB.toURI().toString(), destFile, 10, 
conf);
+    Assertions.assertEquals(srcFileB.toPath(), 
Files.readSymbolicLink(destFile.toPath()));
+  }
+
+  @Test
+  @Disabled("The network errors in CI maybe cause the test failed")
+  public void testDownloadFromHTTP() throws Exception {
+    File destFile = new File(tempDir, "dest_http");
+    String fileUrl = "https://downloads.apache.org/hadoop/common/KEYS";;
+    Configuration conf = new Configuration();

Review Comment:
   This test is permanently disabled and also depends on an external network 
endpoint (downloads.apache.org), so it doesn't provide reliable coverage and 
adds dead/maintenance-heavy retry logic to the unit test suite. Prefer removing 
it, or replacing it with a deterministic test (e.g., using a local in-process 
HTTP server) that can run in CI without @Disabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to