Copilot commented on code in PR #10826:
URL: https://github.com/apache/gravitino/pull/10826#discussion_r3129115493


##########
spark-connector/spark-common/src/main/java/org/apache/gravitino/spark/connector/catalog/BaseCatalog.java:
##########
@@ -258,6 +278,15 @@ public Table loadTable(Identifier ident) throws 
NoSuchTableException {
     }
   }
 
+  @VisibleForTesting
+  static boolean isBuiltinDataSourceReference(Identifier ident) {
+    String[] namespace = ident.namespace();
+    if (namespace.length != 1) {
+      return false;
+    }
+    return 
BUILTIN_DATASOURCE_FORMATS.contains(namespace[0].toLowerCase(Locale.ROOT));
+  }
+

Review Comment:
   `isBuiltinDataSourceReference` only checks whether the *schema* namespace 
equals a built-in format name. Since this catalog treats `ident.namespace()[0]` 
as the Gravitino schema (see `getDatabase()`/`validateNamespace()`), any 
legitimate schema named `parquet`/`csv`/… would now become unreachable: 
`loadTable` will always throw `NoSuchTableException` and `tableExists` will 
always return false, potentially misrouting `SELECT * FROM parquet.t` into 
Spark’s DataSource shortcut resolution. Consider tightening the predicate 
(e.g., also require the identifier name to look like a path/URI) or checking 
whether a Gravitino schema with that name exists before short-circuiting.
   ```suggestion
   
       return 
BUILTIN_DATASOURCE_FORMATS.contains(namespace[0].toLowerCase(Locale.ROOT))
           && looksLikePathOrUri(ident.name());
     }
   
     private static boolean looksLikePathOrUri(String name) {
       if (StringUtils.isBlank(name)) {
         return false;
       }
   
       return name.contains("/")
           || name.contains("\\")
           || name.startsWith("./")
           || name.startsWith("../")
           || hasUriScheme(name);
     }
   
     private static boolean hasUriScheme(String value) {
       int colonIndex = value.indexOf(':');
       if (colonIndex <= 0) {
         return false;
       }
   
       for (int index = 0; index < colonIndex; index++) {
         char current = value.charAt(index);
         if (index == 0) {
           if (!Character.isLetter(current)) {
             return false;
           }
         } else if (!Character.isLetterOrDigit(current)
             && current != '+'
             && current != '-'
             && current != '.') {
           return false;
         }
       }
   
       return true;
     }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to