Copilot commented on code in PR #10826:
URL: https://github.com/apache/gravitino/pull/10826#discussion_r3129115493
##########
spark-connector/spark-common/src/main/java/org/apache/gravitino/spark/connector/catalog/BaseCatalog.java:
##########
@@ -258,6 +278,15 @@ public Table loadTable(Identifier ident) throws
NoSuchTableException {
}
}
+ @VisibleForTesting
+ static boolean isBuiltinDataSourceReference(Identifier ident) {
+ String[] namespace = ident.namespace();
+ if (namespace.length != 1) {
+ return false;
+ }
+ return
BUILTIN_DATASOURCE_FORMATS.contains(namespace[0].toLowerCase(Locale.ROOT));
+ }
+
Review Comment:
`isBuiltinDataSourceReference` only checks whether the *schema* namespace
equals a built-in format name. Since this catalog treats `ident.namespace()[0]`
as the Gravitino schema (see `getDatabase()`/`validateNamespace()`), any
legitimate schema named `parquet`/`csv`/… would now become unreachable:
`loadTable` will always throw `NoSuchTableException` and `tableExists` will
always return false, potentially misrouting `SELECT * FROM parquet.t` into
Spark’s DataSource shortcut resolution. Consider tightening the predicate
(e.g., also require the identifier name to look like a path/URI) or checking
whether a Gravitino schema with that name exists before short-circuiting.
```suggestion
return
BUILTIN_DATASOURCE_FORMATS.contains(namespace[0].toLowerCase(Locale.ROOT))
&& looksLikePathOrUri(ident.name());
}
private static boolean looksLikePathOrUri(String name) {
if (StringUtils.isBlank(name)) {
return false;
}
return name.contains("/")
|| name.contains("\\")
|| name.startsWith("./")
|| name.startsWith("../")
|| hasUriScheme(name);
}
private static boolean hasUriScheme(String value) {
int colonIndex = value.indexOf(':');
if (colonIndex <= 0) {
return false;
}
for (int index = 0; index < colonIndex; index++) {
char current = value.charAt(index);
if (index == 0) {
if (!Character.isLetter(current)) {
return false;
}
} else if (!Character.isLetterOrDigit(current)
&& current != '+'
&& current != '-'
&& current != '.') {
return false;
}
}
return true;
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]