yihua commented on code in PR #18481:
URL: https://github.com/apache/hudi/pull/18481#discussion_r3050513758
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java:
##########
@@ -637,6 +637,14 @@ private Table instantiateHiveTable(ObjectPath tablePath,
CatalogBaseTable table,
}
HoodieFileFormat baseFileFormat = HoodieFileFormat.PARQUET;
+ String baseFileFormatStr =
properties.get(HoodieTableConfig.BASE_FILE_FORMAT.key());
+ if (baseFileFormatStr != null) {
+ HoodieFileFormat configuredFormat =
HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase());
+ if (configuredFormat == HoodieFileFormat.LANCE) {
Review Comment:
🤖 `HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase())` will throw an
`IllegalArgumentException` if the property contains an unrecognized value.
Could you wrap this in a try-catch or use a safer lookup, so the error message
is user-friendly instead of a raw enum parse failure?
🤖 *This review comment was generated by an AI agent and may contain
mistakes. Please verify any suggestions before applying.*
##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java:
##########
@@ -637,6 +637,14 @@ private Table instantiateHiveTable(ObjectPath tablePath,
CatalogBaseTable table,
}
HoodieFileFormat baseFileFormat = HoodieFileFormat.PARQUET;
+ String baseFileFormatStr =
properties.get(HoodieTableConfig.BASE_FILE_FORMAT.key());
+ if (baseFileFormatStr != null) {
+ HoodieFileFormat configuredFormat =
HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase());
+ if (configuredFormat == HoodieFileFormat.LANCE) {
+ throw new
HoodieValidationException(HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG);
+ }
+ baseFileFormat = configuredFormat;
Review Comment:
🤖 This line `baseFileFormat = configuredFormat;` changes behavior beyond the
Lance guard — for ORC or HFile tables, the `baseFileFormat` was previously
always PARQUET (hardcoded on the line above), which was used to resolve
InputFormat, OutputFormat, and SerDe class names. Now those tables will get
their actual format's classes. This might be a desirable fix, but it's a silent
behavior change that could break Hive catalog registration for non-Parquet
tables. Was this intentional? If the goal here is only the Lance guard, I'd
suggest keeping the existing default-to-PARQUET behavior and only adding the
Lance rejection.
🤖 *This review comment was generated by an AI agent and may contain
mistakes. Please verify any suggestions before applying.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]