yihua commented on code in PR #18481:
URL: https://github.com/apache/hudi/pull/18481#discussion_r3050513758


##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java:
##########
@@ -637,6 +637,14 @@ private Table instantiateHiveTable(ObjectPath tablePath, 
CatalogBaseTable table,
     }
 
     HoodieFileFormat baseFileFormat = HoodieFileFormat.PARQUET;
+    String baseFileFormatStr = 
properties.get(HoodieTableConfig.BASE_FILE_FORMAT.key());
+    if (baseFileFormatStr != null) {
+      HoodieFileFormat configuredFormat = 
HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase());
+      if (configuredFormat == HoodieFileFormat.LANCE) {

Review Comment:
   🤖 `HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase())` will throw an 
`IllegalArgumentException` if the property contains an unrecognized value. 
Could you wrap this in a try-catch or use a safer lookup, so the error message 
is user-friendly instead of a raw enum parse failure?
   
   🤖 *This review comment was generated by an AI agent and may contain 
mistakes. Please verify any suggestions before applying.*



##########
hudi-flink-datasource/hudi-flink/src/main/java/org/apache/hudi/table/catalog/HoodieHiveCatalog.java:
##########
@@ -637,6 +637,14 @@ private Table instantiateHiveTable(ObjectPath tablePath, 
CatalogBaseTable table,
     }
 
     HoodieFileFormat baseFileFormat = HoodieFileFormat.PARQUET;
+    String baseFileFormatStr = 
properties.get(HoodieTableConfig.BASE_FILE_FORMAT.key());
+    if (baseFileFormatStr != null) {
+      HoodieFileFormat configuredFormat = 
HoodieFileFormat.valueOf(baseFileFormatStr.toUpperCase());
+      if (configuredFormat == HoodieFileFormat.LANCE) {
+        throw new 
HoodieValidationException(HoodieFileFormat.LANCE_SPARK_ONLY_ERROR_MSG);
+      }
+      baseFileFormat = configuredFormat;

Review Comment:
   🤖 This line `baseFileFormat = configuredFormat;` changes behavior beyond the 
Lance guard — for ORC or HFile tables, the `baseFileFormat` was previously 
always PARQUET (hardcoded on the line above), which was used to resolve 
InputFormat, OutputFormat, and SerDe class names. Now those tables will get 
their actual format's classes. This might be a desirable fix, but it's a silent 
behavior change that could break Hive catalog registration for non-Parquet 
tables. Was this intentional? If the goal here is only the Lance guard, I'd 
suggest keeping the existing default-to-PARQUET behavior and only adding the 
Lance rejection.
   
   🤖 *This review comment was generated by an AI agent and may contain 
mistakes. Please verify any suggestions before applying.*



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to