nsivabalan commented on code in PR #18132:
URL: https://github.com/apache/hudi/pull/18132#discussion_r3048824555


##########
hudi-common/src/main/java/org/apache/parquet/avro/HoodieAvroParquetReaderBuilder.java:
##########
@@ -67,13 +74,19 @@ public HoodieAvroParquetReaderBuilder<T> withCompatibility(boolean enableCompati
     return this;
   }
 
+  public HoodieAvroParquetReaderBuilder<T> withTableSchema(Schema tableSchema) {
+    this.tableSchema = tableSchema;
+    return this;
+  }
+
   @Override
   protected ReadSupport<T> getReadSupport() {
     if (isReflect) {
       conf.setBoolean(AvroReadSupport.AVRO_COMPATIBILITY, false);
     } else {
       conf.setBoolean(AvroReadSupport.AVRO_COMPATIBILITY, enableCompatibility);
     }
-    return new HoodieAvroReadSupport<>(model);
+    return new HoodieAvroReadSupport<>(model, Option.ofNullable(tableSchema).map(schema -> getAvroSchemaConverter(conf).convert(schema)),
Review Comment:
   If hadoopConf already carries the value for `hasLogicalTsField`, we can also avoid the additional call at L90.
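
   For reference, the `Option.ofNullable(tableSchema).map(...)` chain in the diff above only runs the schema conversion when a table schema was actually supplied to the builder. A minimal stand-alone sketch of that idiom, using `java.util.Optional` in place of Hudi's `Option` and a string-prefixing `convert` as a hypothetical stand-in for `getAvroSchemaConverter(conf).convert(schema)`:

```java
import java.util.Optional;

public class OptionalConvertDemo {
  // Hypothetical stand-in for getAvroSchemaConverter(conf).convert(schema).
  static String convert(String schema) {
    return "converted:" + schema;
  }

  public static void main(String[] args) {
    // No table schema supplied: the mapper never runs, so there is no NPE.
    Optional<String> absent =
        Optional.ofNullable((String) null).map(OptionalConvertDemo::convert);
    System.out.println(absent.isPresent()); // false

    // Table schema supplied: the conversion is applied inside map().
    Optional<String> present =
        Optional.ofNullable("record-schema").map(OptionalConvertDemo::convert);
    System.out.println(present.get()); // converted:record-schema
  }
}
```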



##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/table/action/commit/HoodieMergeHelper.java:
##########
@@ -86,7 +87,8 @@ public void runMerge(HoodieTable<?, ?, ?, ?> table,
     HoodieFileReader bootstrapFileReader = null;
 
     Schema writerSchema = mergeHandle.getWriterSchemaWithMetaFields();
-    Schema readerSchema = baseFileReader.getSchema();
+    Schema readerSchema = AvroSchemaUtils.getRepairedSchema(baseFileReader.getSchema(), writerSchema);

Review Comment:
   But why can't we add it to the hadoopConfiguration that's part of `table.getHadoopConf()` in the driver, and then fetch it from here?



##########
hudi-common/src/main/java/org/apache/hudi/io/storage/HoodieAvroParquetReader.java:
##########
@@ -154,23 +158,70 @@ private static Configuration tryOverrideDefaultConfigs(Configuration conf) {
     return conf;
   }
 
-  private ClosableIterator<IndexedRecord> getIndexedRecordIteratorInternal(Schema schema, Option<Schema> requestedSchema) throws IOException {
+  private ClosableIterator<IndexedRecord> getIndexedRecordIteratorInternal(Schema schema, Option<Schema> renamedColumns) throws IOException {
     // NOTE: We have to set both Avro read-schema and projection schema to make
     //       sure that in case the file-schema is not equal to read-schema we'd still
     //       be able to read that file (in case projection is a proper one)
-    if (!requestedSchema.isPresent()) {
+    Schema repairedFileSchema = getRepairedSchema(getSchema(), schema);

Review Comment:
   When we are instantiating the base file reader in L84 in HoodieMergeHelper, 
if we can embed a boolean flag in hadoopConf, we can fetch it again here and 
avoid repair calls for tables w/o any logical type. 
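
   A minimal sketch of the flag-in-configuration idea suggested here: compute `hasLogicalTsField` once on the driver, record it on the configuration passed to readers, and consult it before repairing. Hadoop's `Configuration` exposes `setBoolean`/`getBoolean` with exactly this shape; a map-backed stand-in is used below so the sketch is self-contained, and the property key is hypothetical, not an existing Hudi config:

```java
import java.util.HashMap;
import java.util.Map;

public class LogicalTsFlagSketch {
  // Hypothetical property key; not an existing Hudi config.
  static final String HAS_LOGICAL_TS_KEY = "hoodie.internal.read.has.logical.ts.field";

  // Map-backed stand-in for org.apache.hadoop.conf.Configuration's
  // setBoolean/getBoolean pair.
  static class Conf {
    private final Map<String, String> props = new HashMap<>();

    void setBoolean(String key, boolean value) {
      props.put(key, Boolean.toString(value));
    }

    boolean getBoolean(String key, boolean defaultValue) {
      String v = props.get(key);
      return v == null ? defaultValue : Boolean.parseBoolean(v);
    }
  }

  // Driver side (e.g. where the base file reader is instantiated): record the
  // answer once so readers need not re-derive it per file.
  static void recordFlag(Conf conf, boolean hasLogicalTsField) {
    conf.setBoolean(HAS_LOGICAL_TS_KEY, hasLogicalTsField);
  }

  // Reader side: default to true (i.e. do the repair) when the flag is
  // absent, so the optimization can never skip a needed repair.
  static boolean needsRepair(Conf conf) {
    return conf.getBoolean(HAS_LOGICAL_TS_KEY, true);
  }

  public static void main(String[] args) {
    Conf conf = new Conf();
    recordFlag(conf, false); // table has no logical timestamp field
    System.out.println(needsRepair(conf)); // false: repair calls are skipped
  }
}
```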



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
