pvary commented on code in PR #14040:
URL: https://github.com/apache/iceberg/pull/14040#discussion_r2348289165


##########
parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java:
##########
@@ -1281,12 +1304,24 @@ public ReadBuilder withNameMapping(NameMapping newNameMapping) {
 
     @Override
     public ReadBuilder setRootType(Class<? extends StructLike> rootClass) {
-      throw new UnsupportedOperationException("Custom types are not yet supported");
+      Preconditions.checkArgument(
+          this.internalReader != null, "Cannot set root type without using an Internal Reader");
+      Preconditions.checkArgument(
+          this.readerFunc == null && this.readerFuncWithSchema == null,
+          "Setting root type is not compatible with setting a reader function");
+      internalReader.setRootType(rootClass);

Review Comment:
   Oh, I thought I had already answered this one. Sorry for the late reply.
   
   > I was wondering if there is a practical difference between making a TriFunction and just using the InternalReader.
   
   In my File Format API PR, I need to push some of the builder parameters to the engine-specific code so that the FormatModel can create the correct readerFunction/writerFunction. Examples include the Iceberg schema, the file schema, the engine schema, the constant fields (the former idToConstant map), and the delete filters (which are needed for creating vectorized Parquet readers).
   
   In all of these cases, I stick to the following pattern:
   - Collect the input in the builder
   - In the build method, decide which function to use based on the parameters
   - Call that function with the collected parameters
   
   This pattern works well when we have multiple possible read/write functions that share some overlapping parameters.
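   
   As a minimal sketch of the collect/decide/call pattern (all class, method, and field names here are hypothetical, and schemas are reduced to plain strings; this is not the actual `Parquet.ReadBuilder` code):
   
   ```java
   import java.util.Collections;
   import java.util.Map;
   import java.util.function.Function;

   // Hypothetical stand-in for a read builder, for illustration only.
   class ReadBuilderSketch {
     // Hypothetical three-argument function type (java.util.function has none).
     interface TriFunction<A, B, C, R> {
       R apply(A a, B b, C c);
     }

     // Step 1: every input is collected in the builder.
     private String icebergSchema = "iceberg-schema";
     private String fileSchema = "file-schema";
     private Map<Integer, Object> constants = Collections.emptyMap();
     private Function<String, String> readerFunc;
     private TriFunction<String, String, Map<Integer, Object>, String> readerFuncWithConstants;

     ReadBuilderSketch project(String schema) {
       this.icebergSchema = schema;
       return this;
     }

     ReadBuilderSketch constants(Map<Integer, Object> newConstants) {
       this.constants = newConstants;
       return this;
     }

     ReadBuilderSketch createReaderFunc(Function<String, String> func) {
       this.readerFunc = func;
       return this;
     }

     ReadBuilderSketch createReaderFuncWithConstants(
         TriFunction<String, String, Map<Integer, Object>, String> func) {
       this.readerFuncWithConstants = func;
       return this;
     }

     String build() {
       // Step 2: decide which function to use based on what was set.
       if (readerFuncWithConstants != null) {
         // Step 3: call it with the collected parameters.
         return readerFuncWithConstants.apply(icebergSchema, fileSchema, constants);
       }
       if (readerFunc != null) {
         return readerFunc.apply(fileSchema);
       }
       throw new IllegalStateException("No reader function was set");
     }
   }
   ```
   
   Both functions draw on the same collected fields, so adding a new function variant only touches `build()` in this sketch.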
   
   If we follow the `.useInternalReader(...)` pattern everywhere, we need to split the parameters into two groups:
   - Common parameters - these are the parameters passed to the reader/writer functions
   - Specific parameters - these are collected by the reader/writer objects, and the builder method is only a wrapper around the InternalReader-type object.
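   
   For contrast, a rough sketch of the wrapper model with the two parameter groups (hypothetical names again; it only loosely mirrors the `setRootType` check in the diff above and is not the real InternalReader API):
   
   ```java
   // Hypothetical reader object that collects its own "specific" parameters.
   class InternalReaderSketch {
     private Class<?> rootType = Object.class;

     void setRootType(Class<?> newRootType) {
       this.rootType = newRootType;
     }

     // Receives only the "common" parameters from the builder.
     String create(String fileSchema) {
       return "internal(" + fileSchema + "," + rootType.getSimpleName() + ")";
     }
   }

   // Hypothetical builder whose setter is only a wrapper around the reader object.
   class WrapperBuilderSketch {
     private final String fileSchema = "file-schema"; // common parameter
     private InternalReaderSketch internalReader;

     WrapperBuilderSketch useInternalReader(InternalReaderSketch reader) {
       this.internalReader = reader;
       return this;
     }

     WrapperBuilderSketch setRootType(Class<?> rootClass) {
       if (internalReader == null) {
         throw new IllegalArgumentException(
             "Cannot set root type without using an Internal Reader");
       }
       // Specific parameter: stored on the reader object, not in the builder.
       internalReader.setRootType(rootClass);
       return this;
     }

     String build() {
       if (internalReader == null) {
         throw new IllegalStateException("No internal reader was set");
       }
       return internalReader.create(fileSchema);
     }
   }
   ```
   
   In this sketch the builder no longer owns the root type, so state is spread across two objects and the order of calls starts to matter.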
   
   I would rather stick to a single model.
   
   WDYT?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]