Re: [PR] Construct a writer tree [iceberg-python]

via GitHub Mon, 09 Oct 2023 16:56:08 -0700


rdblue commented on code in PR #40:
URL: https://github.com/apache/iceberg-python/pull/40#discussion_r1351005623



##########
pyiceberg/avro/resolver.py:
##########
@@ -192,7 +195,26 @@ def visit_binary(self, binary_type: BinaryType) -> Writer:
         return BinaryWriter()
 
 
-def resolve(
+CONSTRUCT_WRITER_VISITOR = ConstructWriter()
+
+
+def resolve_writer(
+    struct_schema: Union[Schema, IcebergType],
+    write_schema: Union[Schema, IcebergType],
+) -> Writer:
+    """Resolve the file and read schema to produce a reader.
+
+    Args:
+        struct_schema (Schema | IcebergType): The schema of the Avro file.
+        write_schema (Schema | IcebergType): The requested read schema which is equal, subset or superset of the file schema.

Review Comment:
   @Fokko, I think the names are still confusing here. When I see `data_schema` 
I would expect it to be the schema of the data that is being written. And also 
`write_schema` makes me think of Avro, where a "write schema" is typically the 
file schema.
   
   Since this is a new concept (writing records into another schema by ignoring 
some columns), I think it would be good to have really clear names. I'd say 
`file_schema` and `data_schema`, but if you came to the opposite conclusion 
about the meaning of `data_schema`, then maybe we should have `record_schema` 
and `file_schema`? Or possibly `inmemory_schema` and `file_schema`?
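
   To make the naming question concrete, here is a minimal sketch of the concept using the `record_schema` / `file_schema` pair. It is not the code in this PR and does not use pyiceberg's types: the `Schema` alias (a list of `(field_id, name)` pairs), this toy `resolve_writer`, and the sample fields are all assumptions made up for illustration. It only shows the idea of writing an in-memory record into a narrower file schema by ignoring the columns the file does not have.

```python
from typing import Any, Callable, List, Sequence, Tuple

# Hypothetical, simplified schema: an ordered list of (field_id, name) pairs.
# This is NOT pyiceberg's Schema class.
Schema = List[Tuple[int, str]]


def resolve_writer(
    record_schema: Schema,
    file_schema: Schema,
) -> Callable[[Sequence[Any]], List[Any]]:
    """Build a function that projects an in-memory record onto the file schema.

    record_schema describes where each value sits in the in-memory record.
    file_schema describes the fields that are actually written, in file order.
    Fields that exist only in record_schema are ignored.
    """
    # Map each field ID to its position in the in-memory record.
    position_by_id = {field_id: pos for pos, (field_id, _) in enumerate(record_schema)}

    positions = []
    for field_id, name in file_schema:
        if field_id not in position_by_id:
            raise ValueError(
                f"File field {name!r} (id {field_id}) is missing from the record schema"
            )
        positions.append(position_by_id[field_id])

    def write(record: Sequence[Any]) -> List[Any]:
        # Pick the values in file order; a real writer would encode them instead.
        return [record[pos] for pos in positions]

    return write


# The in-memory record carries a field that the file does not store.
record_schema = [(1, "id"), (2, "name"), (3, "internal_flag")]
file_schema = [(2, "name"), (1, "id")]

write = resolve_writer(record_schema, file_schema)
row = write([42, "alice", True])  # "internal_flag" is dropped
```

   With these names, the argument that describes the caller's objects and the argument that describes what lands on disk are hard to mix up, which is the ambiguity the current pair runs into.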



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


