romanstreamsets commented on issue #6796: URL: https://github.com/apache/iceberg/issues/6796#issuecomment-1425523063
Hi @Fokko That blog is exactly where I got my initial code. I have since changed the source of my schema from "manual" to an Avro schema supplied as a String, because that's my use case.

Something I have found since posting this issue: as you reproduced, converting the Avro schema yields an Iceberg schema whose field IDs are numbered starting from 0, regardless of whether the fields are required or optional. However, when I call the `catalog.createTable(..., icebergSchema, ...)` method, the field IDs in the resulting table schema are numbered from 1. When I run `CREATE TABLE` in Spark/Hive, the schema is likewise created with field IDs starting from 1.

So, my current workaround is:

```java
avroConvertedSchema = AvroSchemaUtil.convert(avroSchema);
table = catalog.createTable(..., avroConvertedSchema, ...);
icebergSchema = table.schema();
... // then I use icebergSchema when writing records to the file
```

In this case, `avroConvertedSchema` is not the same as `icebergSchema`, differing exactly in the field ID numbering. So the `convert()` method should number IDs from 1.
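For reference, here is a minimal, self-contained sketch of that workaround in Java. The catalog instance, table identifier, and Avro schema string are placeholders for whatever your setup provides, and I'm using `AvroSchemaUtil.toIceberg(...)` as the Avro-to-Iceberg entry point, which may not be the exact `convert(...)` overload from the snippet above:

```java
import org.apache.iceberg.Schema;
import org.apache.iceberg.Table;
import org.apache.iceberg.avro.AvroSchemaUtil;
import org.apache.iceberg.catalog.Catalog;
import org.apache.iceberg.catalog.TableIdentifier;

public class AvroToIcebergExample {

  static Table createFromAvro(Catalog catalog, String avroSchemaJson) {
    // Parse the Avro schema supplied as a String (the use case in this issue).
    org.apache.avro.Schema avroSchema =
        new org.apache.avro.Schema.Parser().parse(avroSchemaJson);

    // Convert Avro -> Iceberg. The converted schema numbers field IDs from 0.
    Schema converted = AvroSchemaUtil.toIceberg(avroSchema);
    converted.columns().forEach(f -> System.out.println(f.fieldId() + " " + f.name()));

    // createTable reassigns fresh field IDs starting from 1, so read the
    // schema back from the created table rather than reusing `converted`.
    // "db" and "events" are placeholder names for this sketch.
    Table table = catalog.createTable(TableIdentifier.of("db", "events"), converted);
    Schema icebergSchema = table.schema();
    icebergSchema.columns().forEach(f -> System.out.println(f.fieldId() + " " + f.name()));

    // Use icebergSchema (not `converted`) when writing records to data files.
    return table;
  }
}
```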