ottomata commented on PR #12424:
URL: https://github.com/apache/iceberg/pull/12424#issuecomment-2839035884

   Yes!!!  I'm excited about this feature!
   
   > Steven wondered whether Iceberg's Schema is the best schema input format 
for the user. He suggested Flink's RowType instead.
   
   This would be nice for the Wikimedia Foundation. We use JSONSchema, and we have tooling to [automatically convert](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities/src/main/java/org/wikimedia/eventutilities/core/event/types/JsonSchemaConverter.java#101) JSONSchemas to Flink [TypeInformation](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/TypeInformationSchemaConversions.java) (and also to Table API [DataTypes](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/DataTypeSchemaConversions.java)).
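   For a sense of what that kind of conversion does, here is a toy sketch (hypothetical code, not the actual wikimedia-event-utilities API) that renders a small JSONSchema fragment, given as nested Maps, as a Flink SQL type string:

```java
import java.util.Map;
import java.util.StringJoiner;

/**
 * Toy sketch only: render a JSONSchema fragment as a Flink SQL type
 * string. The real converter walks the full schema tree (arrays,
 * formats, unions, etc.); this handles "object" and a few primitives.
 */
public class JsonSchemaToFlinkType {
    @SuppressWarnings("unchecked")
    public static String render(Map<String, Object> schema) {
        String type = (String) schema.get("type");
        switch (type) {
            case "string":  return "STRING";
            case "integer": return "BIGINT";
            case "number":  return "DOUBLE";
            case "boolean": return "BOOLEAN";
            case "object":
                // Recurse into properties, building a ROW<...> type.
                Map<String, Object> props =
                    (Map<String, Object>) schema.get("properties");
                StringJoiner fields = new StringJoiner(", ", "ROW<", ">");
                props.forEach((name, sub) ->
                    fields.add(name + " " + render((Map<String, Object>) sub)));
                return fields.toString();
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }
}
```

   The interesting part is that once you can target a row type like this, the same walk works for any schema language with records and primitives.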
   
   We do something similar to what you are all describing here when [creating and evolving Hive Parquet tables with Spark](https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-spark/src/main/scala/org/wikimedia/analytics/refinery/spark/sql/TableSchemaManager.scala#L35-L82). We provide [conversions from JSONSchema](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-spark/src/main/java/org/wikimedia/eventutilities/spark/sql/DataTypeSchemaConversions.java) to Spark StructType (analogous to Flink's RowType), and then use the StructType to determine how to create or evolve the table via DDL statements.
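   As a toy sketch of the evolve step (hypothetical code; the actual TableSchemaManager logic linked above handles much more, like type changes and nested fields), you can diff the incoming field set against the table's current fields and emit `ALTER TABLE ... ADD COLUMNS` DDL for anything new:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/**
 * Toy sketch only: generate ADD COLUMNS DDL for fields present in the
 * incoming schema but missing from the existing table schema.
 * Maps are column name -> SQL type string.
 */
public class SchemaEvolutionSketch {
    public static String alterTableAddColumns(
            String table,
            Map<String, String> existing,
            Map<String, String> incoming) {
        // Collect fields that exist in the incoming schema only.
        Map<String, String> added = new LinkedHashMap<>();
        incoming.forEach((name, type) -> {
            if (!existing.containsKey(name)) {
                added.put(name, type);
            }
        });
        if (added.isEmpty()) {
            return null; // schemas already match; nothing to evolve
        }
        StringJoiner cols = new StringJoiner(", ");
        added.forEach((name, type) -> cols.add(name + " " + type));
        return "ALTER TABLE " + table + " ADD COLUMNS (" + cols + ")";
    }
}
```

   The real version also has to decide what to do about removed or retyped fields, which is where most of the policy lives.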
   
   This allows us to automate ingestion into various downstream systems from 
JSON streams in Kafka by using our managed event JSONSchema registry as the 
canonical data schema source.
   
   > Avro is a common input schema type
   
   I might be ignorant here, but I think that if you support RowType, you will also get Avro support? I haven't used it, but Flink has a built-in [schema converter from Avro to TypeInformation](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java)?
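   To illustrate why I'd expect that to fall out for free: Avro's primitive types land in the same Flink SQL type space that a JSONSchema conversion targets. A toy mapping (illustrative only, not Flink's actual AvroSchemaConverter):

```java
import java.util.Map;

/**
 * Toy sketch only: Avro primitive type names mapped to Flink SQL type
 * names. Records and logical types are what a real converter handles.
 */
public class AvroTypeSketch {
    private static final Map<String, String> PRIMITIVES = Map.of(
        "string",  "STRING",
        "int",     "INT",
        "long",    "BIGINT",
        "float",   "FLOAT",
        "double",  "DOUBLE",
        "boolean", "BOOLEAN",
        "bytes",   "BYTES"
    );

    public static String toFlinkSqlType(String avroType) {
        String flinkType = PRIMITIVES.get(avroType);
        if (flinkType == null) {
            throw new IllegalArgumentException("Unsupported Avro type: " + avroType);
        }
        return flinkType;
    }
}
```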
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

