ottomata commented on PR #12424: URL: https://github.com/apache/iceberg/pull/12424#issuecomment-2839035884
Yes!!! I'm excited about this feature!

> Steven wondered whether Iceberg's Schema is the best schema input format for the user. He suggested Flink's RowType instead.

This would be nice for the Wikimedia Foundation. We use JSONSchema and have tooling to [automatically convert](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities/src/main/java/org/wikimedia/eventutilities/core/event/types/JsonSchemaConverter.java#101) to Flink [TypeInformation](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/TypeInformationSchemaConversions.java) (and also to Table API [DataTypes](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/DataTypeSchemaConversions.java)).

We do something similar to what you are all describing here when [creating and evolving Hive Parquet tables with Spark](https://github.com/wikimedia/analytics-refinery-source/blob/master/refinery-spark/src/main/scala/org/wikimedia/analytics/refinery/spark/sql/TableSchemaManager.scala#L35-L82). We provide [conversions from JSONSchema](https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-spark/src/main/java/org/wikimedia/eventutilities/spark/sql/DataTypeSchemaConversions.java) to Spark's StructType (analogous to Flink's RowType), and then use the StructType to determine how to create or evolve the table via DDL statements. This lets us automate ingestion from JSON streams in Kafka into various downstream systems, using our managed event JSONSchema registry as the canonical source of data schemas.

> Avro is a common input schema type

I might be ignorant here, but I think that if you support RowType, you will also get Avro support?
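To make the JSONSchema-to-table-DDL idea concrete, here is a toy Python sketch of the general technique. This is not Wikimedia's actual Java/Scala tooling linked above (which handles formats, decimals, maps, evolution, and more); the type mapping and function names here are illustrative assumptions only.

```python
import json

# Toy mapping from JSONSchema primitive types to Spark SQL DDL type names.
# Illustrative only; real converters must also handle formats, nullability,
# decimals, maps, and schema evolution.
JSONSCHEMA_TO_SPARK = {
    "string": "STRING",
    "integer": "BIGINT",
    "number": "DOUBLE",
    "boolean": "BOOLEAN",
}

def jsonschema_to_spark_type(schema: dict) -> str:
    """Recursively convert a JSONSchema type to a Spark SQL DDL type string."""
    t = schema["type"]
    if t == "object":
        fields = ", ".join(
            f"{name}: {jsonschema_to_spark_type(sub)}"
            for name, sub in schema.get("properties", {}).items()
        )
        return f"STRUCT<{fields}>"
    if t == "array":
        return f"ARRAY<{jsonschema_to_spark_type(schema['items'])}>"
    return JSONSCHEMA_TO_SPARK[t]

def jsonschema_to_ddl_columns(schema: dict) -> str:
    """Render a top-level object schema as CREATE TABLE column definitions."""
    return ", ".join(
        f"{name} {jsonschema_to_spark_type(sub)}"
        for name, sub in schema["properties"].items()
    )

event_schema = json.loads("""
{
  "type": "object",
  "properties": {
    "id": {"type": "string"},
    "count": {"type": "integer"},
    "tags": {"type": "array", "items": {"type": "string"}}
  }
}
""")

print(jsonschema_to_ddl_columns(event_schema))
# -> id STRING, count BIGINT, tags ARRAY<STRING>
```

The same walk over the schema tree can target any engine's type system, which is why a single canonical schema registry can feed several downstream systems.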
I haven't used it, but Flink has a built-in [schema converter from Avro to TypeInformation](https://github.com/apache/flink/blob/master/flink-formats/flink-avro/src/main/java/org/apache/flink/formats/avro/typeutils/AvroSchemaConverter.java)?
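The reason RowType support would effectively cover Avro too is that an Avro record schema converts mechanically into a row type, field by field. A toy Python sketch of that walk (the real converter is Flink's `AvroSchemaConverter` in flink-avro; the mapping and output format below are illustrative assumptions, not Flink's actual output):

```python
import json

# Toy mapping from Avro primitive type names to Flink-style SQL type names.
# Illustrative only; Flink's AvroSchemaConverter also handles unions,
# logical types, maps, enums, and fixed types.
AVRO_TO_FLINK = {
    "string": "STRING",
    "long": "BIGINT",
    "int": "INT",
    "double": "DOUBLE",
    "boolean": "BOOLEAN",
}

def avro_to_flink_type(schema) -> str:
    """Recursively convert an Avro schema (parsed JSON) to a row type string."""
    if isinstance(schema, str):  # bare primitive, e.g. "string"
        return AVRO_TO_FLINK[schema]
    if schema["type"] == "record":
        fields = ", ".join(
            f"{f['name']} {avro_to_flink_type(f['type'])}"
            for f in schema["fields"]
        )
        return f"ROW<{fields}>"
    if schema["type"] == "array":
        return f"ARRAY<{avro_to_flink_type(schema['items'])}>"
    return AVRO_TO_FLINK[schema["type"]]

avro_schema = json.loads("""
{
  "type": "record",
  "name": "Event",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "count", "type": "long"}
  ]
}
""")

print(avro_to_flink_type(avro_schema))
# -> ROW<id STRING, count BIGINT>
```

So any schema language that converts into a row/struct type gets table creation and evolution "for free" once the feature accepts RowType as input.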