anuragmantri commented on issue #10635: URL: https://github.com/apache/iceberg/issues/10635#issuecomment-2265778151
I walked through the code and was also able to reproduce this issue for Parquet writes with a test.

```
java.lang.IllegalArgumentException: Invalid UUID string: d��Iu���>�M�`
	at java.base/java.util.UUID.fromString(Unknown Source)
	at org.apache.iceberg.spark.data.SparkParquetWriters$UUIDWriter.write(SparkParquetWriters.java:426)
	at org.apache.iceberg.spark.data.SparkParquetWriters$UUIDWriter.write(SparkParquetWriters.java:411)
	at org.apache.iceberg.parquet.ParquetValueWriters$StructWriter.write(ParquetValueWriters.java:581)
	at org.apache.iceberg.parquet.ParquetWriter.add(ParquetWriter.java:135)
```

It looks like the visitor incorrectly casts the byte array to a string because of our conversion to Spark types [here](https://github.com/apache/iceberg/blob/af75440da8d6e7b509b3197251c965543017b015/spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/TypeToSparkType.java#L111). Should we do this casting correctly at a higher level than `SparkParquetWriters`?

@RussellSpitzer @nastra
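
To illustrate the failure mode in the stack trace, here is a minimal standalone sketch using only the JDK. It is not Iceberg code: the class `UuidBytesSketch` and its helper methods are hypothetical names made up for this example, and it assumes the value reaching the writer is the 16-byte binary form of the UUID rather than its 36-character text form.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical illustration only; none of these names exist in Iceberg.
public class UuidBytesSketch {

  // Encode a UUID as 16 big-endian bytes (most significant long first).
  public static byte[] toBytes(UUID uuid) {
    return ByteBuffer.allocate(16)
        .putLong(uuid.getMostSignificantBits())
        .putLong(uuid.getLeastSignificantBits())
        .array();
  }

  // Decode 16 big-endian bytes back into a UUID without going through a String.
  public static UUID fromBytes(byte[] bytes) {
    if (bytes.length != 16) {
      throw new IllegalArgumentException("Expected 16 bytes, got " + bytes.length);
    }
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    return new UUID(buffer.getLong(), buffer.getLong());
  }

  // Mimics the suspected bug: treat the raw bytes as text and parse that text.
  public static boolean parsesAsUuidString(byte[] bytes) {
    try {
      UUID.fromString(new String(bytes, StandardCharsets.UTF_8));
      return true;
    } catch (IllegalArgumentException e) {
      // Same exception type as in the stack trace above.
      return false;
    }
  }

  public static void main(String[] args) {
    UUID original = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
    byte[] raw = toBytes(original);

    // Raw bytes reinterpreted as a string are not a valid UUID string.
    System.out.println("bytes parsed as string: " + parsesAsUuidString(raw));

    // Decoding the bytes directly round-trips the original value.
    System.out.println("bytes decoded directly: " + fromBytes(raw).equals(original));
  }
}
```

If the assumption holds, the fix would be to decide between the binary and string representations before the value reaches `UUIDWriter`, which is what the question about casting at a higher level is getting at.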