asheeshgarg commented on issue #6415: URL: https://github.com/apache/iceberg/issues/6415#issuecomment-1349350170
@nastra I filled in the missing bits.

In the schema defined in Iceberg, entity_status is Utf8:

Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Utf8,

This is the schema generated for batchRoot:

Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Int(32, true),

Other fields that are Utf8 in the table schema also come back as Int. Because of the type mismatch, reading throws an error like the one below. Spark is able to read the same data fine.

java.lang.IndexOutOfBoundsException: index: 0, length: 8388608 (expected: range(0, 15888))
    at org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
    at org.apache.arrow.memory.ArrowBuf.setBytes(ArrowBuf.java:955)
    at org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:451)
    at org.apache.arrow.vector.BaseFixedWidthVector.setValueCount(BaseFixedWidthVector.java:732)
    at org.apache.arrow.vector.VectorSchemaRoot.setRowCount(VectorSchemaRoot.java:240)
    at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:86)

Expected behavior: batches should return data with the schema types defined in Iceberg/Hive.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org