asheeshgarg commented on issue #6415:
URL: https://github.com/apache/iceberg/issues/6415#issuecomment-1349350170
@nastra I have filled in the missing bits.
In the schema defined in Iceberg, entity_status is Utf8:
Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Utf8,
This is the schema generated by batchRoot:
Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Int(32, true),
Other fields that are Utf8 in Iceberg also come back as Int. Because of this
type mismatch, loading the batch throws the error below. Spark is able to read
the same data fine.
java.lang.IndexOutOfBoundsException: index: 0, length: 8388608 (expected: range(0, 15888))
    at org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
    at org.apache.arrow.memory.ArrowBuf.setBytes(ArrowBuf.java:955)
    at org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:451)
    at org.apache.arrow.vector.BaseFixedWidthVector.setValueCount(BaseFixedWidthVector.java:732)
    at org.apache.arrow.vector.VectorSchemaRoot.setRowCount(VectorSchemaRoot.java:240)
    at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:86)
Expected behavior: batches should return data with the column types defined in
the Iceberg/Hive schema.
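
To make the mismatch above easier to spot before a batch is loaded, the two
schemas can be compared field by field. The following is only a minimal sketch:
it represents each schema as a plain map from field name to a type string
rather than using the Arrow or Iceberg schema classes, and the SchemaDiff class
and findMismatches method are hypothetical names introduced for illustration.
Only the three fields visible in the truncated schemas above are used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Iceberg or Arrow.
class SchemaDiff {
    // Returns one message per expected field whose type in the batch schema
    // is missing or differs from the expected type.
    static List<String> findMismatches(Map<String, String> expected,
                                       Map<String, String> actual) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, String> field : expected.entrySet()) {
            String actualType = actual.get(field.getKey());
            if (actualType == null) {
                mismatches.add(field.getKey() + ": missing from batch schema");
            } else if (!actualType.equals(field.getValue())) {
                mismatches.add(field.getKey() + ": expected " + field.getValue()
                        + ", got " + actualType);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Types as declared in the Iceberg table schema.
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("entity_id", "Utf8");
        expected.put("entity_name", "Utf8");
        expected.put("entity_status", "Utf8");

        // Types as reported by the batch root schema.
        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("entity_id", "Utf8");
        actual.put("entity_name", "Utf8");
        actual.put("entity_status", "Int(32, true)");

        // Prints: entity_status: expected Utf8, got Int(32, true)
        findMismatches(expected, actual).forEach(System.out::println);
    }
}
```

With real Arrow schemas, the same comparison could be driven from the field
names and types of each schema instead of hand-built maps.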
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]