[GitHub] [iceberg] asheeshgarg commented on issue #6415: Vectorized Read Issue

GitBox Tue, 13 Dec 2022 10:10:38 -0800


asheeshgarg commented on issue #6415:
URL: https://github.com/apache/iceberg/issues/6415#issuecomment-1349350170


   @nastra I have filled in the missing bits.
   In the schema defined in Iceberg, entity_status is Utf8:
   Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Utf8, 
   This is the schema generated for batchRoot:
   Schema<entity_id: Utf8, entity_name: Utf8, entity_status: Int(32, true), 
   
   Other fields that are Utf8 in the Iceberg schema also come back as Int. 
Because of this type mismatch, the read throws an error like the one below. 
Spark is able to read the same data fine.
   
   java.lang.IndexOutOfBoundsException: index: 0, length: 8388608 (expected: range(0, 15888))
           at org.apache.arrow.memory.ArrowBuf.checkIndex(ArrowBuf.java:701)
           at org.apache.arrow.memory.ArrowBuf.setBytes(ArrowBuf.java:955)
           at org.apache.arrow.vector.BaseFixedWidthVector.reAlloc(BaseFixedWidthVector.java:451)
           at org.apache.arrow.vector.BaseFixedWidthVector.setValueCount(BaseFixedWidthVector.java:732)
           at org.apache.arrow.vector.VectorSchemaRoot.setRowCount(VectorSchemaRoot.java:240)
           at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:86)
   
   Expected behavior: the returned batches should carry the column types 
defined in the Iceberg/Hive schema.
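
   One way to surface this kind of mismatch before loading a batch is to compare the two schemas field by field. The following is only a minimal sketch under stated assumptions: `SchemaDiff` and `mismatches` are hypothetical names, not part of Iceberg or Arrow, and each schema is modeled as a plain ordered map from field name to type name instead of a real Arrow `Schema` object.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not an Iceberg or Arrow API.
// A schema is modeled as an ordered map of field name -> type name.
public class SchemaDiff {

    // Returns one message per expected field that is missing from the actual
    // schema or whose actual type name differs from the expected one.
    public static List<String> mismatches(Map<String, String> expected,
                                          Map<String, String> actual) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> field : expected.entrySet()) {
            String actualType = actual.get(field.getKey());
            if (actualType == null) {
                out.add(field.getKey() + ": missing in batch schema");
            } else if (!field.getValue().equals(actualType)) {
                out.add(field.getKey() + ": expected " + field.getValue()
                        + " but got " + actualType);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Types as reported for the Iceberg table schema (first three fields only).
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("entity_id", "Utf8");
        expected.put("entity_name", "Utf8");
        expected.put("entity_status", "Utf8");

        // Types as reported for the batchRoot schema (first three fields only).
        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("entity_id", "Utf8");
        actual.put("entity_name", "Utf8");
        actual.put("entity_status", "Int(32, true)");

        // Prints: entity_status: expected Utf8 but got Int(32, true)
        mismatches(expected, actual).forEach(System.out::println);
    }
}
```

   With real Arrow schemas, the same comparison would presumably iterate over the fields of both schemas and compare their types; that wiring is left out here because it depends on the reader setup.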
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

