viirya opened a new issue, #648:
URL: https://github.com/apache/arrow-java/issues/648

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   We hit an issue when using `VectorLoader` to load some Arrow vectors.
   
   ```
   java.util.NoSuchElementException
           at java.base/java.util.ArrayList$Itr.next(ArrayList.java:970)
           at org.apache.arrow.vector.VectorLoader.loadBuffers(VectorLoader.java:104)
           at org.apache.arrow.vector.VectorLoader.load(VectorLoader.java:84)
   ```
   
   The schema of the `VectorSchemaRoot` is `Schema<_0: Utf8 not null>`.
   The field vector in the root is of type `Utf8` and is not nullable. Because it is a `Utf8` vector, `TypeLayout.getTypeBufferCount` reports a buffer count of 3 for it (validity, offsets, and data).
   
   The IPC `ArrowRecordBatch` message to load has one node, `ArrowFieldNode [length=1500, nullCount=0]`, and two buffers:
   
   ```
   buffer: ArrowBuf[...], address:....., capacity:..., ArrowBuf
   buffer: ArrowBuf[...], address:....., capacity:..., ArrowBuf
   ```
   
   So when `VectorLoader.loadBuffers` loads buffers by iterating over the buffer list, it assumes there are 3 buffers, but there are actually only 2 (the validity buffer is absent). That is why it hits `NoSuchElementException`.
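
The mismatch described above can be modeled without Arrow at all. The following is a minimal sketch, not the actual `VectorLoader` code: the class `BufferCountMismatch` and its `consume` method are hypothetical stand-ins, and strings stand in for `ArrowBuf` instances. It only illustrates that pulling 3 items from an iterator over a 2-element `ArrayList` fails the same way as in the stack trace.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical model of the failure; none of these names come from Arrow.
final class BufferCountMismatch {

    // Takes exactly `expectedCount` buffers from the iterator, the way a loader
    // would if it trusted the count reported by the type layout.
    static int consume(Iterator<String> buffers, int expectedCount) {
        int taken = 0;
        for (int i = 0; i < expectedCount; i++) {
            buffers.next(); // throws NoSuchElementException once the list is exhausted
            taken++;
        }
        return taken;
    }

    public static void main(String[] args) {
        // The batch carries only offsets and data; the validity buffer is omitted.
        List<String> batchBuffers = new ArrayList<>(List.of("offsets", "data"));
        try {
            consume(batchBuffers.iterator(), 3); // the layout claims 3 buffers
        } catch (NoSuchElementException e) {
            System.out.println("NoSuchElementException: layout expected 3 buffers, batch had "
                + batchBuffers.size());
        }
    }
}
```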
   
   I think that an array whose type can contain a null bitmap per the spec may choose not to allocate the validity buffer (see also the [doc](https://arrow.apache.org/docs/format/Columnar.html#validity-bitmaps)). So the `Utf8` array with 2 buffers is correct per the spec. The issue appears to be that `VectorLoader` doesn't consider field nullability when loading buffers.
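
One way a loader could tolerate an omitted validity buffer is sketched below. This is only an illustration under stated assumptions, not a proposed patch: the class `ValidityAwareBuffers` and the method `alignToLayout` are hypothetical, strings stand in for `ArrowBuf` instances, and the sketch handles only the single-field case from this report, where the batch supplies exactly one buffer fewer than the layout expects and `nullCount` is 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Arrow.
final class ValidityAwareBuffers {

    // Returns the buffers in layout order (validity first), inserting a null
    // placeholder when the validity buffer was omitted and the node has no nulls.
    static List<String> alignToLayout(List<String> provided, int layoutCount, int nullCount) {
        if (provided.size() == layoutCount) {
            return new ArrayList<>(provided); // nothing is missing
        }
        if (provided.size() == layoutCount - 1 && nullCount == 0) {
            List<String> aligned = new ArrayList<>(layoutCount);
            aligned.add(null); // no validity bitmap: treat every slot as valid
            aligned.addAll(provided);
            return aligned;
        }
        throw new IllegalArgumentException(
            "expected " + layoutCount + " buffers but got " + provided.size()
                + " with nullCount=" + nullCount);
    }

    public static void main(String[] args) {
        // Matches the report: a layout of 3 buffers, 2 supplied, nullCount=0.
        List<String> aligned = alignToLayout(List.of("offsets", "data"), 3, 0);
        System.out.println(aligned.size() + " buffers after alignment");
    }
}
```

A real loader would then have to treat the placeholder as "all values valid" rather than dereference it, and it would need a reliable way to tell which field a missing buffer belongs to when the batch has several fields.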
   
   We use Arrow Java 15.0.2. But from a quick look at the current code in this repo, it looks like the current `TypeLayout` still has this issue.

