[I] [Java] Type-ids in UnionVector are erroneously coupled to the Arrow types of the underlying vectors [arrow-java]

via GitHub Tue, 26 Nov 2024 11:15:56 -0800


jarohen opened a new issue, #108:
URL: https://github.com/apache/arrow-java/issues/108


   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   re: https://lists.apache.org/thread/z89xlvw7v1rwq89gknflhsj3c65x20kd
   
   It seems that the UnionVector implementation (particularly 
`initializeChildrenFromFields` (#29848), `getVector`, `getVectorByType`, 
`setSafe`, etc.) assumes that the type-id is always derived from the ArrowType, 
but the Schema.fbs spec is more lenient: users can choose whatever type-ids 
they require.
   
   For example, in XTDB, we're trying to represent an algebraic data type (ADT) 
of 'put', 'delete' and 'erase' events as a sparse union. Delete and erase have 
the same type, so UnionVector currently expects them to be the same type-id 
(whereas, in DenseUnionVector, we can use type-ids 0, 1 and 2).
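
To make the collision concrete, here is a minimal plain-Java sketch. It does not use the Arrow API: `UnionTypeIdSketch`, `Child`, `events`, `typeDerivedIds` and `explicitIds` are hypothetical names, the child types `Struct` and `Null` are placeholders, and numbering distinct types in first-seen order is only a simplified stand-in for however UnionVector actually derives an id from the ArrowType.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class UnionTypeIdSketch {

    /** Hypothetical stand-in for a union child: a field name plus the name of its Arrow type. */
    static final class Child {
        final String name;
        final String arrowType;

        Child(String name, String arrowType) {
            this.name = name;
            this.arrowType = arrowType;
        }
    }

    /** Assumed example data: 'delete' and 'erase' share a type; the type names are placeholders. */
    static List<Child> events() {
        return List.of(
            new Child("put", "Struct"),
            new Child("delete", "Null"),
            new Child("erase", "Null"));
    }

    /** Type-derived ids: one id per distinct type, so children of the same type collide. */
    static Map<String, Integer> typeDerivedIds(List<Child> children) {
        Map<String, Integer> idByType = new LinkedHashMap<>();
        Map<String, Integer> idByChild = new LinkedHashMap<>();
        for (Child c : children) {
            Integer id = idByType.get(c.arrowType);
            if (id == null) {
                id = idByType.size();          // next unused id for a type not seen before
                idByType.put(c.arrowType, id);
            }
            idByChild.put(c.name, id);
        }
        return idByChild;
    }

    /** Explicit ids: typeIds[i] is the id of the i-th child, chosen freely by the user. */
    static Map<String, Integer> explicitIds(List<Child> children, int[] typeIds) {
        if (typeIds.length != children.size()) {
            throw new IllegalArgumentException("need exactly one type-id per child");
        }
        Map<String, Integer> idByChild = new LinkedHashMap<>();
        for (int i = 0; i < children.size(); i++) {
            idByChild.put(children.get(i).name, typeIds[i]);
        }
        return idByChild;
    }

    public static void main(String[] args) {
        System.out.println(typeDerivedIds(events()));                    // delete and erase share an id
        System.out.println(explicitIds(events(), new int[] {0, 1, 2})); // each child has its own id
    }
}
```

With type-derived ids a reader cannot tell a delete from an erase, whereas the explicit mapping keeps all three variants distinguishable even though two of them share a type.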
   
   Would there be an appetite for (potentially relatively significant) changes 
to UnionVector so that type-ids are decoupled from the Arrow types of the 
children? We could perhaps consider bringing it more in line with 
DenseUnionVector, which seems closer to the spec. We would be happy to work on 
it if so.
   
   Cheers,
   
   James/Finn (@FiV0)
   
   ### Component(s)
   
   Java

