martin-traverse opened a new issue, #698:
URL: https://github.com/apache/arrow-java/issues/698

   ### Describe the enhancement requested
   
   Raising this ticket as the second step on Avro support, following on from 
#615. In this ticket I'd like to cover:
   
   1)   Round trip test cases for all supported data types (schema and data)
   2)   Fix nullability handling - an Avro union of [ null, type ] should be 
handled as a single nullable field / vector rather than creating an Arrow union
   3)   Expose an API for creating an Arrow schema directly from an Avro schema 
with the same type mapping as the existing consumers
   4)   Expose an API to allow a VectorSchemaRoot (VSR) to be recycled when 
reading data (the VSR should be resized to accommodate an Avro block)
   
   Regarding point 2, I'm thinking of adding a flag to the AvroToArrowConfig 
class. By default the flag can be false to preserve the current behaviour.
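
A minimal sketch of the flag-controlled behaviour described for point 2, using plain strings in place of real Avro and Arrow types so it has no dependencies. The names `NullableUnionSketch`, `mapUnion` and `handleNullable` are hypothetical and are not part of the existing AvroToArrowConfig API. The sketch also assumes null may appear in either position of a two-branch union.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only: Avro branch types are modelled as strings,
// not as org.apache.avro.Schema, and the result is a description string
// rather than an Arrow Field.
final class NullableUnionSketch {

    // Returns the non-null branch when the union is exactly two branches,
    // one of which is "null"; otherwise returns empty.
    static Optional<String> unwrapNullable(List<String> branches) {
        if (branches.size() != 2) {
            return Optional.empty();
        }
        String first = branches.get(0);
        String second = branches.get(1);
        if (first.equals("null") && !second.equals("null")) {
            return Optional.of(second);
        }
        if (second.equals("null") && !first.equals("null")) {
            return Optional.of(first);
        }
        return Optional.empty();
    }

    // handleNullable stands in for the proposed config flag (assumed name).
    // false: every Avro union maps to a union, as in the current behaviour.
    // true:  a [ null, type ] union maps to a single nullable field.
    static String mapUnion(List<String> branches, boolean handleNullable) {
        if (handleNullable) {
            Optional<String> inner = unwrapNullable(branches);
            if (inner.isPresent()) {
                return "nullable " + inner.get();
            }
        }
        return "union" + branches;
    }

    public static void main(String[] args) {
        List<String> nullableLong = List.of("null", "long");
        System.out.println(mapUnion(nullableLong, false));
        System.out.println(mapUnion(nullableLong, true));
        System.out.println(mapUnion(List.of("null", "long", "string"), true));
    }
}
```

With the flag off, every union takes the existing path, so current users see no change. Unions with more than one non-null branch still map to a union even when the flag is on.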
   
   I have started work on this ticket but will need to wait for #638 to merge 
before raising a draft PR.
   
   I think there are two more PRs needed in this series. One would provide a 
high-level API to read / write whole files; it would use the producers / 
consumers internally, understand Avro's block structure and map one full Arrow 
VSR to one Avro block. The last PR would add some extra features, 
including compression and dictionary encoding / enums.


