martin-traverse opened a new issue, #615:
URL: https://github.com/apache/arrow-java/issues/615

   ### Describe the enhancement requested
   
   Hi,
   
   We need full Avro read/write support in our project and are working on an 
implementation. I had a look at what already exists in arrow-java, and I think 
it would be fairly straightforward to extend it to get full read/write support 
in the Arrow Java project. Here is what I am proposing:
   
   * A set of producers to handle the Avro data structures, mirroring the 
existing consumers
   * Handle the high level file structure (header, embedded schema and block 
structure)
   * Support for compressed blocks (using the existing codecs in the Avro 
project)
   * High level APIs for read / write, including incremental read (block by 
block, corresponding to the VectorSchemaRoot, or VSR)
   
   The last point is important for us because we handle streaming data: if we 
can check that a whole block is available before reading it, we should be able 
to avoid blocking on IO calls.
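
As a rough illustration of that availability check, here is a minimal sketch using only the JDK. It assumes the block layout from the Avro object container file format: an object count and a byte size (both zig-zag varint longs), then the serialized (possibly compressed) objects, then a 16-byte sync marker. The class and method names (`AvroBlockProbe`, `isBlockAvailable`, `readLong`) are hypothetical and are not part of arrow-java or Avro.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

/** Sketch only: checks whether one complete Avro data block is buffered. */
final class AvroBlockProbe {

    /** Length of the sync marker that ends every block in a container file. */
    static final int SYNC_SIZE = 16;

    private AvroBlockProbe() {}

    /**
     * Returns true if buf, starting at its current position, holds a whole
     * block: object count, byte size, serialized objects and sync marker.
     * The position of buf itself is left untouched.
     */
    static boolean isBlockAvailable(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate(); // own position, shared bytes
        try {
            readLong(view);              // object count, not needed here
            long size = readLong(view);  // size of the serialized objects
            if (size < 0) {
                throw new IllegalStateException("negative block size");
            }
            // Written as a subtraction so a huge size cannot overflow.
            return view.remaining() - SYNC_SIZE >= size;
        } catch (BufferUnderflowException e) {
            return false; // even the count/size prefix is incomplete
        }
    }

    /** Decodes one Avro long: variable-length, zig-zag encoded. */
    static long readLong(ByteBuffer buf) {
        long raw = 0;
        int shift = 0;
        while (true) {
            byte b = buf.get(); // throws BufferUnderflowException when empty
            raw |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
            if (shift > 63) {
                throw new IllegalStateException("varint longer than 10 bytes");
            }
        }
        return (raw >>> 1) ^ -(raw & 1); // undo zig-zag
    }
}
```

A streaming reader could call `isBlockAvailable` on its receive buffer and only hand the bytes to the decoder once it returns true, so the decode step itself never has to wait on IO.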
   
   If I draft a PR along these lines, would there be interest in helping me 
refine it and get it into arrow-java? If not, we can do our own implementation, 
which will be simpler because we don't need all the features and data types, 
but I think the delta is not that large and IMO it would be a good thing to 
have in the Arrow Java toolkit.
   
   Thoughts welcome!

