luoyuxia opened a new issue, #735:
URL: https://github.com/apache/arrow-java/issues/735

   ### Describe the enhancement requested
   
   I'm working on https://github.com/alibaba/fluss/issues/107, which enables 
converting Fluss Arrow-structured data to Parquet directly, but I found that the 
API for this is missing here. 
   Although #14151 supports writing from an `ArrowReader` to a file, it reads 
from the `ArrowReader`, writes, and closes the file directly. That is very 
coarse-grained: we have almost no control over the writing. Sometimes we want 
to control when to close the written Parquet file. It also requires an 
`ArrowReader`, but if the Arrow RecordBatches are read continuously from a 
remote server, it is not easy to construct an `ArrowReader`. 
   So I think we may need to support an interface for writing an Arrow 
RecordBatch to Parquet via [virtual ::arrow::Status WriteRecordBatch(const 
::arrow::RecordBatch& batch) = 
0](https://github.com/apache/arrow/blob/c506b0806bd2b90410400d349a16bc4a5b1dd51c/cpp/src/parquet/arrow/writer.h#L125)
   Just as a showcase, the API might look like:
   ```
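   public interface ArrowBatchParquetWriter extends AutoCloseable {
      // Append one record batch to the Parquet file being written.
      void write(RecordBatch recordBatch);
      // Finish the file; the caller decides when this happens.
      @Override
      void close();
   }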
   ```
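
To illustrate the kind of control being asked for, here is a minimal, dependency-free sketch. Everything in it is hypothetical: `BatchWriter`, `InMemoryBatchWriter`, and the generic batch type `B` are stand-ins invented for illustration, not existing Arrow Java classes, and the in-memory list only takes the place of a real Parquet file. The point is the lifecycle: batches are pushed one at a time as they arrive, and the caller decides when to close.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchWriterSketch {

    /** Hypothetical push-style writer: the caller drives write() and close(). */
    interface BatchWriter<B> extends AutoCloseable {
        void write(B batch);

        @Override
        void close();
    }

    /** In-memory stand-in for a Parquet-backed writer (assumption, not a real API). */
    static final class InMemoryBatchWriter<B> implements BatchWriter<B> {
        private final List<B> written = new ArrayList<>();
        private boolean closed;

        @Override
        public void write(B batch) {
            if (closed) {
                throw new IllegalStateException("writer is already closed");
            }
            written.add(batch);
        }

        @Override
        public void close() {
            // A real implementation would flush and finalize the file here.
            closed = true;
        }

        int batchCount() {
            return written.size();
        }

        boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) {
        InMemoryBatchWriter<String> writer = new InMemoryBatchWriter<>();
        // Batches arrive one at a time, e.g. from a remote server.
        for (String batch : List.of("batch-0", "batch-1", "batch-2")) {
            writer.write(batch);
        }
        // The caller, not the writer, chooses when the file is finished.
        writer.close();
        System.out.println("batches written: " + writer.batchCount());
    }
}
```

Unlike the existing reader-driven path, nothing here requires wrapping the incoming batches in an `ArrowReader` first.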
   


