Zechariah2001 opened a new issue, #43322:
URL: https://github.com/apache/arrow/issues/43322

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   I'm trying to reduce the time it takes to write a RecordBatch into a single Parquet file. My current approach uses `parquet::arrow::FileWriter::WriteRecordBatch` with `set_use_threads(true)`, but the result is not very satisfying.
   
   <img width="587" alt="threads" src="https://github.com/user-attachments/assets/c82201ae-8be2-4ca3-84fc-1e079c4fd34b">
   <img width="634" alt="CPU_usage" src="https://github.com/user-attachments/assets/e7414fe3-38d7-4a40-966f-535772b939c3">
   
   As I understand it, `set_use_threads(true)` only parallelizes the serialization of the columns within a row group, so row groups themselves are still encoded one by one. Is it possible to encode different row groups in parallel?
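   For reference, a minimal sketch of the current setup described above (recent Arrow C++ API with the `Result`-returning `FileWriter::Open`; the output path and the batch source are placeholders, and error handling is abbreviated with the `ARROW_*` macros):

   ```cpp
   #include <memory>

   #include <arrow/io/file.h>
   #include <arrow/record_batch.h>
   #include <arrow/result.h>
   #include <parquet/arrow/writer.h>
   #include <parquet/properties.h>

   // Write one RecordBatch to a single Parquet file, with use_threads
   // enabled so the columns of each row group are encoded in parallel.
   arrow::Status WriteSingleFile(const std::shared_ptr<arrow::RecordBatch>& batch) {
     auto writer_props = parquet::WriterProperties::Builder().build();
     auto arrow_props = parquet::ArrowWriterProperties::Builder()
                            .set_use_threads(true)  // parallel column encoding only
                            ->build();

     ARROW_ASSIGN_OR_RAISE(auto outfile,
                           arrow::io::FileOutputStream::Open("out.parquet"));
     ARROW_ASSIGN_OR_RAISE(
         auto writer,
         parquet::arrow::FileWriter::Open(*batch->schema(),
                                          arrow::default_memory_pool(), outfile,
                                          writer_props, arrow_props));
     ARROW_RETURN_NOT_OK(writer->WriteRecordBatch(*batch));
     return writer->Close();
   }
   ```
   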
   
   ### Component(s)
   
   C++


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
