piyushdubey opened a new issue, #40258: URL: https://github.com/apache/arrow/issues/40258
### Describe the usage question you have. Please include as many useful details as possible.

Hello,

I am trying to convert a Delta table to an Arrow stream. The table can have any number of Parquet files and may or may not be partitioned. I am using Parquet.Net to read the Parquet files.

How should I think about the correspondence between Parquet files and `RecordBatch`? Should I create one `RecordBatch` per Parquet file? What should the overall Parquet-to-Arrow conversion logic look like? Any pointers?

Here's a tentative algorithm I have in mind:

1. Iterate over the list of Parquet files.
2. Read each row group: `ParquetRowGroupReader reader = parquetReader.OpenRowGroupReader(rowGroupIndex);`
3. Extract the columns and add them to a record batch one by one.
4. Write the `RecordBatch` to an `ArrowStreamWriter`.

I would appreciate any help with this.

### Component(s)

C#
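
For concreteness, the tentative algorithm above could be sketched roughly as follows. This is an untested sketch, not a definitive implementation. It assumes Parquet.Net 4.x (`ParquetReader.CreateAsync`, `Schema.GetDataFields()`, `ReadColumnAsync`, `DataField.IsNullable`) and the Apache.Arrow NuGet package. `ParquetToArrowSketch`, `ToArrowType`, `ToArrowArray`, and `WriteArrowStreamAsync` are hypothetical names, and only `int`, `long`, and `string` columns are mapped. It emits one `RecordBatch` per row group rather than one per file, which is only one possible choice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;
using Apache.Arrow;
using Apache.Arrow.Ipc;
using Apache.Arrow.Types;
using Parquet;
using Parquet.Data;

public static class ParquetToArrowSketch
{
    // Hypothetical mapping; only three CLR types are covered in this sketch.
    static IArrowType ToArrowType(Type clrType)
    {
        if (clrType == typeof(int)) return Int32Type.Default;
        if (clrType == typeof(long)) return Int64Type.Default;
        if (clrType == typeof(string)) return StringType.Default;
        throw new NotSupportedException($"No Arrow mapping for {clrType}");
    }

    // Copies one Parquet column into an Arrow array, keeping nulls as nulls.
    static IArrowArray ToArrowArray(DataColumn column)
    {
        Type clrType = column.Field.ClrType;
        if (clrType == typeof(int))
        {
            var builder = new Int32Array.Builder();
            foreach (object value in column.Data)
                if (value is int i) builder.Append(i); else builder.AppendNull();
            return builder.Build();
        }
        if (clrType == typeof(long))
        {
            var builder = new Int64Array.Builder();
            foreach (object value in column.Data)
                if (value is long l) builder.Append(l); else builder.AppendNull();
            return builder.Build();
        }
        if (clrType == typeof(string))
        {
            var builder = new StringArray.Builder();
            foreach (object value in column.Data)
                if (value is string s) builder.Append(s); else builder.AppendNull();
            return builder.Build();
        }
        throw new NotSupportedException($"No Arrow mapping for {clrType}");
    }

    public static async Task WriteArrowStreamAsync(IEnumerable<string> parquetPaths, Stream output)
    {
        ArrowStreamWriter writer = null;
        Apache.Arrow.Schema schema = null;
        try
        {
            foreach (string path in parquetPaths)                       // step 1
            {
                using Stream file = File.OpenRead(path);
                using ParquetReader reader = await ParquetReader.CreateAsync(file);
                var fields = reader.Schema.GetDataFields();

                // An Arrow stream carries a single schema, so it is taken from
                // the first file; later files are assumed (not checked) to match.
                if (writer == null)
                {
                    schema = new Apache.Arrow.Schema(
                        fields.Select(f => new Apache.Arrow.Field(
                            f.Name, ToArrowType(f.ClrType), f.IsNullable)),
                        null);
                    writer = new ArrowStreamWriter(output, schema, true); // leave output open
                }

                for (int i = 0; i < reader.RowGroupCount; i++)          // step 2
                {
                    using ParquetRowGroupReader rowGroup = reader.OpenRowGroupReader(i);
                    var arrays = new List<IArrowArray>();
                    foreach (var field in fields)                       // step 3
                        arrays.Add(ToArrowArray(await rowGroup.ReadColumnAsync(field)));

                    int length = arrays.Count > 0 ? arrays[0].Length : 0;
                    using var batch = new RecordBatch(schema, arrays, length);
                    await writer.WriteRecordBatchAsync(batch);          // step 4
                }
            }
            if (writer != null) await writer.WriteEndAsync();
        }
        finally
        {
            writer?.Dispose();
        }
    }
}
```

A call might look like `await ParquetToArrowSketch.WriteArrowStreamAsync(paths, outputStream);`, where `paths` lists the Parquet data files of the table. Batching per row group keeps memory bounded by the row-group size. Batching per file would instead require concatenating the columns of all row groups before building the `RecordBatch`.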
