alippai opened a new issue, #45422:
URL: https://github.com/apache/arrow/issues/45422
### Describe the usage question you have. Please include as many useful details as possible.
I tried this with pyarrow 19:
```python
import pyarrow.feather as pf
t = ...  # a pyarrow.Table, large enough to span several chunks
pf.write_feather(t, 'test.feather', chunksize=1024*1024)
len(pf.read_table('test.feather').to_batches()[0])   # 65k rows
pf.write_feather(t, 'test2.feather', chunksize=256*1024)
len(pf.read_table('test2.feather').to_batches()[0])  # 65k rows again
```
I expected the two files to differ (different compressed sizes), but they are byte-for-byte identical. As a consequence, the requested batch sizes are lost when reading the data back.
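For reference, a minimal sketch of how I checked (the `sha256` helper is mine, and I am assuming a Feather V2 file can be opened with `pyarrow.ipc.open_file`, since V2 is the Arrow IPC file format):

```python
import hashlib

import pyarrow.ipc as ipc

def sha256(path):
    # hash the whole file to compare the two outputs byte-for-byte
    with open(path, 'rb') as f:
        return hashlib.sha256(f.read()).hexdigest()

print(sha256('test.feather') == sha256('test2.feather'))  # True

for path in ('test.feather', 'test2.feather'):
    reader = ipc.open_file(path)
    sizes = [reader.get_batch(i).num_rows
             for i in range(reader.num_record_batches)]
    print(path, sizes[:3])  # 65k-row batches in both files
```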
Am I correct in assuming that the file should consist of chunksize-long buffers for each column (one set per record batch), and that these buffers are compressed independently with lz4 or zstd?
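In case it helps, this is the behavior I was expecting; a sketch of writing the same table through the IPC writer directly (assuming `IpcWriteOptions(compression=...)` and `Table.to_batches(max_chunksize=...)` are the right knobs; I have not verified this against the format spec):

```python
import pyarrow.ipc as ipc

# Sketch only: split the table into batches of the requested row count and
# let the IPC writer compress each buffer (Feather V2 is the IPC file format).
options = ipc.IpcWriteOptions(compression='zstd')
with ipc.new_file('test_manual.feather', t.schema, options=options) as writer:
    for batch in t.to_batches(max_chunksize=256 * 1024):
        writer.write_batch(batch)
```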
### Component(s)
Python, C++, Format