amoeba opened a new issue, #46729:
URL: https://github.com/apache/arrow/issues/46729

   ### Describe the bug, including details regarding any error messages, 
version, and platform.
   
   The docstring for `InMemoryDataset` indicates you can create one from a 
`RecordBatchReader`:
   
   
https://github.com/apache/arrow/blob/5a529166e399c4b35fb2768278e8326bbcb5a9a8/python/pyarrow/_dataset.pyx#L990-L999
   
   However, if you try this you currently get an error saying you cannot:
   
   ```python
   >>> ds = ds.InMemoryDataset(rbr)
   Traceback (most recent call last):
     File "<python-input-37>", line 1, in <module>
       ds = ds.InMemoryDataset(rbr)
     File "pyarrow/_dataset.pyx", line 1038, in pyarrow._dataset.InMemoryDataset.__init__
   TypeError: Expected a table, batch, or list of tables/batches instead of the given type: RecordBatchReader
   ```
   
   I think we don't allow simple construction of an InMemoryDataset from a 
RecordBatchReader because that violates the assumption Datasets make about 
sources being re-readable (not one-shot like an RBR). But I don't see why the 
InMemoryDataset constructor can't consume the RecordBatchReader and construct a 
Table from it.
   
   ### Component(s)
   
   Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
