fkgruber opened a new issue, #44408:
URL: https://github.com/apache/arrow/issues/44408

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   If I have a dataset stored in a local folder that I downloaded with the Python 
Hugging Face `datasets` module, how can I load it with the R arrow package?
   This is the content of the directory:
   
   > dataset_info.json        
   > state.json               
   > data-00000-of-00001.arrow
   > 
   
   I tried every function I could think of, without success:
   
   `ds=open_dataset("mydata")`
   
   > Error in `open_dataset()`:
   > ! Invalid: Error creating dataset. Could not read schema from 
'./Documents/general/mydata/data-00000-of-00001.arrow'. Is this a 'parquet' 
file?: Could not open Parquet input source 
'./Documents/general/mydata/data-00000-of-00001.arrow': Parquet magic bytes not 
found in footer. Either the file is corrupted or this is not a parquet file.
   > ℹ Did you mean to specify a 'format' other than the default (parquet)?
   > Run `rlang::last_trace()` to see where the error occurred.
   > 
   
   `ds=open_dataset("mydata",format="arrow")`
   
   > Error in `open_dataset()`:
   > ! Invalid: Error creating dataset. Could not read schema from 
'./Documents/general/mydata/data-00000-of-00001.arrow'. Is this a 'ipc' file?: 
Could not open IPC input source 
'./Documents/general/mydata/data-00000-of-00001.arrow': Not an Arrow file
   > Run `rlang::last_trace()` to see where the error occurred.
   
   `ds=open_dataset("mydata",format="ipc")`
   
   > Error in `open_dataset()`:
   > ! Invalid: Error creating dataset. Could not read schema from 
'./Documents/general/mydata/data-00000-of-00001.arrow'. Is this a 'ipc' file?: 
Could not open IPC input source 
'./Documents/general/mydata/data-00000-of-00001.arrow': Not an Arrow file
   > Run `rlang::last_trace()` to see where the error occurred.
   
   `ds=open_dataset("mydata",format="feather")`
   
   > Error in `open_dataset()`:
   > ! Invalid: Error creating dataset. Could not read schema from 
'./Documents/general/mydata/data-00000-of-00001.arrow'. Is this a 'ipc' file?: 
Could not open IPC input source 
'./Documents/general/mydata/data-00000-of-00001.arrow': Not an Arrow file
   > Run `rlang::last_trace()` to see where the error occurred.
   
   `ds=open_dataset("mydata",format="json")`
   
   > Error in `open_dataset()`:
   > ! Invalid: Error creating dataset. Could not read schema from 
'./Documents/general/mydata/data-00000-of-00001.arrow'. Is this a 'json' file?: 
Could not open JSON input source 
'./Documents/general/mydata/data-00000-of-00001.arrow': Invalid: JSON parse 
error: Invalid value. in row 0
   > Run `rlang::last_trace()` to see where the error occurred.
   > 
   
   ### Component(s)
   
   R

