anashen opened a new issue, #48066:
URL: https://github.com/apache/arrow/issues/48066

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Hi,
   
   It looks like the team may already be aware of this (#47981), but I wanted to create a separate issue for the R package.
   
   Our R package Seurat depends on `read_parquet`. With arrow v22.0.0, `read_parquet` throws `Error: Invalid: Invalid number of indices: 0`, but only on some files. I can confirm that arrow v21.0.0.1 reads the same files without issue. An example of a file that raises the error is included below.
   
   Any fixes for this, or tips on working around it, would be much appreciated. However, the data in question is not generated by us, so we have no control over the file format.
   
   ```zsh
   # publicly available at
   # https://www.10xgenomics.com/support/software/xenium-onboard-analysis/latest/resources/xenium-example-data#test-data-v4-0
   curl -O https://cf.10xgenomics.com/samples/xenium/4.0.0/Xenium_V1_Protein_Human_Kidney_tiny/Xenium_V1_Protein_Human_Kidney_tiny_outs.zip
   # `transcripts.parquet` is located inside the zipped folder
   ```
   
   ```R
   
   library(arrow)
   read_parquet("/fakepath/Xenium_V1_Protein_Human_Kidney_tiny_outs/transcripts.parquet")
   # Error: Invalid: Invalid number of indices: 0
   traceback()
   # 6: stop(e)
   # 5: value[[3L]](cond)
   # 4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
   # 3: tryCatchList(expr, classes, parentenv, handlers)
   # 2: tryCatch(reader$ReadTable(), error = read_compressed_error)
   # 1: read_parquet("/fakepath/Xenium_V1_Protein_Human_Kidney_tiny_outs/transcripts.parquet")
   ```
   
   <details><summary>sessionInfo</summary>
   
   ```
   R version 4.5.1 (2025-06-13)
   Platform: aarch64-apple-darwin20
   Running under: macOS Sequoia 15.6.1
   
   Matrix products: default
   BLAS:   /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
   LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.1
   
   locale:
   [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
   
   time zone: America/New_York
   tzcode source: internal
   
   attached base packages:
   [1] stats     graphics  grDevices utils     datasets  methods   base     
   
   other attached packages:
   [1] arrow_22.0.0
   
   loaded via a namespace (and not attached):
    [1] tidyselect_1.2.1 bit_4.6.0        compiler_4.5.1   magrittr_2.0.3  
    [5] assertthat_0.2.1 R6_2.6.1         cli_3.6.5        glue_1.8.0      
    [9] bit64_4.6.0-1    vctrs_0.6.5      lifecycle_1.0.4  rlang_1.1.6     
   [13] purrr_1.1.0  
   ```
   
   </details>
   
   
   Thanks!
   
   ### Component(s)
   
   R

