matthewgson opened a new issue, #44138:
URL: https://github.com/apache/arrow/issues/44138

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Column selection with `select()` and `tidyselect` helpers returns unexpected columns when multiple negated arguments are passed to `select()`.
   
   Reprex:
   
   ```r
   library(arrow)
   library(tidyverse)
   
   iris_arrow <- as_arrow_table(iris)
   
   iris_arrow |> 
     select(!ends_with(".Length"), 
            !Sepal.Width) |> 
     names()
   #> [1] "Sepal.Width"  "Petal.Width"  "Species"      "Sepal.Length" "Petal.Length"
   ```
   In this example, `Sepal.Width` still appears in the result even though `!Sepal.Width` is supposed to exclude it.
   
   
   A current workaround is to split the selection into separate `select()` calls joined by the pipe:
   
   ```r
   iris_arrow |> 
     select(!ends_with(".Length")) |> 
     select(!Sepal.Width) |> 
     names()
   #> [1] "Petal.Width" "Species"
   ```
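
   A single-call alternative to the split workaround can be sketched as follows. It combines the two negations with `&`, which `tidyselect` documents as the intersection of selections. The sketch runs on the plain `iris` data frame with `dplyr`; that an Arrow table or a `duckdb` table gives the same result is an assumption, not something verified here.

   ```r
   library(dplyr)

   # Plain data frame stand-in; the Arrow table from the report is not used here.
   result <- iris |>
     select(!ends_with(".Length") & !Sepal.Width) |>
     names()

   result
   #> [1] "Petal.Width" "Species"
   ```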
   
   The same occurs when the data is transferred to `duckdb`:
   
   ```r
   iris_db <- iris_arrow |> to_duckdb()
   
   iris_db |> 
     select(!ends_with(".Length"),
            !Sepal.Width) |> 
     colnames() # why doesn't names() work here?
   #> [1] "Sepal.Width"  "Petal.Width"  "Species"      "Sepal.Length" "Petal.Length"
   ```
   
   
   
   SessionInfo:
   
   ```r
   sessionInfo()
   R version 4.4.1 (2024-06-14)
   Platform: aarch64-apple-darwin20
   Running under: macOS Sonoma 14.6.1
   
   Matrix products: default
   BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
   LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
   
   locale:
   [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
   
   time zone: America/New_York
   tzcode source: internal
   
   attached base packages:
   [1] stats     graphics  grDevices utils     datasets  methods   base     
   
   other attached packages:
    [1] lubridate_1.9.3 forcats_1.0.0   stringr_1.5.1   dplyr_1.1.4     purrr_1.0.2     readr_2.1.5     tidyr_1.3.1    
    [8] tibble_3.2.1    ggplot2_3.5.1   tidyverse_2.0.0 arrow_16.1.0   
   
   loaded via a namespace (and not attached):
    [1] bit_4.0.5         gtable_0.3.5      compiler_4.4.1    tidyselect_1.2.1  assertthat_0.2.1  scales_1.3.0     
    [7] R6_2.5.1          generics_0.1.3    munsell_0.5.1     DBI_1.2.3         pillar_1.9.0      tzdb_0.4.0       
   [13] rlang_1.1.4       utf8_1.2.4        stringi_1.8.4     bit64_4.0.5       timechange_0.3.0  cli_3.6.3        
   [19] withr_3.0.1       magrittr_2.0.3    grid_4.4.1        dbplyr_2.5.0      hms_1.1.3         lifecycle_1.0.4  
   [25] vctrs_0.6.5       glue_1.7.0        data.table_1.15.4 duckdb_1.0.0-2    fansi_1.0.6       colorspace_2.1-1 
   [31] tools_4.4.1       pkgconfig_2.0.3  
   ```
   
   ### Component(s)
   
   R

