EinMaulwurf opened a new issue, #45601:
URL: https://github.com/apache/arrow/issues/45601

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Arrow seems to have issues with labelled data, such as data imported from 
Stata datasets.
   
   Filtering labelled data in an Arrow table does not work and throws an error. 
So far, so good; I can deal with that.
   ``` r
   library(haven)
   library(arrow)
   library(tibble)
   library(dplyr)
   
   d <- tibble(
     a = labelled(x = 1:5, label = "example variable a"),
     b = labelled(x = 11:15, label = "example variable b")
   )
   
   d
   #> # A tibble: 5 × 2
   #>   a         b        
   #>   <int+lbl> <int+lbl>
   #> 1 1         11       
   #> 2 2         12       
   #> 3 3         13       
   #> 4 4         14       
   #> 5 5         15
   
   d %>%
     as_arrow_table() %>%
     filter(a > 3) %>%
     collect()
   #> Error in `compute.arrow_dplyr_query()`:
   #> ! NotImplemented: Function 'greater' has no kernel matching input types (<labelled<integer>[0]>: example variable a, <labelled<integer>[0]>: example variable a)
   ```
   
   But when the final `collect()`, which executes the query, is left out, the R 
session crashes completely:
   ```r
   d %>%
     as_arrow_table() %>%
     filter(a > 5)
   # R crashes....
   ```
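   
   A possible workaround, sketched here under the assumption that the labels 
are not needed inside the Arrow query: strip them with haven's `zap_labels()` 
and `zap_label()` before converting, so that Arrow sees plain integer columns. 
The names `d_plain` and `res` are only illustrative, and this does not address 
the crash itself.
   ```r
   library(haven)
   library(arrow)
   library(tibble)
   library(dplyr)
   
   d <- tibble(
     a = labelled(x = 1:5, label = "example variable a"),
     b = labelled(x = 11:15, label = "example variable b")
   )
   
   # Assumption: the labels can be dropped for the query.
   # zap_labels() removes value labels and the labelled class,
   # zap_label() removes the variable label attribute.
   d_plain <- d %>%
     zap_labels() %>%
     zap_label()
   
   # The columns are now plain integers, so the comparison has a matching kernel.
   res <- d_plain %>%
     as_arrow_table() %>%
     filter(a > 3) %>%
     collect()
   
   res
   ```
   
   The labels are lost in `res`; if they are needed afterwards, they would 
have to be reattached from the original data.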
   
   ### Component(s)
   
   R


