Re: [R] Can I pass the grouped portions of a dataframe/tibble to a function in dplyr

Rui Barradas Sun, 05 Jul 2020 04:17:26 -0700

Hello,

I forgot to say I redid the data set setting the RNG seed first.




library(dplyr)  # provides tibble(), group_by() and %>%

set.seed(2020)
n <- 50
x <- 1:n
y <- sample(1:3, n, replace = TRUE)
z <- rnorm(n)
tib <- tibble(x, y, z)


Also, don't do

as_tibble(cbind(...))
as.data.frame(cbind(...))

If one of the variables is of a different class (for example, "character"), all variables are coerced to the least common denominator, because cbind() builds a matrix and a matrix holds a single type. It's much better to call tibble() or data.frame() directly.
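
To illustrate the coercion, here is a minimal base R sketch. The vectors num and chr are made up for the example, and stringsAsFactors = FALSE is set explicitly so the result should not depend on the R version.

```r
num <- 1:3
chr <- c("a", "b", "c")

# cbind() builds a matrix, and a matrix holds one type only,
# so the integers are converted to character.
m <- cbind(num, chr)
typeof(m)

# Converting that matrix keeps the damage: num is no longer numeric.
bad <- as.data.frame(m, stringsAsFactors = FALSE)
class(bad$num)

# Building the data frame directly keeps each column's own class.
good <- data.frame(num, chr, stringsAsFactors = FALSE)
class(good$num)
```

Comparing class(bad$num) with class(good$num) shows the difference between the two ways of building the data frame.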


Hope this helps,

Rui Barradas


At 12:04 on 05/07/2020, Rui Barradas wrote:

Hello,
You can pass a grouped tibble to a function with group_modify() but the function must return a data.frame (or similar).
## this will also do it
#sillyFun <- function(tib){
#  tibble(nrow = nrow(tib), ncol = ncol(tib))
#}


sillyFun <- function(tib){
   data.frame(nrow = nrow(tib), ncol = ncol(tib))
}

tib %>%
   group_by(y) %>%
   group_modify(~ sillyFun(.))
## A tibble: 3 x 3
## Groups:   y [3]
#      y  nrow  ncol
#  <dbl> <int> <int>
#1     1    17     2
#2     2    21     2
#3     3    12     2
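
Note that ncol is 2 rather than 3 in the output above: group_modify() hands the function each group's rows without the grouping column. If the function returns a list rather than a data frame, as in the original question, group_map() may be a closer fit. The following is only a sketch, assuming a dplyr version that provides group_map(); the name listFun is made up here.

```r
library(dplyr)

set.seed(2020)
tib <- tibble(x = 1:50,
              y = sample(1:3, 50, replace = TRUE),
              z = rnorm(50))

# A list-returning function, like the one in the original question
# (listFun is a hypothetical name)
listFun <- function(d) list(nrow = nrow(d), ncol = ncol(d))

# group_map() calls the function once per group and returns a list
# with one element per group; .x is the group's rows without y.
res <- tib %>%
  group_by(y) %>%
  group_map(~ listFun(.x))

sapply(res, function(r) r$nrow)  # group sizes, which sum to nrow(tib)
sapply(res, function(r) r$ncol)  # x and z only, the grouping column is dropped
```

Unlike group_modify(), the result is a plain list, so it needs one more step if a tibble is wanted.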


Hope this helps,

Rui Barradas

At 09:43 on 05/07/2020, Chris Evans wrote:
Apologies if this is a stupid question but searching keeps getting things I know and don't need.
What I want to do is to use the group_by() power of dplyr to run functions that expect a dataframe/tibble per group but I can't see how to do it. Here is a reproducible example.
### create trivial tibble
n <- 50
x <- 1:n
y <- sample(1:3, n, replace = TRUE)
z <- rnorm(n)
tib <- as_tibble(cbind(x,y,z))

### create trivial function that expects a tibble/data frame
sillyFun <- function(tib){
  return(list(nrow = nrow(tib),
              ncol = ncol(tib)))
}

### works fine on the whole tibble
tib %>%
  summarise(dim = list(sillyFun(.))) %>%
  unnest_wider(dim)

That gives me:
# A tibble: 1 x 2
    nrow  ncol
   <int> <int>
1    50     3
### So I try the following hoping to apply the function to the grouped tibble
tib %>%
  group_by(y) %>%
  summarise(dim = list(sillyFun(.))) %>%
  unnest_wider(dim)

### But that gives me:
# A tibble: 3 x 3
       y  nrow  ncol
   <dbl> <int> <int>
1     1    50     3
2     2    50     3
3     3    50     3
Clearly "." is still passing the whole tibble, not the grouped subsets. What I can't find is whether there is an alternative to "." that would pass just the grouped subset of the tibble.
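
A small sketch of why this happens, assuming dplyr and its %>% pipe are loaded: "." is filled in by the pipe with its whole left-hand side before summarise() runs, whereas functions such as n() are evaluated by dplyr within each group. The tibble d and its columns are made up for the illustration.

```r
library(dplyr)

d <- tibble(g = c("a", "a", "b"), v = 1:3)

out <- d %>%
  group_by(g) %>%
  summarise(whole = nrow(.),   # "." is the pipe's left-hand side: all rows
            per_group = n())   # n() is evaluated within the current group

out
```

The whole column repeats the total row count for every group, while per_group holds the actual group sizes.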
I have bodged my way around this by writing a function that takes individual columns and reassembles them into a data frame that the actual functions I need to use require but that takes me back to a lot of clumsiness both selecting the variables to pass in the dplyr call to the function and putting the reassemble-to-data-frame bit in the function I call. (The functions I really need are reliability explorations and can be called on whole dataframes.)
I know I can do this using base R split and lapply but I feel sure it must be possible to do this within dplyr/tidyverse. I'm slowly transferring most of my code to the tidyverse and hitting frustrations but also finding that it does really help me program more sensibly, handle relational data structures more easily, and write code that I seem better at reading when I come back to it after months on other things so I am slowly trying to move all my coding to tidyverse. If I could see how to do this, it would help.
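
For comparison, the base R split and lapply route mentioned above can be sketched roughly as follows. This is only an illustration with a plain data frame named df (a made-up name); note that split() keeps the grouping column in each piece, so ncol is 3 here.

```r
set.seed(2020)
df <- data.frame(x = 1:50,
                 y = sample(1:3, 50, replace = TRUE),
                 z = rnorm(50))

sillyFun <- function(d) list(nrow = nrow(d), ncol = ncol(d))

# split() cuts the data frame into one data frame per value of y;
# lapply() then calls the function on each piece.
pieces <- split(df, df$y)
res <- lapply(pieces, sillyFun)

# Bind the per-group lists into one data frame, one row per group
out <- do.call(rbind, lapply(res, as.data.frame))
out
```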
Very sorry if the answer should be blindingly obvious to me. I'd also love to have pointers to guidance to the tidyverse written for people who aren't professional coders or statisticians and that go a bit beyond the obvious basics of tidyverse into issues like this.
TIA,

Chris


