On Sat, 2009-05-09 at 08:23 -0400, Gabor Grothendieck wrote:
> Try this:
>
> > aggregate(dat["A"], dat["Group"], mean)
>   Group         A
> 1     1 0.4944810
> 2     2 0.4765412
> 3     3 0.4521068
> 4     4 0.4989000

Thanks Gabor. Ideally, aggregate.default should "work" whatever indexing
one uses. Here you are relying on the fact that a data.frame is a special
case of a list, which is not the way most help resources introduce
subsetting for data frames.

For personal use, I can use my own version of aggregate.default, and as I
dislike using `$`, preferring with(), I don't run the risk of
non-syntactic names being produced.

I was really looking for ideas for improving aggregate.default in
general. The solution I posted has its own infelicities...

Cheers,

G

>
> On Sat, May 9, 2009 at 8:14 AM, Gavin Simpson <gavin.simp...@ucl.ac.uk> wrote:
> > Hi,
> >
> > I find it a bit annoying that aggregate.default forces the returned
> > object to lose the 'name' of the variable aggregated, replacing it with
> > 'x'.
> >
> > A brief example:
> >
> >> dat <- data.frame(A = runif(100), B = rnorm(100),
> > +                   Group = gl(4, 25))
> >> with(dat, aggregate(A, by = list(Group = Group), FUN = mean))
> >   Group         x
> > 1     1 0.6523228
> > 2     2 0.4544317
> > 3     3 0.4619624
> > 4     4 0.4703156
> >
> > This arises because aggregate.default has:
> >
> > function (x, ...)
> > {
> >     if (is.ts(x))
> >         aggregate.ts(as.ts(x), ...)
> >     else aggregate.data.frame(as.data.frame(x), ...)
> > }
> >
> > which recasts x as a data frame, but doesn't make any effort to supply a
> > name. Can we do a better job of supplying a useful name?
> >
> > My first attempt is:
> >
> > aggregate.default <- function(x, ...) {
> >     if (is.ts(x))
> >         aggregate.ts(as.ts(x), ...)
> >     else {
> >         nam <- deparse(substitute(x))
> >         x <- as.data.frame(x)
> >         names(x) <- nam
> >         aggregate.data.frame(x, ...)
> >     }
> > }
> >
> > Which works for the brief example above:
> >
> >> with(dat, aggregate(A, by = list(Group = Group), FUN = mean))
> >   Group         A
> > 1     1 0.4269715
> > 2     2 0.5479352
> > 3     3 0.5091543
> > 4     4 0.4926412
> >
> > However, it fails make check-all because examples have relied on the
> > returned object having 'x'.
> > I also note that this might have the
> > annoying side effect of producing odd names if we use the following
> > incantation:
> >
> >> res <- aggregate(dat$A, by = list(Group = dat$Group), FUN = mean)
> >> str(res)
> > 'data.frame':   4 obs. of  2 variables:
> >  $ Group: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
> >  $ dat$A: num  0.427 0.548 0.509 0.493
> >> res$dat$A
> > Error in res$dat$A : $ operator is invalid for atomic vectors
> >> res$`dat$A`
> > [1] 0.4269715 0.5479352 0.5091543 0.4926412
> >
> > Is there a way of coming up with a better way to name the aggregated
> > variable? Would a change of this kind be something R Core would consider
> > making to aggregate.default if a good solution is found?
> >
> > Thanks in advance,
> >
> > G
> > --
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >  Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
> >  ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
> >  Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
> >  Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
> >  UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
> > %~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
> >
> > ______________________________________________
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel