This inconsistency recently came to my attention:

    > df <- data.frame(A = 1:10, B = rnorm(10))
    > min(df)
    [1] -1.768958
    > max(df)
    [1] 10
    > mean(df)
    [1] NA
    Warning message:
    In mean.default(df) : argument is not numeric or logical: returning NA

I recall the time when `mean(df)` would give `colMeans(df)`, and that behaviour was deemed inconsistent. It seems, though, that the change has removed one inconsistency and replaced it with another.

Am I missing good reasons why there couldn't be a `mean.data.frame()` method that works like `max()` etc. when given a data frame? Namely, one that returns the required statistic *only* when presented with a data frame of all numeric variables? E.g.

    > df <- data.frame(A = 1:10, B = rnorm(10), C = factor(rep(c("A","B"), each = 5)))
    > max(df)
    Error in FUN(X[[1L]], ...) :
      only defined on a data frame with all numeric variables

I would expect `mean(df)` to fail with the same error as `max(df)` for the new example `df`, but to return the same as `mean(as.matrix(df))` or `mean(colMeans(df))` if given an entirely numeric data frame:

    > mean(colMeans(df[, 1:2]))
    [1] 2.78366
    > mean(as.matrix(df[, 1:2]))
    [1] 2.78366
    > mean(df[,1:2])
    [1] 2.78366

I just can't see the sense in having `mean` work the way it does now.

Thanks,

Gavin

-- 
Gavin Simpson, PhD

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
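
A minimal sketch of the kind of method asked about above, for illustration only. The `mean.data.frame()` defined here is hypothetical and user-defined, not something base R provides, and the error text is simply copied from the `max()` output quoted in the post. It assumes that "all numeric" should also admit logical columns, which is a choice made for this sketch rather than anything the post specifies. The example data frames `df_num` and `df_mixed` are likewise made up.

```r
# Hypothetical S3 method (not part of base R): make mean() on a data frame
# behave like max(), i.e. pool all values if every column is numeric,
# and fail otherwise.
mean.data.frame <- function(x, ...) {
  # Assumption for this sketch: numeric (incl. integer) and logical columns
  # are acceptable; factors, characters, dates etc. are not.
  ok <- vapply(x, function(col) is.numeric(col) || is.logical(col), logical(1))
  if (!all(ok)) {
    stop("only defined on a data frame with all numeric variables")
  }
  # Pool every cell into one matrix and take a single mean;
  # extra arguments such as na.rm or trim are passed through.
  mean(as.matrix(x), ...)
}

# Deterministic example data (no rnorm, so results are reproducible).
df_num <- data.frame(A = 1:10, B = seq(0.5, 5, by = 0.5))
print(all.equal(mean(df_num), mean(as.matrix(df_num))))
print(all.equal(mean(df_num), mean(colMeans(df_num))))

# Adding a factor column should trigger the same error text as max().
df_mixed <- data.frame(df_num, C = factor(rep(c("A", "B"), each = 5)))
res <- tryCatch(mean(df_mixed), error = function(e) conditionMessage(e))
print(res)
```

With no missing values, every column of a data frame contributes the same number of cells, so the pooled mean and `mean(colMeans(...))` coincide. With `NA`s and `na.rm = TRUE` the two can differ, and this sketch follows the pooled `mean(as.matrix(...))` definition.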