On 1/17/21 12:15 PM, Bernard McGarvey wrote:
I have a data frame that consists of several factor columns say A, B, C, D, and
E and several columns containing numerical data, say X1, X2, .... X10. I would
like to create statistics of some of the numerical columns by some of the
factor columns. For example,
Calculate the mean, min, and max of variables X1 and X7 by factors A and E.
The results should look like the table below:
Factor A  Factor E  mean(X1) min(X1) max(X1)  mean(X7) min(X7) max(X7)  mean(X10) min(X10) max(X10)
A1 E1
A1 E2
A1 E3
A2 E1
A2 E2
A2 E3
I would like the results to be returned to a data frame or other object that I
can write out using the write.csv function. I have looked at the summarize and
numSummary functions but they do not appear to be flexible enough to do the
above.
The `aggregate` function will do the subsetting and function application.
> dfrm <- cbind(dfrm, matrix(rnorm(600), ncol=10))
> names(dfrm)[3:12] <- paste0("X", 1:10)
> str(dfrm)
'data.frame': 60 obs. of 12 variables:
$ Factor_A: Factor w/ 2 levels "A1","A2": 1 1 1 2 2 2 1 1 1 2 ...
$ Factor_B: Factor w/ 3 levels "E1","E2","E3": 1 2 3 1 2 3 1 2 3 1 ...
$ X1 : num -0.02116 -0.00049 0.12875 -0.05412 0.51886 ...
$ X2 : num 1.6799 -0.0963 -0.5727 -0.3638 -0.322 ...
$ X3 : num -0.349 0.267 -0.666 -0.329 0.902 ...
$ X4 : num 0.1125 -0.5384 0.0924 0.6849 -0.4194 ...
$ X5 : num -0.421 0.372 1.316 1.323 -0.03 ...
$ X6 : num -0.0767 1.4972 0.1967 -0.7092 -1.0943 ...
$ X7 : num 0.1771 -0.2136 -1.0818 -0.0671 2.0015 ...
$ X8 : num 1.456 -0.383 -0.47 0.965 0.569 ...
$ X9 : num -1.795 -0.4546 0.0069 1.2245 -0.395 ...
$ X10 : num -1.931 1.708 0.274 0.73 -0.995 ...
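The step that created the initial two-column dfrm is not shown above. Here is a minimal sketch of one way to get a frame that matches the str() output, assuming a balanced 2 x 3 layout repeated ten times. The rep() pattern and the seed are assumptions for illustration, not the original code, so the random values will differ from those printed here.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical starting frame: 60 rows, Factor_B cycling fastest,
# consistent with the level codes shown by str() above.
dfrm <- data.frame(
  Factor_A = factor(rep(rep(c("A1", "A2"), each = 3), times = 10)),
  Factor_B = factor(rep(c("E1", "E2", "E3"), times = 20))
)

# Append ten columns of standard-normal noise and name them X1..X10.
dfrm <- cbind(dfrm, matrix(rnorm(600), ncol = 10))
names(dfrm)[3:12] <- paste0("X", 1:10)

str(dfrm)
```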
aggregate( dfrm[ , c("X1", "X7", "X10")],       # columns to analyze
           dfrm[ c("Factor_A", "Factor_B")],    # classifying columns
           FUN=function (x) c(mn=mean(x), min=min(x), max=max(x)) )  # desired "summarizers"
#--- result----
  Factor_A Factor_B        X1.mn       X1.min      X1.max       X7.mn      X7.min     X7.max
1       A1       E1  0.187513792 -0.866094155 2.310960164  0.22489729 -0.91442493 1.94095786
2       A2       E1  0.078361707 -1.515410191 1.382420050 -0.51309155 -1.67026123 0.70869034
3       A1       E2 -0.267416858 -1.995131138 1.392115793 -0.04772929 -2.45426692 2.02225946
4       A2       E2 -0.069807208 -0.703073589 1.879448658 -0.37770923 -2.66221239 2.00152154
5       A1       E3 -0.007800886 -1.297561250 1.216627848 -0.30395411 -1.08181218 1.09764895
6       A2       E3 -0.054466856 -1.577891927 1.674719118  0.35594015 -1.20865279 2.25765422
X10.mn X10.min X10.max
1 -0.3458888 -2.0312811 1.1483179
2 -0.1021727 -1.3230372 0.8045472
3 0.3514645 -3.2334010 1.7075298
4 -0.4988984 -2.1091311 0.5857192
5 0.2297461 -1.1336967 0.8483935
6 0.3700621 -1.5609424 2.2792024
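One thing worth knowing before writing the result out: when FUN returns more than one value, aggregate() stores each summarized variable as a matrix column, so the result has 5 columns rather than the 11 that the printout suggests. A minimal sketch of flattening it into ordinary columns and saving it follows. The small stand-in dfrm, the seed, and the temporary output file are assumptions for illustration only.

```r
set.seed(1)  # arbitrary seed for a repeatable illustration

# Small stand-in for dfrm with only the columns used here (hypothetical data).
dfrm <- data.frame(
  Factor_A = factor(rep(rep(c("A1", "A2"), each = 3), times = 10)),
  Factor_B = factor(rep(c("E1", "E2", "E3"), times = 20)),
  X1 = rnorm(60), X7 = rnorm(60), X10 = rnorm(60)
)

res <- aggregate(dfrm[, c("X1", "X7", "X10")],
                 dfrm[c("Factor_A", "Factor_B")],
                 FUN = function(x) c(mn = mean(x), min = min(x), max = max(x)))

dim(res)  # X1, X7 and X10 are each a 3-column matrix inside the data frame

# Expand the matrix columns into plain columns named X1.mn, X1.min, ...
flat <- do.call(data.frame, res)
names(flat)

# Write to a temporary file; substitute a real path as needed.
out_file <- tempfile(fileext = ".csv")
write.csv(flat, out_file, row.names = FALSE)
```

After flattening, individual summaries can be referenced directly (for example flat$X7.max), which is not possible with the matrix columns in res.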
--
David
Any help would be appreciated,
Thanks
Bernard McGarvey
Director, Fort Myers Beach Lions Foundation, Inc.
Retired (Lilly Engineering Fellow).
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.