hadley wickham <h.wickham <at> gmail.com> writes:
> > library(plyr)
> > dat = data.frame(SUBJECT_ID=sample(letters[1:5],100,TRUE),HR=rnorm(100))
> > daply(dat,.(SUBJECT_ID),sd)
> > ddply(dat,.(SUBJECT_ID),sd)
>
> Well that calculates sd on the whole data frame. (Like sd(dat)).
Not really, it looks like the breakdown is somehow done:
> library(plyr)
> dat = data.frame(SUBJECT_ID=sample(letters[1:5],100,TRUE),HR=rnorm(100))
> daply(dat,.(SUBJECT_ID),sd)
SUBJECT_ID SUBJECT_ID        HR
         a         NA 1.0488930
         b         NA 0.9110685
         c         NA 1.0776996
         d         NA 1.1724009
         e         NA 0.9455105
Warning messages:
1: In var(as.vector(x), na.rm = na.rm) : NAs introduced by coercion
..more warnings
> ddply(dat,.(SUBJECT_ID),sd)
  SUBJECT_ID        HR
1         NA 1.0488930
2         NA 0.9110685
3         NA 1.0776996
4         NA 1.1724009
5         NA 0.9455105
Warning messages:
1: In var(as.vector(x), na.rm = na.rm) : NAs introduced by coercion
That's what I meant by "almost correct". Your suggestion works, but given this
near miss, wouldn't it be better to make numcolwise(sd) the default behaviour?
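
For reference, a minimal sketch of the form that works, assuming the suggestion
referred to was wrapping sd in numcolwise(). The set.seed() call, the res and
base names, and the base-R cross-check are additions for illustration only:

```r
library(plyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
dat <- data.frame(SUBJECT_ID = sample(letters[1:5], 100, TRUE), HR = rnorm(100))

# numcolwise() turns sd into a function that is applied to numeric columns
# only, so the non-numeric SUBJECT_ID column is no longer passed to sd().
res <- ddply(dat, .(SUBJECT_ID), numcolwise(sd))
print(res)

# Cross-check against base R: per-subject standard deviation of HR.
base <- tapply(dat$HR, dat$SUBJECT_ID, sd)
stopifnot(isTRUE(all.equal(res$HR, as.numeric(base))))
```

This should give one row per subject with a real SUBJECT_ID and no coercion
warnings.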
Dieter