Sorry, I didn't see Phil's reply, which is better than mine anyway. -Ista
On Fri, Feb 4, 2011 at 5:16 PM, Ista Zahn <iz...@psych.rochester.edu> wrote:
> Hi,
>
> Please see ?na.action
>
> (just kidding!)
>
> So it seems to me the problem is that you are passing na.rm to the sum
> function. So there is no missing data for the na.action argument to
> operate on!
>
> Compare
>
> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.fail)$y)
> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.pass)$y)
> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.omit)$y)
>
>
> Best,
> Ista
>
> On Fri, Feb 4, 2011 at 4:07 PM, Gene Leynes <gleyne...@gmail.com> wrote:
>> Can someone please tell me what is up with na.action in aggregate?
>>
>> My (somewhat) reproducible example:
>> (I say somewhat because some lines wouldn't run in a separate session, more
>> below)
>>
>> set.seed(100)
>> dat=data.frame(
>>         x1=sample(c(NA,'m','f'), 100, replace=TRUE),
>>         x2=sample(c(NA, 1:10), 100, replace=TRUE),
>>         x3=sample(c(NA,letters[1:5]), 100, replace=TRUE),
>>         x4=sample(c(NA,T,F), 100, replace=TRUE),
>>         y=sample(c(rep(NA,5), rnorm(95))))
>> dat
>> ## The total from dat:
>> sum(dat$y, na.rm=T)
>> ## The total from aggregate:
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)  ## <--- This line gave an error in a separate R instance
>> ## The aggregate formula is excluding NA
>>
>> ## So, let's try to include NAs
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='na.pass')$y)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action=na.pass)$y)
>> ## The aggregate formula is STILL excluding NA
>> ## In fact, the formula doesn't seem to notice the na.action
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='foo man chew')$y)
>> ## Hmmmm... that error surprised me (since the previous two things ran)
>>
>> ## So, let's try to change the global options
>> ## (not mentioned in the help, but after reading the help
>> ##  100 times, I thought I would go above and beyond to avoid
>> ##  any r list flames from people complaining
>> ##  that I didn't read the help... but that's a separate topic)
>> options(na.action ="na.pass")
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action='na.pass')$y)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T, na.action=na.pass)$y)
>> ## (NAs are still omitted)
>>
>> ## Even more frustrating...
>> ## Why don't any of these work???
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.pass')$x)
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.pass)$x)
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.omit')$x)
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.omit)$x)
>>
>>
>> ## This does work, but in my real data set, I want NA to really be NA
>> for(j in 1:4)
>>     dat[is.na(dat[,j]),j] = 'NA'
>> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
>> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
>>
>>
>> ## My first session info
>> #
>> #> sessionInfo()
>> #R version 2.12.0 (2010-10-15)
>> #Platform: i386-pc-mingw32/i386 (32-bit)
>> #
>> #locale:
>> # [1] LC_COLLATE=English_United States.1252
>> #[2] LC_CTYPE=English_United States.1252
>> #[3] LC_MONETARY=English_United States.1252
>> #[4] LC_NUMERIC=C
>> #[5] LC_TIME=English_United States.1252
>> #
>> #attached base packages:
>> # [1] stats     graphics  grDevices utils     datasets  methods   base
>> #
>> #other attached packages:
>> # [1] plyr_1.2.1  zoo_1.6-4   gdata_2.8.1 rj_0.5.0-5
>> #
>> #loaded via a namespace (and not attached):
>> # [1] grid_2.12.0     gtools_2.6.2    lattice_0.19-13 rJava_0.8-8
>> #[5] tools_2.12.0
>>
>>
>>
>> I tried running that example in a different version of R, and I got
>> completely different results
>>
>> The other version of R wouldn't recognize the formula at all.
>>
>> My other version of R:
>>
>> # My second session info
>> #> sessionInfo()
>> #R version 2.10.1 (2009-12-14)
>> #i386-pc-mingw32
>> #
>> #locale:
>> # [1] LC_COLLATE=English_United States.1252
>> #[2] LC_CTYPE=English_United States.1252
>> #[3] LC_MONETARY=English_United States.1252
>> #[4] LC_NUMERIC=C
>> #[5] LC_TIME=English_United States.1252
>> #
>> #attached base packages:
>> # [1] stats     graphics  grDevices utils     datasets  methods   base
>> #>
>> #
>>
>> PS: Also, I have read the help on aggregate, factor, as.factor, and several
>> other topics. If I missed something, please let me know.
>> Some people like to reply to questions by telling the sender that R has
>> documentation. Please don't. The R help archives are littered with
>> reminders, friendly and otherwise, of R's documentation.
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>
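
For anyone reading this thread in the archive, here is a small sketch of the distinction Ista points to: na.rm is handed to sum(), while na.action decides which rows reach sum() in the first place. The data frame `toy` and its columns `g` and `y` are made up for illustration and are not from Gene's post. The addNA() step at the end is just one possible way to keep NA as its own group; nobody in the thread proposed it, and it assumes that treating NA as a factor level is acceptable for your data.

```r
# Toy data (hypothetical, not from the original post):
# one NA in the response y, one NA in the grouping variable g.
toy <- data.frame(g = c("a", "a", "b", NA),
                  y = c(1, NA, 3, 4))

# Total straight from the column, ignoring the missing y.
total_raw <- sum(toy$y, na.rm = TRUE)

# Default na.action = na.omit: the rows with NA in y or g are dropped
# before sum() ever runs.
agg_omit <- aggregate(y ~ g, data = toy, FUN = sum)

# na.pass without na.rm: the NA in y now reaches sum(), so group "a" is NA.
agg_pass <- aggregate(y ~ g, data = toy, FUN = sum, na.action = na.pass)

# na.pass with na.rm = TRUE: sum() hides the NA in y again, and the row
# whose group is NA is still left out by the grouping step itself.
agg_pass_rm <- aggregate(y ~ g, data = toy, FUN = sum,
                         na.rm = TRUE, na.action = na.pass)

# One possible way to keep NA as a group of its own: make NA a factor level.
toy_na <- toy
toy_na$g <- addNA(toy_na$g)
agg_keep <- aggregate(y ~ g, data = toy_na, FUN = sum,
                      na.rm = TRUE, na.action = na.pass)

print(total_raw)
print(agg_omit)
print(agg_pass)
print(agg_pass_rm)
print(agg_keep)
```

Comparing sum(agg_pass_rm$y) with total_raw shows the gap Gene noticed: it comes from rows whose grouping variables are NA, not from the NAs in y.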