Thank you both for the thoughtful (and funny) replies.

I agree with both of you that sum, not aggregate, is the one picking up
na.rm. Although I didn't mention it, I realized that from the start.
Also, thank you, Phil, for pointing out that aggregate only accepts a formula
in more recent versions! I thought that was an older feature, but I must be
thinking of other functions.

I still don't see why these two values are not the same!

It seems like a bug to me:

> set.seed(100)
> dat=data.frame(
+         x1=sample(c(NA,'m','f'), 100, replace=TRUE),
+         x2=sample(c(NA, 1:10), 100, replace=TRUE),
+         x3=sample(c(NA,letters[1:5]), 100, replace=TRUE),
+         x4=sample(c(NA,T,F), 100, replace=TRUE),
+         y=sample(c(rep(NA,5), rnorm(95))))
> sum(dat$y, na.rm=T)
[1] 0.0815244116598
> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.pass, na.rm=T)$y)
[1] -4.45087666247
>
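
One check that may help narrow this down is to total y over only the rows
where none of the grouping variables is NA. This is a sketch under an
assumption that nobody in this thread has confirmed: that aggregate drops
rows with an NA in x1..x4 even when na.action=na.pass keeps them in the model
frame. The names complete_rows, total_complete and total_aggregate are made
up for illustration, and the totals will depend on the R version's sampling.

```r
set.seed(100)
dat <- data.frame(
  x1 = sample(c(NA, "m", "f"), 100, replace = TRUE),
  x2 = sample(c(NA, 1:10), 100, replace = TRUE),
  x3 = sample(c(NA, letters[1:5]), 100, replace = TRUE),
  x4 = sample(c(NA, TRUE, FALSE), 100, replace = TRUE),
  y  = sample(c(rep(NA, 5), rnorm(95)))
)

# Rows where every grouping variable (x1..x4) is non-missing
complete_rows <- complete.cases(dat[, c("x1", "x2", "x3", "x4")])

# Total of y over those rows only (NAs in y itself are skipped)
total_complete <- sum(dat$y[complete_rows], na.rm = TRUE)

# Total from the formula interface, with NAs passed through to sum()
total_aggregate <- sum(
  aggregate(y ~ x1 + x2 + x3 + x4, data = dat, FUN = sum,
            na.action = na.pass, na.rm = TRUE)$y
)

# TRUE would support the assumption; FALSE would point elsewhere
isTRUE(all.equal(total_complete, total_aggregate))
```

If the comparison comes back TRUE, the gap between the two totals would be
the y values sitting in rows with a missing grouping variable, rather than
anything na.action does to y.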



On Fri, Feb 4, 2011 at 4:18 PM, Ista Zahn <iz...@psych.rochester.edu> wrote:

> Sorry, I didn't see Phil's reply, which is better than mine anyway.
>
> -Ista
>
> On Fri, Feb 4, 2011 at 5:16 PM, Ista Zahn <iz...@psych.rochester.edu> wrote:
> > Hi,
> >
> > Please see ?na.action
> >
> > (just kidding!)
> >
> > So it seems to me the problem is that you are passing na.rm to the sum
> > function. So there is no missing data for the na.action argument to
> > operate on!
> >
> > Compare
> >
> > sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.fail)$y)
> > sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.pass)$y)
> > sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.action=na.omit)$y)
> >
> >
> > Best,
> > Ista
> >
> > On Fri, Feb 4, 2011 at 4:07 PM, Gene Leynes <gleyne...@gmail.com> wrote:
> >> Can someone please tell me what is up with na.action in aggregate?
> >>
> >> My (somewhat) reproducible example:
> >> (I say somewhat because some lines wouldn't run in a separate session;
> >> more below)
> >>
> >> set.seed(100)
> >> dat=data.frame(
> >>        x1=sample(c(NA,'m','f'), 100, replace=TRUE),
> >>        x2=sample(c(NA, 1:10), 100, replace=TRUE),
> >>        x3=sample(c(NA,letters[1:5]), 100, replace=TRUE),
> >>        x4=sample(c(NA,T,F), 100, replace=TRUE),
> >>        y=sample(c(rep(NA,5), rnorm(95))))
> >> dat
> >> ## The total from dat:
> >> sum(dat$y, na.rm=T)
> >> ## The total from aggregate:
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
> >> ## ^--- This line gave an error in a separate R instance
> >> ## The aggregate formula is excluding NA
> >>
> >> ## So, let's try to include NAs
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T,
> >>     na.action='na.pass')$y)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T,
> >>     na.action=na.pass)$y)
> >> ## The aggregate formula is STILL excluding NA
> >> ## In fact, the formula doesn't seem to notice the na.action
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T,
> >>     na.action='foo man chew')$y)
> >> ## Hmmmm... that error surprised me (since the previous two things ran)
> >>
> >> ## So, let's try to change the global options
> >> ## (not mentioned in the help, but after reading the help
> >> ##  100 times, I thought I would go above and beyond to avoid
> >> ##  any r list flames from people complaining
> >> ##  that I didn't read the help... but that's a separate topic)
> >> options(na.action ="na.pass")
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T,
> >>     na.action='na.pass')$y)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T,
> >>     na.action=na.pass)$y)
> >> ## (NAs are still omitted)
> >>
> >> ## Even more frustrating...
> >> ## Why don't any of these work???
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.pass')$x)
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.pass)$x)
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action='na.omit')$x)
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T, na.action=na.omit)$x)
> >>
> >>
> >> ## This does work, but in my real data set, I want NA to really be NA
> >> for(j in 1:4)
> >>    dat[is.na(dat[,j]),j] = 'NA'
> >> sum(aggregate(dat$y, dat[,1:4], sum, na.rm=T)$x)
> >> sum(aggregate(y~x1+x2+x3+x4, data=dat, sum, na.rm=T)$y)
> >>
> >>
> >> ## My first session info
> >> #
> >> #> sessionInfo()
> >> #R version 2.12.0 (2010-10-15)
> >> #Platform: i386-pc-mingw32/i386 (32-bit)
> >> #
> >> #locale:
> >> #        [1] LC_COLLATE=English_United States.1252
> >> #[2] LC_CTYPE=English_United States.1252
> >> #[3] LC_MONETARY=English_United States.1252
> >> #[4] LC_NUMERIC=C
> >> #[5] LC_TIME=English_United States.1252
> >> #
> >> #attached base packages:
> >> #        [1] stats     graphics  grDevices utils     datasets  methods   base
> >> #
> >> #other attached packages:
> >> #        [1] plyr_1.2.1  zoo_1.6-4   gdata_2.8.1 rj_0.5.0-5
> >> #
> >> #loaded via a namespace (and not attached):
> >> #        [1] grid_2.12.0     gtools_2.6.2    lattice_0.19-13 rJava_0.8-8
> >> #[5] tools_2.12.0
> >>
> >>
> >>
> >> I tried running that example in a different version of R, and I got
> >> completely different results.
> >>
> >> The other version of R wouldn't recognize the formula at all.
> >>
> >> My other version of R:
> >>
> >> #  My second session info
> >> #> sessionInfo()
> >> #R version 2.10.1 (2009-12-14)
> >> #i386-pc-mingw32
> >> #
> >> #locale:
> >> #        [1] LC_COLLATE=English_United States.1252
> >> #[2] LC_CTYPE=English_United States.1252
> >> #[3] LC_MONETARY=English_United States.1252
> >> #[4] LC_NUMERIC=C
> >> #[5] LC_TIME=English_United States.1252
> >> #
> >> #attached base packages:
> >> #        [1] stats     graphics  grDevices utils     datasets  methods   base
> >> #>
> >> #
> >>
> >> PS: Also, I have read the help on aggregate, factor, as.factor, and
> >> several other topics.  If I missed something, please let me know.
> >> Some people like to reply to questions by telling the sender that R has
> >> documentation.  Please don't.  The R help archives are littered with
> >> reminders, friendly and otherwise, of R's documentation.
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
> >
> > --
> > Ista Zahn
> > Graduate student
> > University of Rochester
> > Department of Clinical and Social Psychology
> > http://yourpsyche.org
> >
>
>
>
> --
> Ista Zahn
> Graduate student
> University of Rochester
> Department of Clinical and Social Psychology
> http://yourpsyche.org
>
