Hello,
I would like to ask for assistance with an aggregate sum.  I have a data 
set consisting of two grouping variables (id, visit) and several other 
variables.  I would like to sum the variables for each id and visit, but am 
having problems with na.rm.  na.rm=TRUE seems to replace all NAs with zeros, or, 
better stated, results in a zero when summing a set made up entirely of NAs.  I 
would like to remove NAs when only some values in a group are NA 
(1 + NA + NA = 1 or NA + 1 + 1 = 2), but keep the NA if the entire group 
consists of NAs (NA + NA + NA = NA).

I have created a truncated example (my data set has many more rows):

example <-
structure(list(id = c(4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 43L, 43L, 43L, 43L, 43L, 43L, 43L, 43L, 43L, 43L, 
43L, 43L), visit = c(3L, 3L, 3L, 3L, 5L, 5L, 5L, 9L, 9L, 9L, 
9L, 12L, 12L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 9L, 9L, 12L, 12L, 
12L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 12L, 12L, 12L, 12L), var1 = c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L), var2 = c(1L, NA, NA, NA, NA, 1L, NA, 0L, 
1L, 0L, NA, 0L, NA, NA, NA, 1L, 0L, NA, 0L, 1L, 1L, 1L, NA, 1L, 
0L, NA, NA, 1L, 1L, 0L, NA, 1L, NA, NA, 0L, 1L, 1L, 1L), var3 = c(NA, 
NA, NA, NA, 1L, 0L, 1L, NA, 1L, 1L, NA, 0L, 1L, NA, 0L, 1L, NA, 
1L, 0L, NA, 1L, 0L, 1L, NA, NA, NA, NA, 1L, 1L, 1L, 1L, 0L, 1L, 
NA, NA, NA, NA, 1L), var4 = c(0L, 1L, NA, NA, NA, 1L, 1L, 0L, 
NA, NA, 0L, 1L, 1L, NA, 1L, 1L, 0L, 1L, 1L, NA, 1L, NA, 0L, 0L, 
0L, NA, NA, NA, NA, NA, NA, 1L, 1L, NA, 0L, 0L, 0L, NA)), .Names = c("id", 
"visit", "var1", "var2", "var3", "var4"), class = "data.frame", row.names = 
c(NA, -38L))
example<-as.data.frame(example)

# generates 0s for groups with all NAs, such as id 43, visit 3, var4, which I
# would like to be NA
agex1 <- aggregate.data.frame(example, 
by=list(example$id,example$visit),FUN=sum,na.rm=TRUE)

# returns NA for every group that contains any NA, which discards many data
# that I would like to analyze; too many NAs
agex2 <- aggregate.data.frame(example, by=list(example$id,example$visit),FUN=sum)

na.action does not seem to work with data frames in this instance.  I have 
tried to create a function to fix this, but have had great difficulty.  I have 
thought about ddply but cannot figure out how to apply it.  Would anyone be 
able to suggest an alternate means of summing by group that ignores NAs when a 
group contains at least one non-missing value, but returns NA when the entire 
group consists of NAs?  I would be very grateful for a suggestion for an 
alternate way to process these data.
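
To make the rule described above concrete, here is a minimal sketch of the 
result I am after.  The helper name sum_keep_na and the small data frame toy 
are hypothetical, made up only for this illustration; this is not a tested 
solution for the full data set.

```r
# Hypothetical helper: sum while ignoring NAs, but return NA when every
# value in the group is NA (x[1] is then an NA of the same type as x).
sum_keep_na <- function(x) {
  if (all(is.na(x))) x[1] else sum(x, na.rm = TRUE)
}

# Made-up toy data: group id 1 has some NAs, group id 2 has only NAs.
toy <- data.frame(id    = c(1L, 1L, 1L, 2L, 2L),
                  visit = c(3L, 3L, 3L, 3L, 3L),
                  var   = c(1L, NA, NA, NA, NA))

# Sum var within each id/visit combination using the helper.
res <- aggregate(toy["var"],
                 by = list(id = toy$id, visit = toy$visit),
                 FUN = sum_keep_na)
print(res)
```

With this toy input, the desired result is var = 1 for id 1 and var = NA for 
id 2, rather than the 0 that sum(..., na.rm=TRUE) gives for an all-NA group.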

Thanks, Matt

