How about this:

> Y <- as.data.frame(matrix(c("c","d",NA,4),2,2), stringsAsFactors=FALSE)
> X <- as.data.frame(matrix(c("a","b",1,2),2,2), stringsAsFactors=FALSE)
> Y
  V1   V2
1  c <NA>
2  d    4
> X
  V1 V2
1  a  1
2  b  2
> Y[] <- lapply(seq(ncol(Y)), function(.col){
+     ifelse(is.na(Y[,.col]), X[,.col], Y[,.col])
+ })
>
> Y
  V1 V2
1  c  1
2  d  4
>
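
If you want this as a reusable function, something like the rough sketch below might do. The name fill_na_from and its arguments are made up for illustration; they are not from any package. It works one column at a time, so an integer column stays integer. As far as I know, ifelse() does not keep factor attributes, so the sketch handles factors separately by first widening the levels of the target column. That is only one possible way to treat mismatched levels, and it assumes both data frames have the same dimensions and column order.

```r
# Hypothetical helper (name and arguments are illustrative only).
# Fills NA cells of `y` with the matching cells of `x`, column by column,
# so that each column keeps its own type.
fill_na_from <- function(y, x, cols = seq_len(ncol(y))) {
  stopifnot(identical(dim(x), dim(y)))
  for (i in cols) {
    nas <- which(is.na(y[[i]]))
    if (length(nas) == 0) next
    if (is.factor(y[[i]])) {
      # Assumption: extend y's levels with x's levels first, so that
      # assigning a level unknown to y does not silently become NA.
      levels(y[[i]]) <- union(levels(y[[i]]), levels(x[[i]]))
      y[[i]][nas] <- as.character(x[[i]][nas])
    } else {
      y[[i]][nas] <- x[[i]][nas]
    }
  }
  y
}

# Example with the two-by-two case, V2 stored as integer
X <- data.frame(V1 = c("a", "b"), V2 = c(1L, 2L), stringsAsFactors = FALSE)
Y <- data.frame(V1 = c("c", "d"), V2 = c(NA, 4L), stringsAsFactors = FALSE)
Z <- fill_na_from(Y, X)
str(Z)  # V2 should still be of type int
```

Pass cols = 38:47 (or any other index vector) to restrict the fill to a subset of columns.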

On Thu, Jan 22, 2009 at 10:44 PM, Mike Miller <mbmil...@taxa.epi.umn.edu> wrote:
> On Thu, 22 Jan 2009, Mike Miller wrote:
>
>> Suppose X and Y are two data frames with the same structures, variable
>> names and dimensions but with different data and different patterns of
>> missing.  I want to replace missing values in Y with corresponding values
>> from X.  I'll construct a simple two-by-two case:
>>
>>> X <- as.data.frame(matrix(c("a","b",1,2),2,2), stringsAsFactors=FALSE)
>>> X[,2] <- as.integer(X[,2])
>>> str(X)
>>
>> 'data.frame':   2 obs. of  2 variables:
>>  $ V1: chr  "a" "b"
>>  $ V2: int  1 2
>>
>>> Y <- as.data.frame(matrix(c("c","d",NA,4),2,2), stringsAsFactors=FALSE)
>>> Y[,2] <- as.integer(Y[,2])
>>> str(Y)
>>
>> 'data.frame':   2 obs. of  2 variables:
>>  $ V1: chr  "c" "d"
>>  $ V2: int  NA 4
>>
>> This seems to be what I want to do...
>>
>>> Y[is.na(Y)] <- X[is.na(Y)]
>>
>> ...and it works except that the structure of Y is changed so that Y$V2 is
>> now of type chr instead of type int:
>>
>>> str(Y)
>>
>> 'data.frame':   2 obs. of  2 variables:
>>  $ V1: chr  "c" "d"
>>  $ V2: chr  "1" "4"
>
>
> I figured out a good answer.  We can just decide the list of columns we want
> to work with and then use a for loop.  This avoids problems with changing
> variable types:
>
> cols <- 38:47
> keep <- is.na(Y)
> for (i in cols) { nas <- which(keep[,i]); if ( length(nas) > 0 ) { Y[nas,i] <- X[nas,i] }}
>
> Something like that makes for a good one-liner on the interactive command
> line, but this looks neater in a script:
>
> cols <- 38:47
> keep <- is.na(Y)
> for (i in cols) {
>    nas <- which(keep[,i])
>    if ( length(nas) > 0 ) {
>       Y[nas,i] <- X[nas,i]
>    }
> }
>
> It shouldn't be too hard to write a function that does that kind of thing.
>
> The only problem I know of is that if X and Y don't have exactly the same
> levels for factors, if there are factors, there could be problems.
> It would probably take a few more lines to deal with this.
>
> A couple of people wrote to me with helpful suggestions, but no one had a
> really great, established kind of solution.  I'm a little surprised.  But,
> with an average of 125 messages per day (!) on this list, I shouldn't be
> surprised that a long message like this one won't be read by everyone.
>
> Best,
> Mike
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?