I would be grateful for any suggestions on the following.

I'm trying to impute data, where a fictitious dataset is defined as...

set.seed(110)
n <- 500
test <- data.frame(smoke_status = rbinom(n, 2, 0.6),
                   smoke_amount = rbinom(n, 2, 0.5),
                   rf1 = rnorm(n),
                   rf2 = rnorm(n),
                   outcome = rbinom(n, 1, 0.3))

# smoke_status (0, 1, 2) is c("non-smoker", "ex-smoker", "current_smoker"), and
# smoke_amount (0, 1, 2) is c("light", "moderate", "heavy")
# rf1 and rf2 are two other risk factors (for illustration purposes - real data set has more risk factors)

# artificially set some of these values to NA
test$smoke_status[sample(1:nrow(test), 60)] <- NA
test$smoke_amount[sample(1:nrow(test), 60)] <- NA
test$rf1[sample(1:nrow(test), 50)] <- NA
test$rf2[sample(1:nrow(test), 50)] <- NA

I want to impute all missing values, but smoke_amount should only be imputed when smoke_status == 2 (i.e. the person is a current smoker; it makes no sense to impute smoke_amount for someone who does not smoke). I can do this in Stata via the conditional option in ice, but I would prefer to stay in R. Is this feasible with mice, mi or Amelia? I thought the passive imputation approach in mice would be the way forward, but so far I have been unsuccessful.
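To make the rule concrete, here is a minimal base-R sketch of which smoke_amount cells I would want imputed. It uses a small toy data frame with made-up values rather than the simulated data above, and the names toy and impute_amount are hypothetical, chosen only for illustration. It does not perform any imputation; it only marks the target cells.

```r
# Toy data with hypothetical values (not the simulated data set above)
toy <- data.frame(
  smoke_status = c(2, 2, 0, 1, NA, 2),
  smoke_amount = c(1, NA, NA, NA, NA, NA)
)

# TRUE where smoke_amount is missing AND the person is a known current smoker.
# Rows with a missing smoke_status are excluded here; in a real run they would
# presumably need smoke_status imputed first before this rule could be applied.
impute_amount <- is.na(toy$smoke_amount) &
  !is.na(toy$smoke_status) &
  toy$smoke_status == 2

# Row indices of the cells that should be imputed
which(impute_amount)
```

Every other missing smoke_amount (non-smokers and ex-smokers) should be left alone rather than filled in.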

Thanks in advance.

Gary

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.