Hello R users!

I have several data frames where some of the variables have many missing observations. For example, Q1 in one of my data frames has over 66% of its observations missing.

I have tried imputation with mice, but it fails for some of the data frames, with this message or a similar one:

 iter imp variable
  1   1  Q1  Q2  Q3  Q4  Q5  Q6  Q7  Q8  Q9  Q10  Q11  Q12  Q13  Q14  Q15  Q19  Q36  Q47  Q52  Q79  Q80  Q94  Q97  Q104  Q108  Q122  Q131  Q134  P1  P2  P3  P4  P5  P6
Error in solve.default(xtx + diag(pen)) :
  system is computationally singular: reciprocal condition number = 1.83044e-16
In addition: Warning messages:
1: In sqrt((sum(residuals^2))/(sum(ry) - ncol(x) - 1)) : NaNs produced
...
7: In sqrt((sum(residuals^2))/(sum(ry) - ncol(x) - 1)) : NaNs produced

Note: warnings 2 to 6 suppressed by me.

I would like to try a different approach where I delete from the data frame the variables that have more than 50% of their observations missing (the actual cutoff might change). I have already deleted the variables that were entirely missing, using the following code, which was kindly suggested by one of you:

## Data frame after removing any blank columns:
dfQ <- dfQtemp[ , sapply(dfQtemp, function(x) !all(is.na(x)))]

Any ideas or suggestions for deleting variables with partially missing data?

Thanks and have a great weekend!

Rita

=====================================
"If you think education is expensive, try ignorance."--Derek Bok
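
One possible way to do what the question asks, as a minimal sketch rather than a definitive answer: compute the proportion of NAs in each column and keep only the columns at or below a cutoff. The toy data frame and the name max_missing are assumptions made for illustration; dfQtemp and dfQ are the names used in the post. It relies only on base R (is.na, colMeans).

```r
# Toy data frame standing in for dfQtemp (hypothetical values, illustration only)
dfQtemp <- data.frame(
  Q1 = c(NA, NA, NA, 4),    # 75% missing
  Q2 = c(1, NA, 3, 4),      # 25% missing
  Q3 = c(NA, NA, NA, NA),   # 100% missing
  Q4 = c(1, 2, NA, NA)      # 50% missing
)

# Hypothetical cutoff: a variable is dropped when its share of NAs exceeds this
max_missing <- 0.5

# Proportion of missing observations in each column
prop_missing <- colMeans(is.na(dfQtemp))

# Keep columns at or below the cutoff; drop = FALSE keeps the result a data frame
# even if only one column survives
dfQ <- dfQtemp[ , prop_missing <= max_missing, drop = FALSE]

names(dfQ)
```

Because the comparison is `<=`, a column with exactly 50% missing is kept, matching "more than 50%" in the question. Changing max_missing adjusts the cutoff, and setting it just below 1 (for example 0.999) reproduces the earlier all-missing filter.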