On 31-Jan-11 04:17:45, David Winsemius wrote:
>
> On Jan 30, 2011, at 10:01 PM, assaedi76 assaedi76 wrote:
>
>> R users:
>> Thanks in advance
>> How to generate missing at random (MAR)?
[DW]:
>   missidx <- sample(1:nrow(dfrm), nrow(dfrm)*frac)
>   is.na(dfrm$measure) <- 1:nrow(dfrm) %in% missidx
>
>> assaed...@yahoo.com
>> Thanks

That solution is for (in "missing data language") MCAR (Missing
Completely At Random), i.e. the probability of being missing does
not depend on any of the variables in the data.

For MAR (Missing At Random), the probability of being missing
may depend on the values of covariates but must not depend on
the value of the outcome variable.

So the way to generate MAR, for data where there are covariates
X1, X2, ... , Xk (and outcome Y), is to set up a function P (could
be anything) of some or all of X1, X2, ... , Xk taking values in
[0,1] (endpoints included), and then set a "missing" variable Z
to be 0 (not missing) or 1 (missing) with probability given by
the value of P for that case.

So, if M is a data matrix with columns X1, ... , Xk , Y where
each row is a case, use apply() to evaluate the function P()
for each row in terms of (X1,X2,...,Xk). You then get a vector

  p = c(p.1, p.2, ... , p.N)

of values of P for the N rows of M. At this point:

  Z <- 1*( runif(N) <= p )

creates a vector of 0s and 1s which will be markers of
Missing At Random.

Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
Fax-to-email: +44 (0)870 094 0861
Date: 31-Jan-11                                       Time: 10:17:20
------------------------------ XFMail ------------------------------
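
The MAR recipe above can be sketched in R roughly as follows. This is
only an illustration under stated assumptions: the names N, M, X1, X2,
Y, P, p and Z follow the notation of the post, but the simulated data,
the logistic form chosen for P() with its coefficients, and the column
name Y_obs are all made up for the example. Any function of the
covariates with values in [0,1] would serve equally well.

```r
set.seed(42)  # only so that the illustration is reproducible

# Hypothetical example data: two covariates and an outcome Y
N <- 1000
M <- data.frame(X1 = rnorm(N), X2 = rbinom(N, 1, 0.5))
M$Y <- 1 + 2 * M$X1 - M$X2 + rnorm(N)

# P(): an arbitrary (assumed) function of the covariates only, with
# values in [0,1]. Here a logistic curve: cases with larger X1, or
# with X2 == 1, are more likely to be missing.
P <- function(x) plogis(-1.5 + 1.0 * x[["X1"]] + 0.5 * x[["X2"]])

# Evaluate P() for each row, using the covariate columns only
p <- apply(as.matrix(M[, c("X1", "X2")]), 1, P)

# Z = 1 marks a case as missing, with probability p for that case
Z <- 1 * (runif(N) <= p)

# Keep the complete outcome, and blank out the marked cases in a copy
M$Y_obs <- M$Y
is.na(M$Y_obs) <- Z == 1

# Observed missingness rate by sign of X1: it should differ clearly
# between the groups, unlike in the MCAR case
tapply(Z, M$X1 > 0, mean)
```

Since P() never looks at Y, the missingness depends on the covariates
alone, which is the MAR condition as described above. Replacing P() by
a constant, e.g. function(x) 0.2, gives MCAR again.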