Hi Blaz,

What do you do if the number of values sampled to be set missing (e.g., 4) is greater than the number of values in a given case that fall below your threshold (< 3)? If no special considerations are needed for that, I do not see why you cannot apply to MNAR the same technique you used below with MCAR.
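A minimal sketch of how that could look, assuming that when a case has fewer eligible values than the number drawn you simply cap the draw at what is available. That cap is my assumption, not something settled in this thread. It reuses the 7 x 7 example matrix and the < 3 cutoff from the earlier message; `m`, `value`, `below`, `nCases` and `xMiss` are names made up for illustration.

```r
## Same 7 x 7 matrix as in the earlier example, written compactly
x <- matrix(c(rep(1:5, 9), 3, 3, 3, 4), nrow = 7, ncol = 7, byrow = TRUE)
pMiss <- 30    # percent of cases to receive missing values
m <- 4         # assumed maximum number of missing values per selected case
value <- 3     # cutoff: only values below this may be set to NA

below <- x < value                        # logical matrix of eligible cells
candidate <- which(rowSums(below) > 0)    # cases with at least one eligible value
nCases <- min(length(candidate), floor(nrow(x) * pMiss / 100))
idMiss <- candidate[sample.int(length(candidate), nCases)]

## For each sampled case, draw how many values to drop (1..m),
## capped at the number of eligible values in that case (assumption).
## sample.int() on positions avoids sample()'s special handling of a
## length-one numeric vector.
ids <- do.call(rbind, lapply(idMiss, function(i) {
  eligible <- which(below[i, ])
  k <- min(sample.int(m, 1), length(eligible))
  cbind(i, eligible[sample.int(length(eligible), k)])
}))

xMiss <- x
xMiss[ids] <- NA    # two-column (row, column) matrix indexing
xMiss
```

Note that capping shifts the distribution of the number of missing values per case toward smaller counts whenever a case has fewer than `m` eligible values. If that matters, another option would be to restrict the candidates to cases with at least `m` values below the cutoff.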
Best regards,

Josh

On Tue, Jun 7, 2011 at 12:17 AM, Blaz Simcic <blazsim...@yahoo.com> wrote:
> Josh,
>
> thanks for the answer, it really helped me. I have another question, if you
> maybe know how to do it.
>
> I would also like to sample the number of missing values within selected cases,
> as I did with MCAR (see below).
>
> Can you help me with this?
>
> Thanks,
>
> Blaz from Slovenia
>
> Here is my code for MCAR:
>
> N <- 1000 #### number of cases
>
> n <- 12 #### number of variables
>
> X <- matrix(rnorm(N * n), N, n) #### matrix
>
> pMiss <- 0.20 #### proportion of cases with missing values
>
> idMiss <- sample(1:N, N * pMiss) #### sample cases
>
> nMiss <- length(idMiss)
>
> m <- 3 #### maximum number of missing values within selected cases
>
> howmanyMiss <- sapply(idMiss, function(x) sample(1:m, 1))
>
> howmanyMiss #### number of missing values within selected cases
>
> varMiss <- lapply(howmanyMiss, function(x) sample(1:n, x)) #### which values are missing
>
> ids <- cbind(rep(idMiss, howmanyMiss), unlist(varMiss))
>
> Xmiss <- X
>
> Xmiss[ids] <- NA
>
> Xmiss
>
> ________________________________
> From: Joshua Wiley <jwiley.ps...@gmail.com>
> To: Blaz Simcic <blazsim...@yahoo.com>
> Cc: r-help@r-project.org
> Sent: Mon, June 6, 2011 10:34:38 PM
> Subject: Re: [R] Not missing at random
>
> Hi Blaz,
>
> See below.
>
> x <- matrix(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
>             nrow = 7, ncol = 7, byrow = TRUE) #### matrix
>
> pMiss <- 30 #### percent of missing values
>
> N <- dim(x)[1] #### number of cases
>
> candidate <- which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 |
>                    x[,7]<3) #### I want to sample all cases with at least 1 value
>                             #### lower than 3, so I have to find candidates
>
> ## easier to use this
> ## find all x < 3 and return their row and column indices
> ## select only row indices, and then find unique
> candidate <- unique(which(x < 3, arr.ind = TRUE)[, "row"])
>
> idMiss <- sample(candidate, N * pMiss / 100) #### I sampled cases
>
> ## from the subset of x cases that will be missing
> ## find all that are < 3 and set to NA
> x[idMiss, ][x[idMiss, ] < 3] <- NA
>
> ## If you are going to do this a lot, consider a function
> nmar <- function(x, op = "<", value = 3, p = 30) {
>   op <- get(op)
>   candidate <- unique(which(op(x, value), arr.ind = TRUE)[, "row"])
>   idMiss <- sample(candidate, nrow(x) * p / 100)
>   x[idMiss, ][op(x[idMiss, ], value)] <- NA
>   return(x)
> }
>
> nmar(x)
>
> ## has the advantage that you can easily change
> ## p, the cut off value, the operator (e.g., "<", ">", "<=", etc.)
>
> Cheers,
>
> Josh
>
> On Sun, Jun 5, 2011 at 11:17 PM, Blaz Simcic <blazsim...@yahoo.com> wrote:
>>
>> Hello!
>>
>> I would like to sample 30 % of cases (with at least 1 value lower than 3 in
>> the row) and among them I want to set all values lower than 3 (within
>> selected cases) as NA (NMAR- Not missing at random). I managed to sample
>> cases, but I don’t know how to set values (lower than 3) as NA.
>>
>> R code:
>>
>> x <- matrix(c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,3,3,3,4),
>>             nrow = 7, ncol = 7, byrow = TRUE) #### matrix
>>
>> pMiss <- 30 #### percent of missing values
>>
>> N <- dim(x)[1] #### number of cases
>>
>> candidate <- which(x[,1]<3 | x[,2]<3 | x[,3]<3 | x[,4]<3 | x[,5]<3 | x[,6]<3 |
>>                    x[,7]<3) #### I want to sample all cases with at least 1 value
>>                             #### lower than 3, so I have to find candidates
>>
>> idMiss <- sample(candidate, N * pMiss / 100) #### I sampled cases
>>
>> Now I'd like to set all values lower than 3 among the sampled cases as NA.
>>
>> Any suggestion?
>>
>> Thanks,
>> Blaž
>> [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> University of California, Los Angeles
> http://www.joshuawiley.com/