Hi,
I have the following data.
> set.seed(6245)
> data <- data.frame(x1=rnorm(5),x2=rnorm(5),x3=rnorm(5),x4=rnorm(5))
> round(data,digits=3)
      x1     x2     x3     x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256 -0.069  0.354
4 -0.086  0.475  0.244  0.781
5  0.690 -0.181  1.274  1.633
What I would like to do is set 20% of the values to missing (here, 4 of
the 20 values), with all of the missing values coming from x3 and x4 only.
They don't have to be split evenly: I don't need exactly 2 dropped from x3
and 2 from x4, and I don't need each observation to be missing on at most
one variable. I just want 20% of the data removed, through x3 and x4 only.
In other words,
      x1     x2     x3     x4
1  0.482  1.320 -0.859     NA
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256     NA  0.354
4 -0.086  0.475     NA  0.781
5  0.690 -0.181     NA  1.633
OR
      x1     x2     x3     x4
1  0.482  1.320     NA -0.142
2 -0.753 -0.041 -0.063  0.886
3  0.028 -0.256     NA     NA
4 -0.086  0.475  0.244     NA
5  0.690 -0.181  1.274  1.633
OR
      x1     x2     x3     x4
1  0.482  1.320 -0.859 -0.142
2 -0.753 -0.041 -0.063     NA
3  0.028 -0.256 -0.069     NA
4 -0.086  0.475  0.244     NA
5  0.690 -0.181  1.274     NA
and so on. Any of these would be fine.
Any ideas how I can do this?
Chris
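
One possible approach, offered only as a minimal sketch: treat the x3 and
x4 columns as a single pool of 10 cells and use sample() to pick 4 of them
(20% of all 20 values) to set to NA. The names drop_cols, prop, n_drop and
m are made up for illustration, and rounding prop * (number of cells) to a
whole number is an assumption about what "20%" should mean when it does not
divide evenly.

```r
set.seed(6245)
data <- data.frame(x1 = rnorm(5), x2 = rnorm(5), x3 = rnorm(5), x4 = rnorm(5))

# Columns allowed to lose values, and the share of ALL cells to drop
# (both names are illustrative, not from the original post)
drop_cols <- c("x3", "x4")
prop <- 0.20

# Number of cells to blank out: 20% of 5 x 4 = 20 cells, i.e. 4
n_drop <- round(prop * nrow(data) * ncol(data))

# Fail early if the eligible columns do not hold that many cells
stopifnot(n_drop <= nrow(data) * length(drop_cols))

# Work on the eligible columns as a matrix so their cells can be
# indexed 1..10, then pick n_drop distinct cells without replacement
m <- as.matrix(data[drop_cols])
m[sample(length(m), n_drop)] <- NA

# Write the modified columns back; x1 and x2 are never touched
data[drop_cols] <- as.data.frame(m)

round(data, digits = 3)
```

Which 4 cells end up missing depends on the random draw, so the result
will be one of the acceptable patterns above rather than a specific one.
Sampling from the pooled cells means the split between x3 and x4 is left
to chance, which matches the "doesn't have to be evenly" requirement.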