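I think several subsets meet your criteria. Here is one way to build such a subset: keep every row in which at least one variable shows a level for the first time. This covers every level of every variable, although it does not guarantee the smallest possible subset: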

dat <- structure(list(Id = 1:20,
    v1 = c(1L, 2L, 4L, 1L, 3L, 3L, 3L, 4L, 1L, 4L, 2L, 1L, 2L, 4L, 3L, 2L, 1L, 2L, 4L, 3L),
    v2 = c(2L, 1L, 2L, 1L, 2L, 1L, 4L, 4L, 2L, 1L, 4L, 4L, 3L, 3L, 2L, 3L, 4L, 3L, 1L, 3L),
    v3 = c(4L, 3L, 4L, 2L, 3L, 1L, 3L, 4L, 2L, 1L, 3L, 2L, 3L, 1L, 1L, 2L, 1L, 4L, 4L, 2L),
    v4 = c(3L, 4L, 2L, 3L, 4L, 1L, 1L, 4L, 1L, 2L, 2L, 3L, 4L, 2L, 2L, 3L, 4L, 3L, 1L, 1L)),
    .Names = c("Id", "v1", "v2", "v3", "v4"),
    class = "data.frame", row.names = c(NA, -20L))

# For each variable, flag the rows where a value appears for the first time,
# then count per row how many variables introduce a new level there.
keep <- rowSums(apply(dat[, -1], 2, function(x) !duplicated(x)))

# Keep every row that introduces at least one new level.
dat.sub <- dat[keep > 0, ]

Best,
Ista

On Sun, Jan 23, 2011 at 12:43 PM, Wei Yang <peterwya...@gmail.com> wrote:
> Dear all,
>
> I would like to ask whether anyone has experience with the problem below.
>
> I want to select a subset of the sample (see data below) so that each level
> (1,2,3,4 in the example) for every variable (v1,v2,v3,v4 in the example) is
> shown at least once in the subset. I also want the sample size of the
> subset to be as small as possible. Any help on it is greatly appreciated.
>
>       Id v1 v2 v3 v4
>  [1,]  1  1  2  4  3
>  [2,]  2  2  1  3  4
>  [3,]  3  4  2  4  2
>  [4,]  4  1  1  2  3
>  [5,]  5  3  2  3  4
>  [6,]  6  3  1  1  1
>  [7,]  7  3  4  3  1
>  [8,]  8  4  4  4  4
>  [9,]  9  1  2  2  1
> [10,] 10  4  1  1  2
> [11,] 11  2  4  3  2
> [12,] 12  1  4  2  3
> [13,] 13  2  3  3  4
> [14,] 14  4  3  1  2
> [15,] 15  3  2  1  2
> [16,] 16  2  3  2  3
> [17,] 17  1  4  1  4
> [18,] 18  2  3  4  3
> [19,] 19  4  1  4  1
> [20,] 20  3  3  2  1
>
> Thanks,
>
> Peter
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
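
The first-occurrence rule above always covers every level, but it can keep more rows than necessary. One way to get closer to the smallest subset is a greedy set-cover heuristic: repeatedly pick the row that covers the most still-missing (variable, level) pairs. What follows is only a sketch under assumptions: `greedy_cover` is a hypothetical helper name, the data are retyped from the table in the original question, and greedy selection is a heuristic that is not guaranteed to find the true minimum.

```r
# Data retyped from the table in the original question (20 rows, 4 variables).
dat <- data.frame(
  Id = 1:20,
  v1 = c(1, 2, 4, 1, 3, 3, 3, 4, 1, 4, 2, 1, 2, 4, 3, 2, 1, 2, 4, 3),
  v2 = c(2, 1, 2, 1, 2, 1, 4, 4, 2, 1, 4, 4, 3, 3, 2, 3, 4, 3, 1, 3),
  v3 = c(4, 3, 4, 2, 3, 1, 3, 4, 2, 1, 3, 2, 3, 1, 1, 2, 1, 4, 4, 2),
  v4 = c(3, 4, 2, 3, 4, 1, 1, 4, 1, 2, 2, 3, 4, 2, 2, 3, 4, 3, 1, 1)
)
vars <- c("v1", "v2", "v3", "v4")

# Hypothetical helper (not from the original post): greedy set cover over
# (variable, level) pairs. Returns the selected rows of `d`.
greedy_cover <- function(d, vars) {
  vals <- as.matrix(d[vars])
  # Label every cell as "variable=level"; one row of `cells` per data row.
  cells <- matrix(paste(rep(vars, each = nrow(d)), vals, sep = "="),
                  nrow = nrow(d))
  # Every non-missing (variable, level) pair must end up in the subset.
  uncovered <- unique(cells[!is.na(vals)])
  chosen <- integer(0)
  while (length(uncovered) > 0) {
    # Number of still-uncovered pairs that each row would add.
    gain <- apply(cells, 1, function(r) sum(r %in% uncovered))
    best <- which.max(gain)  # ties go to the earliest row
    chosen <- c(chosen, best)
    uncovered <- setdiff(uncovered, cells[best, ])
  }
  d[sort(chosen), ]
}

dat.greedy <- greedy_cover(dat, vars)
print(dat.greedy)

# Check that each variable shows all four levels in the subset.
sapply(vars, function(v) setequal(dat.greedy[[v]], 1:4))
```

Each row contributes exactly one level per variable and there are four levels, so no valid subset can have fewer than four rows. That gives a lower bound against which to compare the size of `dat.greedy`.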