Dear R-users:

I'm having difficulty creating a new data frame that is a subset of an 
existing data frame, created by randomly selecting subsets of observations 
based on the values of variables within that data frame. 

Here's an example of what my data frame looks like:

fact    x1      x2      x3      select...
blue    23      2.2     1.1     1
blue    28      4.2     0.8     0
blue    34      2.8     0.9     0
...
red     43      6.2     1.4     0       
red     33      5.2     1.5     1
red     35      4.2     1.6     1
...
green   22      3.5     1.1     0
green   21      4.5     1.3     0
green   33      6.5     1.7     0
green   12      4.4     1.9     0
...

There are hundreds of different values (i.e., "colours") of the variable "fact" 
within my dataset, each of which has dozens of observations (that is, there are 
about 50 observations with the "fact" value blue, 45 with red, 87 with magenta, 
etc.).

I would like to end up with a new data frame, which is a subset of my original 
data frame. The new (subsetted) data frame would have the following 
characteristics:

1) It would retain all of the observations for which "select"==1.
2) It would retain a random sample of the observations for which "select"==0, 
such that, within each value of "fact", one "select"==0 observation is randomly 
sampled for each observation whose "select" value==1.

Thus, in the above example, I would like to retain:
i) the first "blue" observation, and one additional randomly-selected "blue" 
observation for which select==0, 
ii) the 2nd and 3rd "red" observations, and two more randomly-selected "red" 
observations for which "select"==0, 
iii) none of the "green" observations, since none of these has a "select" value 
of 1.

So, the new data set would look something like this:

fact    x1      x2      x3      select
blue    23      2.2     1.1     1
blue    34      2.8     0.9     0
red     43      6.2     1.4     0       
red     33      5.2     1.5     1
red     35      4.2     1.6     1
red     28      4.4     1.4     0
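
One possible way to express the rule above in base R is sketched here. It is 
only a sketch under stated assumptions, not a definitive solution: the names 
dat, keep_rows and new_dat are hypothetical, the toy data contains only the 
rows shown in this message (including the extra "red" row from the desired 
output), and capping the draw when a level has fewer "select"==0 rows than 
"select"==1 rows is an assumption, since that case is not specified above.

```r
# Hypothetical toy data: only the rows shown in this message, plus the extra
# "red" row from the desired output; the real data has many more rows/levels.
dat <- data.frame(
  fact   = c("blue", "blue", "blue", "red", "red", "red", "red",
             "green", "green", "green", "green"),
  x1     = c(23, 28, 34, 43, 33, 35, 28, 22, 21, 33, 12),
  x2     = c(2.2, 4.2, 2.8, 6.2, 5.2, 4.2, 4.4, 3.5, 4.5, 6.5, 4.4),
  x3     = c(1.1, 0.8, 0.9, 1.4, 1.5, 1.6, 1.4, 1.1, 1.3, 1.7, 1.9),
  select = c(1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

set.seed(1)  # only so that the random draw is repeatable

# For the row indices of one "fact" level, return the indices to keep:
# every select==1 row, plus an equal number of randomly drawn select==0 rows.
keep_rows <- function(idx, select) {
  ones  <- idx[select[idx] == 1]
  zeros <- idx[select[idx] == 0]
  # Assumption: if there are fewer zeros than ones, take all available zeros.
  n <- min(length(ones), length(zeros))
  # Index via sample.int() to avoid the sample(x, n) surprise when x has length 1.
  c(ones, zeros[sample.int(length(zeros), n)])
}

# Split row numbers by level of "fact", choose rows per level, then recombine.
idx_by_fact <- split(seq_len(nrow(dat)), dat$fact)
keep <- sort(unlist(lapply(idx_by_fact, keep_rows, select = dat$select),
                    use.names = FALSE))
new_dat <- dat[keep, ]
```

With this toy data, new_dat should contain the one "blue" row with select==1 
plus one of the two other "blue" rows, all four "red" rows (there are only two 
"red" rows with select==0 to draw from), and no "green" rows.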

Thank you for your help,
Josip


Josip Dasovic
Research Associate
Human Security Report Project
School of International Studies
Suite 7200
Simon Fraser University
515 West Hastings Street
Vancouver, BC
CANADA
V6B 5K3

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
