Hi:

Here's one way with package plyr:

df<-data.frame(x=c(rep(1,3),rep(2,4),rep(3,5)),
               y=rnorm(12),
               z=c(3,4,5,NA,NA,NA,NA,1,2,1,2,1),
               w=c(1,2,3,3,4,3,5,NA,5,NA,7,8))

library(plyr)
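# Return a group's rows only if no response column has more than 50% NAs
fun <- function(d) {
   # proportion of NAs in each column except the first (the grouping variable x)
   u <- colMeans(is.na(d[, -1]))
   # keep the group if every proportion is at most 0.5; otherwise drop it
   if (all(u <= 0.5)) d else NULL
}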

ddply(df, 'x', fun)
> ddply(df, 'x', fun)
  x           y z  w
1 1 -1.22768415 3  1
2 1  0.03108696 4  2
3 1  0.90246871 5  3
4 3 -0.47387908 1 NA
5 3  1.59577665 2  5
6 3 -0.80792438 1 NA
7 3  0.20927614 2  7
8 3 -0.46172477 1  8
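
If you would rather not depend on plyr, the same filter can be sketched in base R. This is only a sketch under my own assumptions: the names resp, na.prop and keep are mine, set.seed(1) is added just so that y is reproducible, and, like fun above, it reads "does not exceed 50%" as "at most 0.5 in every response column".

```r
# Same example data as above, with a fixed seed (my addition) so y is reproducible
set.seed(1)
df <- data.frame(x = c(rep(1, 3), rep(2, 4), rep(3, 5)),
                 y = rnorm(12),
                 z = c(3, 4, 5, NA, NA, NA, NA, 1, 2, 1, 2, 1),
                 w = c(1, 2, 3, 3, 4, 3, 5, NA, 5, NA, 7, 8))

resp <- c("y", "z", "w")   # response columns to check (hypothetical helper name)

# Proportion of NAs per group and per response column:
# the mean of a logical vector is the share of TRUE values
na.prop <- aggregate(as.data.frame(is.na(df[resp])),
                     by = list(x = df$x), FUN = mean)

# Groups in which no response column exceeds 50% NAs
keep <- na.prop$x[apply(na.prop[resp] <= 0.5, 1, all)]

# Keep all rows belonging to those groups
df.sub <- df[df$x %in% keep, ]
```

For the example data this keeps groups 1 and 3; group 2 is dropped because z is entirely NA there. Printing na.prop also shows the per-group proportions, which is handy if you want to try a different threshold.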


On Mon, Feb 21, 2011 at 3:20 AM, D. Alain <dialva...@yahoo.de> wrote:

> Dear R-List,
>
> I have a dataframe with one grouping variable (x) and three response
> variables (y,z,w).
>
>
> df<-data.frame(x=c(rep(1,3),rep(2,4),rep(3,5)),y=rnorm(12),z=c(3,4,5,NA,NA,NA,NA,1,2,1,2,1),w=c(1,2,3,3,4,3,5,NA,5,NA,7,8))
>
> >df
>    x            y   z   w
>    1   0.29306106   3   1
>    1   0.54797780   4   2
>    1  -1.38365548   5   3
>    2  -0.20407986  NA   3
>    2  -0.87322574  NA   4
>    2  -1.23356250  NA   3
>    2   0.43929374  NA   5
>    3   1.16405483   1  NA
>    3   1.07083464   2   5
>    3  -0.67463191   1  NA
>    3  -0.66410552   2   7
>    3  -0.02543358   1   8
>
> Now I want to make a new dataframe df.sub comprising only the cases that
> belong to groups where the proportion of NAs does not exceed 50% in any
> of the response variables y, z, w.
>
> In the above example, e.g., this would be a dataframe with all cases of the
> groups 1 and 3 (since there are 100% NAs in z for group 2)
>
> >df.sub
>    x            y   z   w
>    1   0.29306106   3   1
>    1   0.54797780   4   2
>    1  -1.38365548   5   3
>    3   1.16405483   1  NA
>    3   1.07083464   2   5
>    3  -0.67463191   1  NA
>    3  -0.66410552   2   7
>    3  -0.02543358   1   8
>
> Please excuse me if the problem has already been treated somewhere, but so
> far I was not able to find the right thread for my question in RSeek.
>
> Can anyone help?
>
> Thanks in advance!
>
> D. Alain
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.