Thank you. It was very informative and helpful. It works.

Sent from my iPhone

On Aug 5, 2012, at 10:21 PM, arun <smartpink...@yahoo.com> wrote:

> Hi,
>
> Try this:
> dat1<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA))
> dat1[which(colMeans(is.na(dat1))<=.15)]
>          y
> 1       NA
> 2 13.53085
> 3 12.89453
> 4 15.02625
> 5 14.00387
> 6 15.34618
> 7 15.69293
> 8 15.62377
> 9 14.76479
>
> #You can also use apply, sapply etc.
> dat2<-data.frame(x=c(NA,NA,rnorm(6,15),NA),y=c(NA,rnorm(8,15)),z=c(rnorm(7,15),NA,NA),u=c(rnorm(9,15)))
> dat2[apply(dat2,2,function(x) mean(is.na(x))<=.15)]
>
> #dat2[sapply(dat2,function(x) mean(is.na(x))<=.15)]
> #dat2[which(colMeans(is.na(dat2))<=.15)]
>
>          y        u
> 1       NA 14.56278
> 2 16.49940 16.25761
> 3 14.11368 14.08768
> 4 14.95139 14.01923
> 5 14.99517 15.91936
> 6 14.46359 14.07573
> 7 15.09702 13.94888
> 8 15.99967 14.97171
> 9 15.51924 15.59981
>
> A.K.
>
> ----- Original Message -----
> From: Faz Jones <jonesf...@gmail.com>
> To: r-help@r-project.org
> Cc:
> Sent: Sunday, August 5, 2012 9:04 PM
> Subject: [R] deleting columns from a dataframe where NA is more than 15 percent of the column length
>
> I have a dataframe of 10 different columns (length of each column is
> the same). I want to eliminate any column in which more than 15% of
> the values are 'NA'. Do I first need to write a function that
> calculates the percentage of NA for each column and then make another
> dataframe where I apply the function? What's the best way to do this?
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
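
For anyone reading this thread later, the colMeans(is.na(...)) approach suggested above can be wrapped in a small reusable function. This is only a sketch: the name drop_na_columns, the max_na argument, and the seed are my own choices for illustration and do not come from the thread. It assumes the data frame has at least one row.

```r
# Hypothetical helper (name and arguments are not from the thread):
# keep only the columns whose share of NA values is at most `max_na`.
drop_na_columns <- function(df, max_na = 0.15) {
  na_share <- colMeans(is.na(df))          # fraction of NA values per column
  df[, na_share <= max_na, drop = FALSE]   # drop = FALSE always returns a data frame
}

set.seed(1)  # arbitrary seed so the example is reproducible
dat <- data.frame(
  x = c(NA, NA, rnorm(6, 15), NA),  # 3 of 9 values missing
  y = c(NA, rnorm(8, 15)),          # 1 of 9 values missing
  z = c(rnorm(7, 15), NA, NA),      # 2 of 9 values missing
  u = rnorm(9, 15)                  # no values missing
)

kept <- drop_na_columns(dat)
names(kept)  # "y" "u"
```

Using drop = FALSE means the result stays a data frame even when only one column passes the threshold, which is easy to trip over with plain df[, cols] indexing.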