Re: [R] How do I delete multiple blank variables from a data frame?

Rita Carreira Mon, 21 Mar 2011 13:03:04 -0700

Allan and Josh,
Thanks for the help. I did  

d[, sapply(d, function(x) !all(is.na(x)))]


and it worked great.
Thanks so much again!
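
For anyone applying this to many data frames at once, here is a minimal
sketch, assuming the blank variables were imported as NA. The helper name
drop_all_na_cols and the example data are made up for illustration and do
not come from the thread. It uses vapply() so the column test always yields
a logical vector, and drop = FALSE because d[, keep] on its own returns a
plain vector when only one column survives.

```r
## Hypothetical helper (not from the thread): drop columns that are all NA.
drop_all_na_cols <- function(df) {
  ## vapply() returns a logical vector even for a zero-column data frame
  keep <- vapply(df, function(x) !all(is.na(x)), logical(1))
  ## drop = FALSE keeps the result a data frame when one column remains
  df[, keep, drop = FALSE]
}

## Made-up data standing in for the imported data frames
frames <- list(
  a = data.frame(P1 = 1:3, Q1 = NA),
  b = data.frame(P1 = NA, Q1 = c(2.5, NA, 4), Q2 = NA)
)

## Apply the helper to every data frame in the list; no explicit loop needed
cleaned <- lapply(frames, drop_all_na_cols)
str(cleaned)
```

Keeping the data frames in a list means the same call covers all of them,
and each element of cleaned is still a data frame.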


Rita
________________________________________

"If you think education is expensive, try ignorance"--Derek Bok




> Date: Sat, 19 Mar 2011 08:36:43 +0000
> From: all...@cybaea.com
> To: jwiley.ps...@gmail.com
> CC: ritacarre...@hotmail.com; r-help@r-project.org
> Subject: Re: [R] How do I delete multiple blank variables from a data frame?
> 
> 
> 
> On 19/03/11 01:35, Joshua Wiley wrote:
> > Hi Rita,
> >
> > This is far from the most efficient or elegant way, but:
> >
> > ## two column data frame, one all NAs
> > d <- data.frame(1:10, NA)
> > ## use apply to create logical vector and subset d
> > d[, apply(d, 2, function(x) !all(is.na(x)))]
> 
> This works, but apply() converts d to a matrix, which is not needed, so try
> 
> d[, sapply(d, function(x) !all(is.na(x)))]
> 
> 
> if performance is an issue (apply is about 3x slower on your test data 
> frame d, more for larger data frames).
> 
> For the related problem of removing columns of constant-or-NA values,
> the best I could come up with is
> 
> zv.1 <- function(x) {
>      ## The literal approach
>      y <- var(x, na.rm = TRUE)
>      return(is.na(y) || y == 0)
> }
> sapply(train, zv.1)
> 
> See
> http://www.cybaea.net/Blogs/Data/R-Eliminating-observed-values-with-zero-variance.html
> for the benchmarks.
> 
> Allan
> 
> 
> > I am just apply()ing to each column (the 2) of d, the function
> > !all(is.na(x)) which will return FALSE if all of x is missing and TRUE
> > otherwise.  The result is a logical vector the same length as the
> > number of columns in d that is used to subset only the d columns with
> > at least some non-missing values.  For documentation see:
> >
> > ?apply
> > ?is.na
> > ?all
> > ?"["
> > ?Logic
> >
> > HTH,
> >
> > Josh
> >
> > On Fri, Mar 18, 2011 at 3:35 PM, Rita Carreira <ritacarre...@hotmail.com> wrote:
> >> Dear List Members,
> >> I have 55 data frames, each with 272 variables and 267 observations.
> >> Some of these variables are blank, but the blank variables are not the
> >> same in every data frame. I would like to write a procedure in which I
> >> import a data frame, see which variables are blank, and delete those
> >> variables. My data frames have variables named P1 to P136 and Q1 to
> >> Q136.
> >> I have a couple of questions regarding this issue:
> >> 1) Is a loop an efficient way to address this problem? If not, what are
> >> my alternatives and how do I implement them?
> >> 2) I have been playing with a single data frame to try to figure out a
> >> way of having R go through the columns and see which ones it should
> >> delete. I have figured out how to delete rows with missing data
> >> (newdata <- na.omit(olddata)), but how do I do it for columns?
> >> Rita
> >> ________________________________________
> >> "If you think education is expensive, try ignorance"--Derek Bok
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> >
                                          