On 19/03/11 01:35, Joshua Wiley wrote:
Hi Rita,
This is far from the most efficient or elegant way, but:
## two column data frame, one all NAs
d<- data.frame(1:10, NA)
## use apply to create logical vector and subset d
d[, apply(d, 2, function(x) !all(is.na(x)))]
This works, but apply() first converts d to a matrix, which is unnecessary.
If performance is an issue, try
d[, sapply(d, function(x) !all(is.na(x)))]
instead (apply() is about 3x slower on your test data frame d, and the gap
grows for larger data frames).
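One caveat with both one-liners: when only a single column survives, d[, keep] returns a plain vector rather than a data frame. A minimal sketch of two ways around that, assuming a test data frame shaped like d above (the names keep, d2 and d3 are only for illustration, and vapply() is swapped in for sapply() so the result is always a logical vector):

```r
## Two-column data frame, second column entirely NA (same shape as d above)
d <- data.frame(x = 1:10, y = NA)

## TRUE for every column with at least one non-missing value
keep <- vapply(d, function(col) !all(is.na(col)), logical(1))

## drop = FALSE keeps the result a data frame even with one column left
d2 <- d[, keep, drop = FALSE]

## List-style indexing (no comma) also returns a data frame
d3 <- d[keep]
```

Either form should be safe to use when the number of surviving columns is not known in advance.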
For the related problem of removing columns that are constant or NA,
the best I could come up with is
zv.1 <- function(x) {
    ## The literal approach
    y <- var(x, na.rm = TRUE)
    return(is.na(y) || y == 0)
}
sapply(train, zv.1)
See
http://www.cybaea.net/Blogs/Data/R-Eliminating-observed-values-with-zero-variance.html
for the benchmarks.
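To show how the result of sapply(train, zv.1) can be used to drop the flagged columns, here is a small sketch. The data frame train below is made-up stand-in data (the original train is not shown in this thread), and it assumes all columns are numeric, since var() is not meant for character data:

```r
zv.1 <- function(x) {
  ## TRUE when the column is constant or has no usable values
  y <- var(x, na.rm = TRUE)
  is.na(y) || y == 0
}

## Made-up stand-in for `train`; numeric columns only
train <- data.frame(
  a = c(1, 2, 3, NA),  # varies
  b = c(5, 5, NA, 5),  # constant apart from the NA
  c = NA_real_         # all NA
)

## Logical vector, one element per column: TRUE means "remove"
constant <- sapply(train, zv.1)

## Keep only the columns that are not flagged
kept <- train[, !constant, drop = FALSE]
```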
Allan
I am just apply()ing the function !all(is.na(x)) to each column of d (that
is what the 2 means); it returns FALSE if all of x is missing and TRUE
otherwise. The result is a logical vector with one element per column of d,
which is used to keep only the columns of d with at least some non-missing
values. For documentation see:
?apply
?is.na
?all
?"["
?Logic
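For the case of many data frames, the same logical-vector subsetting can be wrapped in a function and run over a list with lapply(), so no explicit loop is needed. This is only a sketch: drop_empty_cols and frames are hypothetical names, the data are made up, and it assumes the blank variables were read in as NA (blanks imported as empty strings "" would not be caught by is.na()):

```r
## Hypothetical helper: keep columns with at least one non-missing value
drop_empty_cols <- function(df) {
  keep <- vapply(df, function(col) !all(is.na(col)), logical(1))
  df[, keep, drop = FALSE]
}

## Small made-up stand-in for the imported data frames
frames <- list(
  a = data.frame(P1 = 1:3, P2 = NA, Q1 = c(NA, 2, 3)),
  b = data.frame(P1 = NA, P2 = 4:6, Q1 = NA)
)

## Apply the helper to every data frame in the list
cleaned <- lapply(frames, drop_empty_cols)

## Inspect which columns survived in each data frame
lapply(cleaned, names)
```

Keeping the data frames in a list means the blank columns can differ from one data frame to the next without any special handling.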
HTH,
Josh
On Fri, Mar 18, 2011 at 3:35 PM, Rita Carreira <ritacarre...@hotmail.com> wrote:
Dear List Members,
I have 55 data frames, each with 272 variables and 267 observations. Some of
these variables are blanks, but the blanks are not the same for every data
frame. I would like to write a procedure in which I import a data frame, see
which variables are blank, and delete those variables. My data frames have
variables named P1 to P136 and Q1 to Q136.
I have a couple of questions regarding this issue:
1) Is a loop an efficient way to address this problem? If not, what are my
alternatives and how do I implement them?
2) I have been playing with a single data frame to try to figure out a way of
having R go through the columns and see which ones it should delete. I have
figured out how to delete rows with missing data
(newdata<- na.omit(olddata)) but how do I do it for columns???
Thank you very much for your help and have a great weekend!
Rita
"If you think education is expensive, try ignorance" --Derek Bok
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.