On 2010-06-22 1:45, steven mosher wrote:
Hmm
DF<-data.frame(name=rep(1:5,each=2),x1=rep("A",10),x2=seq(10,19,by=1),x3=rep(NA,10),x4=seq(20,29,by=1))
DF$x3[5]<-50
mask<-apply(sample,2,"%in%", target)
This is getting confusing. What's 'sample'?
What's 'target'? Probably what you originally called 'targets'.
DF
name x1 x2 x3 x4
1 1 A 10 NA 20
2 1 A 11 NA 21
3 2 A 12 NA 22
4 2 A 13 NA 23
5 3 A 14 50 24
6 3 A 15 NA 25
7 4 A 16 NA 26
8 4 A 17 NA 27
9 5 A 18 NA 28
10 5 A 19 NA 29
mask
[,1] [,2] [,3] [,4] [,5]
[1,] FALSE FALSE FALSE FALSE FALSE
[2,] FALSE FALSE FALSE FALSE FALSE
[3,] TRUE TRUE FALSE TRUE FALSE
[4,] FALSE FALSE FALSE FALSE FALSE
[5,] TRUE FALSE FALSE FALSE FALSE
This suggests that 'sample' may be a matrix, not a data frame.
Anyway, try this on your original problem:
targets<-c(11,12,13,16,19,50,27,24,22,26)
mask<-apply(DF[,3:5],2, "%in%" ,targets)
is.na(DF[3:5]) <- !mask
-Peter Ehlers
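For reference, the three lines suggested above can be run end to end as the minimal sketch below. It assumes only the DF and targets defined in the original post; nothing else is taken from the session that produced the 5-column mask.

```r
# Rebuild the example data from the original post.
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# TRUE where a value in columns x2..x4 appears in 'targets'.
mask <- apply(DF[, 3:5], 2, "%in%", targets)

# Set every non-matching cell in those three columns to NA;
# 'name' and 'x1' are not touched.
is.na(DF[3:5]) <- !mask
print(DF)
```

Because the mask is built from DF[, 3:5] and applied to DF[3:5], the two always have the same shape, which sidesteps the mismatch between 'sample' and DF seen earlier.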
mask<-data.frame(a=TRUE,b=TRUE,!mask)
DF[mask]<-NA
Error in FUN(X[[1L]], ...) :
only defined on a data frame with all numeric variables
DF2<-data.frame(DF[,3:5])
mask<-apply(sample,2,"%in%", target)
mask<-data.frame(!mask)
DF2[mask]<-NA
Error in FUN(X[[1L]], ...) :
only defined on a data frame with all numeric variables
DF2
x2 x3 x4
1 10 NA 20
2 11 NA 21
3 12 NA 22
4 13 NA 23
5 14 50 24
6 15 NA 25
7 16 NA 26
8 17 NA 27
9 18 NA 28
10 19 NA 29
mask<-apply(DF2,2,"%in%", target)
mask<-data.frame(!mask)
DF2[mask]<-NA
Error in FUN(X[[1L]], ...) :
only defined on a data frame with all numeric variables
On Tue, Jun 22, 2010 at 12:23 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:
Hi
r-help-boun...@r-project.org wrote on 22.06.2010 08:28:04:
The following dataframe will illustrate the problem
DF<-data.frame(name=rep(1:5,each=2),x1=rep("A",10),x2=seq(10,19,by=1),x3=rep(NA,10),x4=seq(20,29,by=1))
DF$x3[5]<-50
# We have a data frame. We are interested in the columns x2, x3, x4, which
# contain sparse values and many NA.
DF
name x1 x2 x3 x4
1 1 A 10 NA 20
2 1 A 11 NA 21
3 2 A 12 NA 22
4 2 A 13 NA 23
5 3 A 14 50 24
6 3 A 15 NA 25
7 4 A 16 NA 26
8 4 A 17 NA 27
9 5 A 18 NA 28
10 5 A 19 NA 29
# We have a list of "target" values that we want to search for in the data frame.
# If a value in the data frame is in that list we want to keep it there;
# otherwise, replace it with NA.
targets<-c(11,12,13,16,19,50,27,24,22,26)
# So we apply a test by column to the last 3 columns using the "%in%" test.
# This gives us a mask of whether each element of the data frame is in the
# target list.
mask<-apply(DF[,3:5],2, "%in%" ,targets)
mask
x2 x3 x4
[1,] FALSE FALSE FALSE
[2,] TRUE FALSE FALSE
[3,] TRUE FALSE TRUE
[4,] TRUE FALSE FALSE
[5,] FALSE TRUE TRUE
[6,] FALSE FALSE FALSE
[7,] TRUE FALSE TRUE
[8,] FALSE FALSE TRUE
[9,] FALSE FALSE FALSE
[10,] TRUE FALSE FALSE
# And so DF[2,3] is equal to 11 and 11 is in the target list, so the mask is TRUE.
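As a side note on the column-wise test described above: apply() first converts DF[,3:5] to a matrix, whereas sapply() iterates over the data frame's columns directly. The sketch below runs both on the example data; the names mask_apply and mask_sapply are introduced here for illustration only. The NA entries in x3 come out FALSE because NA is not among the targets.

```r
# Example data and targets as defined in the original post.
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

# Column-wise membership test, two ways.
mask_apply  <- apply(DF[, 3:5], 2, "%in%", targets)   # via matrix coercion
mask_sapply <- sapply(DF[3:5], "%in%", targets)       # column by column

# Both are 10 x 3 logical matrices; check that they agree cell by cell.
print(dim(mask_sapply))
print(all(mask_apply == mask_sapply))
```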
# now something like DF<- ifelse(mask==T,DF,NA) is CONCEPTUALLY what I want to do
Data frames are quite clever in preserving their dimensions. I would do
mask=data.frame(a=TRUE, b=TRUE, !mask)
to add columns 1 and 2
and
DF[mask]<-NA
Regards
Petr
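A variant of the whole-data-frame mask idea, sketched under two assumptions that are not in the message above: the mask is kept as a logical matrix with the same dimensions as DF (rather than a data frame), and the two padding columns are FALSE so that 'name' and 'x1' are left alone. The name full_mask is introduced here for illustration.

```r
# Example data and targets as defined in the original post.
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

mask <- apply(DF[, 3:5], 2, "%in%", targets)

# Pad with two all-FALSE columns (assumption: never replace 'name' or 'x1'),
# then flag every non-matching cell of x2..x4 for replacement.
full_mask <- cbind(matrix(FALSE, nrow = nrow(DF), ncol = 2), !mask)

# A logical matrix of the same dimensions as DF can index it for replacement.
DF[full_mask] <- NA
print(DF)
```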
In the end I'd like a result that looks like
name x1 x2 x3 x4
1 1 A NA NA NA
2 1 A 11 NA NA
3 2 A 12 NA 22
4 2 A 13 NA NA
5 3 A NA 50 24
6 3 A NA NA NA
7 4 A 16 NA 26
8 4 A NA NA 27
9 5 A NA NA NA
10 5 A 19 NA NA
I've tried forcing the DF and the mask into vectors so that ifelse() would work,
and have tried "apply" using ifelse, without much luck. Any thoughts?
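One way the ifelse() idea might be made to work is to give it a matrix rather than the data frame itself, restricted to the three numeric columns. This is only a sketch under that assumption; the name m is introduced here for illustration.

```r
# Example data and targets as defined in the original post.
DF <- data.frame(name = rep(1:5, each = 2), x1 = rep("A", 10),
                 x2 = seq(10, 19, by = 1), x3 = rep(NA, 10),
                 x4 = seq(20, 29, by = 1))
DF$x3[5] <- 50
targets <- c(11, 12, 13, 16, 19, 50, 27, 24, 22, 26)

mask <- apply(DF[, 3:5], 2, "%in%", targets)

# ifelse() works element-wise on a matrix that has the same shape as the mask.
m <- as.matrix(DF[3:5])
DF[3:5] <- ifelse(mask, m, NA)   # keep matching values, NA elsewhere
print(DF)
```

Limiting the operation to columns 3 to 5 avoids mixing the character column x1 into the matrix, which would otherwise turn every value into a string.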
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.