Thanks Peter and David.

Peter, that is exactly what I was looking for. Sadly, I have even used is.na() in the past but forgot about it.

David, thanks for the tip. I am embarrassed that I exposed the fact that I use a lot of loops. findInterval() seems very handy.
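For the archive, here is a minimal sketch of the is.na() check as I understand it. The sample values are made up for illustration, and only the first three classes from my loop are shown.

```r
# Hypothetical stand-in for the demo data frame (values invented for illustration)
demo <- data.frame(Area = c(5, NA, 30, 12, NA))

# Any comparison with NA yields NA, so == cannot be used to find missing values
cmp <- demo$Area == NA

# is.na() returns a proper TRUE/FALSE for every element
missing <- is.na(demo$Area)

# Skipping NA rows keeps if() from seeing a missing value
Class <- rep(NA_character_, nrow(demo))
for (i in seq_len(nrow(demo))) {
  if (is.na(demo$Area[i])) next                                  # leave Class[i] as NA
  if (demo$Area[i] > 0   && demo$Area[i] < 10) Class[i] <- "S01"
  if (demo$Area[i] >= 10 && demo$Area[i] < 25) Class[i] <- "S02"
  if (demo$Area[i] >= 25 && demo$Area[i] < 50) Class[i] <- "S03"
}
```

Rows with a missing Area keep an NA class, so the data frame stays intact and no subset is needed.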
Wade

On Wed, Dec 8, 2010 at 4:19 PM, David Winsemius <dwinsem...@comcast.net> wrote:

> On Dec 8, 2010, at 3:10 PM, Wade Wall wrote:
>
>> Hi all,
>>
>> How can one evaluate NAs in a numeric dataframe column? For example, I have
>> a dataframe (demo) with a column of numbers and several NAs. If I write
>> demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
>> "NA", "NA" is returned. But if I write demo.df == "NA", it returns as "NA"
>> also. I know that I can remove NAs, but would like to keep the dataframe as
>> is without creating a subset. I basically want to add a line that evaluates
>> the NA in the demo dataframe.
>
> That looks really, really painful. Why not use the function findInterval
> and then do a lookup in a character vector. Then you can throw away that
> loopy construct completely.
>
> > demo <- data.frame(Area = runif(10, 0, 100))
> > demo$catarea <- findInterval(demo$Area, c(0,25,50,75,100))
> > demo
>         Area catarea
> 1  71.440401       3
> 2   8.438097       1
> 3  45.492178       2
> 4  50.669996       3
> 5  15.444114       1
> 6  33.954948       2
> 7  19.738747       1
> 8  56.485654       3
> 9  29.218921       2
> 10 74.204611       3
> > demo$catname <- c("S01","S02", "S03","S04")[demo$catarea]
> > demo
>         Area catarea catname
> 1  71.440401       3     S03
> 2   8.438097       1     S01
> 3  45.492178       2     S02
> 4  50.669996       3     S03
> 5  15.444114       1     S01
> 6  33.954948       2     S02
> 7  19.738747       1     S01
> 8  56.485654       3     S03
> 9  29.218921       2     S02
> 10 74.204611       3     S03
>
> --
> David.
>
>> As an example, I want to assign rows to classes based on values in
>> demo$Area. Some of the values in demo$Area are "NA"
>>
>> for (i in 1:nrow(demo)) {
>>   if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i] <- "S01"}        ## 1-10 cm2
>>   if (demo$Area[i] >= 10 && demo$Area[i] < 25) {Class[i] <- "S02"}      ## 10-25 cm2
>>   if (demo$Area[i] >= 25 && demo$Area[i] < 50) {Class[i] <- "S03"}      ## 25-50 cm2
>>   if (demo$Area[i] >= 50 && demo$Area[i] < 100) {Class[i] <- "S04"}     ## 50-100 cm2
>>   if (demo$Area[i] >= 100 && demo$Area[i] < 200) {Class[i] <- "S05"}    ## 100-200 cm2
>>   if (demo$Area[i] >= 200 && demo$Area[i] < 400) {Class[i] <- "S06"}    ## 200-400 cm2
>>   if (demo$Area[i] >= 400 && demo$Area[i] < 800) {Class[i] <- "S07"}    ## 400-800 cm2
>>   if (demo$Area[i] >= 800 && demo$Area[i] < 1600) {Class[i] <- "S08"}   ## 800-1600 cm2
>>   if (demo$Area[i] >= 1600 && demo$Area[i] < 3200) {Class[i] <- "S09"}  ## 1600-3200 cm2
>>   if (demo$Area[i] >= 3200) {Class[i] <- "S10"}                         ## >3200 cm2
>> }
>>
>> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
>> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"
>>
>> Thanks for any help
>>
>> Wade
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
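The findInterval() suggestion above can be extended to the ten classes in the original loop. What follows is a minimal sketch, not a drop-in replacement: the sample values are invented, the `breaks` and `labels` names are illustrative, and it assumes Area is never negative (findInterval() returns 0 below the first break, and a 0 index silently drops the element).

```r
# Hypothetical sample values, including two NAs
demo <- data.frame(Area = c(5, 12, NA, 30, 75, 150, 3500, NA))

# Lower bounds of the ten classes used in the original loop
breaks <- c(0, 10, 25, 50, 100, 200, 400, 800, 1600, 3200)
labels <- sprintf("S%02d", 1:10)   # "S01", "S02", ..., "S10"

# findInterval() returns NA for NA input, and indexing a vector by NA
# gives NA, so missing areas simply end up with a missing class
idx <- findInterval(demo$Area, breaks)
demo$Class <- labels[idx]
```

One difference from the loop is worth noting: an Area of exactly 0 is assigned "S01" here, whereas the loop's `> 0` test would leave it unclassified.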