Thanks Peter and David,

Peter, that is exactly what I was looking for. Sadly, I have even used is.na()
in the past but forgot about it. David, thanks for the tip. I am
embarrassed to have exposed how many loops I use.
findInterval() seems very handy.
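
For anyone finding this thread in the archive, here is a minimal sketch of the NA behaviour discussed in it. The vector `area` and the labels "small" and "large" are made up for illustration; they are not from my real data.

```r
# Hypothetical data: a numeric vector with one missing value
area <- c(5, 12, NA, 250)

area >= 10      # comparing NA with a number yields NA, not TRUE/FALSE
area == "NA"    # still NA in the third position: NA is not the string "NA"
is.na(area)     # the test that does work on missing values

# Guarding if() with is.na() avoids
# "missing value where TRUE/FALSE needed"
cls <- character(length(area))
for (i in seq_along(area)) {
  if (is.na(area[i])) {
    cls[i] <- NA            # keep the row, leave its class missing
  } else if (area[i] >= 10) {
    cls[i] <- "large"
  } else {
    cls[i] <- "small"
  }
}
cls
```

The loop is kept only to show where the is.na() check has to go; the rows with NA stay in place and no subset is created.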

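As far as I can tell, David's findInterval() idea carries over to my ten classes roughly as follows. This is only a sketch: the break points are copied from my loop quoted below, while the sample areas and the `breaks`, `labels` and `idx` names are invented for the example.

```r
# Class boundaries taken from the quoted loop; the sample areas are invented
breaks <- c(0, 10, 25, 50, 100, 200, 400, 800, 1600, 3200)
labels <- sprintf("S%02d", 1:10)          # "S01" ... "S10"

demo <- data.frame(Area = c(3.2, 10, 49.9, NA, 150, 5000))

idx <- findInterval(demo$Area, breaks)    # an NA in Area gives an NA in idx
demo$Class <- labels[idx]                 # indexing by NA gives NA
demo
```

Two caveats: an Area of exactly 0 lands in "S01" here, whereas my loop left it unclassified, and a negative Area would give index 0, which drops the element from the lookup, so this assumes non-negative areas.
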
Wade

On Wed, Dec 8, 2010 at 4:19 PM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Dec 8, 2010, at 3:10 PM, Wade Wall wrote:
>
>> Hi all,
>>
>> How can one evaluate NAs in a numeric dataframe column? For example, I have
>> a dataframe (demo) with a column of numbers and several NAs. If I write
>> demo.df >= 10, numerals will return TRUE or FALSE, but if the value is
>> "NA", "NA" is returned. But if I write demo.df == "NA", it returns "NA"
>> also. I know that I can remove NAs, but would like to keep the dataframe as
>> is without creating a subset. I basically want to add a line that evaluates
>> the NA in the demo dataframe.
>>
>
> That looks really, really painful. Why not use the function findInterval
> and then do a lookup in a character vector. Then you can throw away that
> loopy construct completely.
>
> > demo  <- data.frame(Area = runif(10, 0, 100))
> > demo$catarea <- findInterval(demo$Area, c(0,25,50,75,100))
> > demo
>        Area catarea
> 1  71.440401       3
> 2   8.438097       1
> 3  45.492178       2
> 4  50.669996       3
> 5  15.444114       1
> 6  33.954948       2
> 7  19.738747       1
> 8  56.485654       3
> 9  29.218921       2
> 10 74.204611       3
> > demo$catname <- c("S01","S02", "S03","S04")[demo$catarea]
> > demo
>        Area catarea catname
> 1  71.440401       3     S03
> 2   8.438097       1     S01
> 3  45.492178       2     S02
> 4  50.669996       3     S03
> 5  15.444114       1     S01
> 6  33.954948       2     S02
> 7  19.738747       1     S01
> 8  56.485654       3     S03
> 9  29.218921       2     S02
> 10 74.204611       3     S03
>
> --
> David.
>
>>
>> As an example, I want to assign rows to classes based on values in
>> demo$Area. Some of the values in demo$Area are "NA"
>>
>> for (i in 1:nrow(demo)) {
>>  if (demo$Area[i] > 0 && demo$Area[i] < 10) {Class[i] <- "S01"} ## 1-10 cm2
>>  if (demo$Area[i] >= 10 && demo$Area[i] < 25) {Class[i] <- "S02"} ## 10-25 cm2
>>  if (demo$Area[i] >= 25 && demo$Area[i] < 50) {Class[i] <- "S03"} ## 25-50 cm2
>>  if (demo$Area[i] >= 50 && demo$Area[i] < 100) {Class[i] <- "S04"} ## 50-100 cm2
>>  if (demo$Area[i] >= 100 && demo$Area[i] < 200) {Class[i] <- "S05"} ## 100-200 cm2
>>  if (demo$Area[i] >= 200 && demo$Area[i] < 400) {Class[i] <- "S06"} ## 200-400 cm2
>>  if (demo$Area[i] >= 400 && demo$Area[i] < 800) {Class[i] <- "S07"} ## 400-800 cm2
>>  if (demo$Area[i] >= 800 && demo$Area[i] < 1600) {Class[i] <- "S08"} ## 800-1600 cm2
>>  if (demo$Area[i] >= 1600 && demo$Area[i] < 3200) {Class[i] <- "S09"} ## 1600-3200 cm2
>>  if (demo$Area[i] >= 3200) {Class[i] <- "S10"} ## >3200 cm2
>>  }
>>
>> What happens is that I get the message "Error in if (demo$Area[i] > 0 &&
>> demo$Area[i] < 10) { : missing value where TRUE/FALSE needed"
>>
>> Thanks for any help
>>
>> Wade
>>
>>        [[alternative HTML version deleted]]
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> David Winsemius, MD
> West Hartford, CT
>
>
