On Nov 9, 2012, at 5:10 AM, Tagmarie wrote:

> I have a data frame somewhat like this:
>
> myframe <- data.frame (ID=c(2,3,4,5), Hunger =c(415,452,550,318 ))
> myframe
>
> Now I would like to add a column to the right which summarizes the values
> for Hunger somewhat to reduce the number of values: If the values for Hunger
> are between
> 300-400 I would like to insert the number 350,
> between
> 400-500 insert 450
> between
> 500-600 insert 550
>
> Does anyone know how? Cause I don't and my brain already hurts. Can't be
> that difficult, right?

> myframe$grpH <- c(350, 450, 550)[ findInterval(myframe$Hunger, c(300, 400, 500, 600) ) ]
> myframe
  ID Hunger grpH
1  2    415  450
2  3    452  450
3  4    550  550
4  5    318  350

Please note that your specification had overlapping intervals and that 'findInterval' by default uses closed intervals on the left and open intervals on the right. (This is the opposite of the default behavior of cut().) So I suppose you could say R's implementations are just as ambiguous as your problem specification.

> myframe$grpHc <- cut(myframe$Hunger, breaks=c(300, 400, 500, 600), labels=c("350", "450", "550") )
> myframe
  ID Hunger grpH grpHc
1  2    415  450   450
2  3    452  450   450
3  4    550  550   550
4  5    318  350   350

Note also that cut returns a factor:

> lapply(myframe, class)
$ID
[1] "numeric"

$Hunger
[1] "numeric"

$grpH
[1] "numeric"

$grpHc
[1] "factor"

--
David Winsemius, MD
Alameda, CA, USA
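
To make the boundary point concrete, here is a minimal sketch of how the two functions treat values that fall exactly on a break. The test vector x and the names mids and brks are made up for illustration and are not part of the original post. It also shows one way to turn the factor from cut() back into numbers.

```r
# Illustrative values only: 300, 400 and 500 sit exactly on the breaks.
x    <- c(300, 400, 415, 500)
mids <- c(350, 450, 550)
brks <- c(300, 400, 500, 600)

# findInterval() default: intervals closed on the left, [a, b)
by_findInterval <- mids[findInterval(x, brks)]

# cut() default (right = TRUE): intervals closed on the right, (a, b],
# so 300 falls outside the first interval and becomes NA
by_cut <- cut(x, breaks = brks, labels = as.character(mids))

# right = FALSE makes cut() use [a, b), matching findInterval()
by_cut_left <- cut(x, breaks = brks, labels = as.character(mids), right = FALSE)

# Convert via the labels; as.numeric() on the factor alone would
# return the integer level codes (1, 2, 3) instead of 350, 450, 550
by_cut_num <- as.numeric(as.character(by_cut_left))

print(data.frame(x, by_findInterval, by_cut, by_cut_num))
```

Which convention is right depends on where you want a value such as 400 to land, which the original specification leaves open.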