Hello,

Inline.

On 07-08-2012 19:56, Abraham Mathew wrote:
I have a data frame with a column of values that I want to bucket (group)
into specific levels.

str(dat)
'data.frame':   3678 obs. of  39 variables:
  $ id                          : int  23 76 129 156 166 180 200 214 296 344 ...
  $ final_purchase_amount       : Factor w/ 32 levels "\\N","1082","1109",..: 1 1 1 1 1 1 1 1 1 1 ...


So I ran the following to produce new levels, one for values from 100
to 400, 401 to 1000, and 1001+.


dat$final_purchase_amount <- NA
dat$final_purchase_amount[dat$final_purchase_amount %in%
    levels(dat$final_purchase_amount)[c(8,9,11,12,13,15,16,17,18,19,20,21)]] <- "100 to 400"
dat$final_purchase_amount[dat$final_purchase_amount %in%
    levels(dat$final_purchase_amount)[c(22,23,24,25,26,27,28,29,30,31,32)]] <- "401 to 1000"
dat$final_purchase_amount[dat$final_purchase_amount %in%
    levels(dat$final_purchase_amount)[c(2,3,4,5,6,7,10,14)]] <- "1001 +"
dat$final_purchase_amount <- factor(dat$final_purchase_amount)
levels(dat$final_purchase_amount)
table(dat$final_purchase_amount)



However, this doesn't seem to produce any levels

Fortunately not! Your first instruction above sets the entire column to NA. After that the column is no longer a factor, so levels(dat$final_purchase_amount) returns NULL, and each later line searches a vector of NAs for values %in% the levels numbered c(8,9, ...etc...) or c(22,23, ...etc...) without ever finding a match. That first line makes everything else you do with dat$final_purchase_amount useless. (I believe that line should be deleted.)
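
One possible way to get the three groups, sketched on made-up data since the original dat is not available: convert the factor labels to numbers and bucket them with cut(). The toy values, the helper vector amount and the new column amount_group are my own assumptions, not part of the original data. I also assume that "\\N" marks a missing amount. Note as well that assigning a new label such as "100 to 400" into an existing factor gives NA with a warning, because it is not one of the factor's levels, so it is safer to build a new column.

```r
# Toy stand-in for the poster's column: a factor whose labels are amounts,
# with "\\N" marking missing values (all values invented for illustration).
dat <- data.frame(
  final_purchase_amount = factor(c("\\N", "150", "400", "650",
                                   "1082", "1109", "\\N"))
)

# Convert the factor labels (not the integer codes) to numbers.
# "\\N" cannot be parsed and becomes NA, hence suppressWarnings().
amount <- suppressWarnings(
  as.numeric(as.character(dat$final_purchase_amount))
)

# Bucket with cut(); the intervals are (99, 400], (400, 1000], (1000, Inf].
dat$amount_group <- cut(amount,
                        breaks = c(99, 400, 1000, Inf),
                        labels = c("100 to 400", "401 to 1000", "1001 +"))

levels(dat$amount_group)
table(dat$amount_group, useNA = "ifany")
```

Working from the numeric values avoids depending on level positions such as c(8,9,11, ...), which change whenever the set of distinct amounts changes.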

Hope this helps,

Rui Barradas

  and returns the following.


levels(dat$final_purchase_amount)
character(0)


Can anyone point out what I'm doing wrong?



Thanks!



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.