try this: > # create some data > x <- data.frame(userid = paste('u', rep(1:20, each = 20), sep = '') + , track = rep(1:20, 20) + , freq = floor(runif(400, 10, 200)) + , stringsAsFactors = FALSE + ) > # get the quantiles for each track > tq <- tapply(x$freq, x$track, quantile, prob = c(.2, .4, .6, .8, 1)) > # create a matrix with the rownames as the tracks to use in the findInterval > tqm <- do.call(rbind, tq) > # now put the ratings > require(data.table) > x.dt <- data.table(x) > x.new <- x.dt[, + list(userid = userid + , freq = freq + , rating = findInterval(freq + # use track as index into quantile matrix + , tqm[as.character(track[1L]),] + , rightmost.closed = TRUE + ) + 1L + ) + , by = track] > > head(x.new) track userid freq rating [1,] 1 u1 10 1 [2,] 1 u2 15 1 [3,] 1 u3 126 4 [4,] 1 u4 117 3 [5,] 1 u5 76 2 [6,] 1 u6 103 3 >
On Sun, May 8, 2011 at 2:48 PM, gj <gaw...@gmail.com> wrote: > Hi, > > I have a mysql table with fields userid,track,frequency e.g > u1,1,10 > u1,2,100 > u1,3,110 > u1,4,200 > u1,5,120 > u1,6,130 > . > u2,1,23 > . > . > where "frequency" is the number of times a music track is played by a > "userid" > > I need to turn my 'frequency' table into a rating table (it's for a > recommender system). So, for each user, I need to categorise the frequency > of tracks played by quintile so that each particular track can have 5 > ratings (1-5), with the ratings allocated as follows: inter-quintile range > 100-80% = rating 5, inter-quintile range 80-60% = rating 4, > ..., inter-quintile range 20-0% = rating 1) > > Hence, I need to create a table with fields userid,track,rating: > u1,1,1 > u1,2, 3 > ... > > Can anybody help me to do this with R? > > Regards > Gawesh > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.