Here's how I'm trying to solve the diversity problem inherent in the data
(see below for a definition of the problem):
if (interquintile ranges have >= 4 ranges at the same freq) then
    (use rating = 3)
else
    (use rating as described in Jim's code)
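One possible reading of the rule above, as a minimal sketch: treat a quintile range as degenerate when its lower and upper break points are the same freq, and fall back to a rating of 3 for every track when at least four of the five ranges are degenerate. The function name rate_user, the findInterval() scoring and the 0.001 offset are assumptions made for illustration; Jim's actual code may differ.

```r
# Hypothetical helper: rate one user's play counts on a 1-5 scale.
rate_user <- function(freq) {
  # Break points at 0%, 20%, ..., 100% of this user's play counts.
  breaks <- quantile(freq, probs = c(0, 0.2, 0.4, 0.6, 0.8, 1), names = FALSE)
  # A range is degenerate when both of its ends are the same freq.
  n_degenerate <- sum(diff(breaks) == 0)
  if (n_degenerate >= 4) {
    # Too little diversity to rank tracks: give every track the middle rating.
    return(rep(3L, length(freq)))
  }
  # Otherwise score against the four inner break points, nudged up slightly
  # so that a freq sitting exactly on a break falls into the lower bin.
  findInterval(freq, breaks[2:5] + 0.001) + 1L
}

rate_user(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 50))  # low diversity: every track gets 3
rate_user(c(1, 10, 1, 1, 15, 4))             # enough diversity: quantile-based ratings
```

The fallback only triggers when a user's play counts are almost all identical, so users with a reasonable spread are rated exactly as before.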
I'll have a go and post an update. In the meantime, if
Jim's suggestion did the trick:
tqm <- do.call(rbind, tq) + 0.001
head(x.new)
     userid freq track rating
[1,]      1    1     1      1
[2,]      1   10     2      5
[3,]      1    1     3      1
[4,]      1    1     4      1
[5,]      1   15     5      5
[6,]      1    4     6      3
Dennis, w
An easy way is to just offset the quantiles by a small increment so
that the boundary condition is less likely. If you change the line to
tqm <- do.call(rbind, tq) + 0.001
in my example, that should do the trick.
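To see why the offset helps, here is a minimal sketch using the 23 freq values for userid 1 that are visible in this thread. Scoring with findInterval() is an assumption made for illustration, not necessarily what the original example does, and the variable names are hypothetical.

```r
# The 23 freq values for userid 1 that are visible in the thread (tracks 1-23).
freq <- c(1, 10, 1, 1, 15, 4, 16, 6, 1, 1, 2, 2, 1, 6, 7, 13, 3, 2, 5, 2, 2, 6, 4)

# Inner quintile break points (20%, 40%, 60%, 80%).
breaks <- quantile(freq, probs = c(0.2, 0.4, 0.6, 0.8), names = FALSE)

# Without an offset, a freq equal to the lowest break lands in bin 2,
# so on this data no track receives rating 1.
rating_plain <- findInterval(freq, breaks) + 1L

# With a small offset, a freq sitting on a break stays in the lower bin.
rating_offset <- findInterval(freq, breaks + 0.001) + 1L

table(rating_plain)
table(rating_offset)
```

On these rows the offset version rates the first six tracks 1, 5, 1, 1, 5, 3. The offset needs to be smaller than the smallest gap between distinct freq values; since play counts are whole numbers, 0.001 is comfortably small.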
On Sat, May 14, 2011 at 6:09 PM, gj wrote:
Hi,
I think I haven't been able to explain correctly what I want. Here is another
try:
Given that I have the following input:
userid,track,freq
1,1,1
1,2,10
1,3,1
1,4,1
1,5,15
1,6,4
1,7,16
1,8,6
1,9,1
1,10,1
1,11,2
1,12,2
1,13,1
1,14,6
1,15,7
1,16,13
1,17,3
1,18,2
1,19,5
1,20,2
1,21,2
1,22,6
1,23,4
1
Hi:
Is this what you're after?
tq <- with(ds, quantile(freq, seq(0.2, 1, by = 0.2)))
ds$int <- with(ds, cut(freq, c(0, tq)))
with(ds, table(int))
int
 (0,1]  (1,2]  (2,4]  (4,7] (7,16]
    10      6      7      6      6
HTH,
Dennis
On Sat, May 14, 2011 at 9:42 AM, gj wrote:
Hi Jim,
Thanks very much for the code. I modified it a bit because I needed to
allocate the track ratings by userid (e.g. if user 1 plays track x once, it
gets rating 1; if user 1 plays track y 100 times, it gets rating 5) and not
by track (sorry if this wasn't clear in my original post).
This is alm
One way to get the ratings would be to use the ave() function:
rating = ave(x$freq, x$track,
             FUN = function(x) cut(x, quantile(x, (0:5)/5), include.lowest = TRUE))
- Phil Spector
Statistical Computing Facility
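A minimal, self-contained sketch of the ave() approach, grouping by userid instead of track since the ratings are wanted per user. The data frame is made up for illustration, and it deliberately has no tied play counts: with many ties the quantiles can coincide, and cut() then stops with a "'breaks' are not unique" error.

```r
# Made-up play counts: 2 users x 10 tracks, no ties within a user.
x <- data.frame(userid = rep(c("u1", "u2"), each = 10),
                track  = rep(1:10, times = 2),
                freq   = c(seq(10, 100, by = 10), seq(5, 50, by = 5)),
                stringsAsFactors = FALSE)

# Bin each user's play counts into quintiles; the bin number is the rating.
x$rating <- ave(x$freq, x$userid,
                FUN = function(f) as.integer(
                  cut(f, quantile(f, (0:5) / 5), include.lowest = TRUE)))

head(x)
```

Because the quantiles are computed within each user, u2's lower play counts receive the same spread of ratings as u1's.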
try this:
> # create some data
> x <- data.frame(userid = paste('u', rep(1:20, each = 20), sep = '')
+ , track = rep(1:20, 20)
+ , freq = floor(runif(400, 10, 200))
+ , stringsAsFactors = FALSE
+ )
> # get the quantiles for each track
> tq <-
Hi,
I have a MySQL table with fields userid,track,frequency, e.g.
u1,1,10
u1,2,100
u1,3,110
u1,4,200
u1,5,120
u1,6,130
.
u2,1,23
.
.
where "frequency" is the number of times a music track is played by a
"userid"
I need to turn my 'frequency' table into a rating table (it's for a
recommender system)