You could build a custom recommender in Mahout to accomplish this. Also, just out of curiosity, why the content-based approach as opposed to building a recommender based on co-occurrence? One other thing: what is your data size? Are you looking at a scale where you need something like Hadoop?
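
To make the co-occurrence alternative concrete, here is a rough sketch in plain Python. It is not Mahout's API, only the underlying idea: items that show up together in the same users' view histories are treated as related. The histories and the function names (`cooccurrence_counts`, `recommend`) are made up for illustration.

```python
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(histories):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in histories:
        # De-duplicate so an item viewed twice by one user counts once.
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, item, n=3):
    """Return up to n items that co-occur most often with the given item."""
    neighbors = counts.get(item, {})
    # Highest count first; ties broken by item id for a stable order.
    ranked = sorted(neighbors.items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:n]]


# Hypothetical view histories, one list per user.
histories = [
    ["a", "b", "c"],
    ["a", "b"],
    ["b", "c"],
    ["a", "d"],
]
counts = cooccurrence_counts(histories)
print(recommend(counts, "a", n=2))
```

Note that this needs per-user view data, which a purely content-based approach does not.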
> From: lcguerreroc...@gmail.com
> Date: Fri, 28 Jun 2013 13:02:00 -0500
> Subject: Re: Content based recommender using lucene/solr
> To: solr-user@lucene.apache.org
> CC: java-u...@lucene.apache.org
>
> Hey saikat, thanks for your suggestion. I've looked into mahout and other
> alternatives for computing k nearest neighbors. I would have to run a job
> and compute the k nearest neighbors and track them in the index for
> retrieval. I wanted to see if this was something I could do with lucene
> using lucene's scoring function and solr's morelikethis component. The job
> you specifically mention is for Item based recommendation which would
> require me to track the different items users have viewed. I'm looking for
> a content based approach where I would use a distance measure to establish
> how near items are (how similar) and have some kind of training phase to
> adjust weights.
>
>
> On Fri, Jun 28, 2013 at 12:42 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>
> > Why not just use mahout to do this, there is an item similarity algorithm
> > in mahout that does exactly this :)
> >
> > https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html
> >
> > You can use mahout in distributed and non-distributed mode as well.
> >
> > > From: lcguerreroc...@gmail.com
> > > Date: Fri, 28 Jun 2013 12:16:57 -0500
> > > Subject: Content based recommender using lucene/solr
> > > To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
> > >
> > > Hi,
> > >
> > > I'm using lucene and solr right now in a production environment with an
> > > index of about a million docs. I'm working on a recommender that basically
> > > would list the n most similar items to the user based on the current item
> > > he is viewing.
> > >
> > > I've been thinking of using solr/lucene since I already have all docs
> > > available and I want a quick version that can be deployed while we work on
> > > a more robust recommender. How about overriding the default similarity so
> > > that it scores documents based on the euclidean distance of normalized item
> > > attributes and then using a morelikethis component to pass in the
> > > attributes of the item for which I want to generate recommendations? I know
> > > it has its issues like recomputing scores/normalization/weight application
> > > at query time which could make this idea unfeasible/impractical. I'm at a
> > > very preliminary stage right now with this and would love some suggestions
> > > from experienced users.
> > >
> > > thank you,
> > >
> > > Luis Guerrero
> >
>
> --
> Luis Carlos Guerrero Covo
> M.S. Computer Engineering
> (57) 3183542047
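
For reference, the distance idea from the original message (Euclidean distance over normalized item attributes, with per-attribute weights) can be sketched outside Lucene in a few lines of Python. This is only an illustration under assumptions: it does not use Lucene's similarity classes or Solr's MoreLikeThis, the items and attribute values are invented, min-max scaling is one possible normalization among several, and the weights are fixed by hand rather than learned in a training phase. It is a brute-force scan, so for an index of about a million docs the neighbors would likely need to be precomputed rather than scored at query time.

```python
import heapq
import math


def min_max_normalize(items):
    """Scale every attribute to [0, 1] so no single attribute dominates the distance."""
    dims = len(next(iter(items.values())))
    lows = [min(v[d] for v in items.values()) for d in range(dims)]
    highs = [max(v[d] for v in items.values()) for d in range(dims)]
    return {
        key: [
            (v[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        ]
        for key, v in items.items()
    }


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per attribute."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def most_similar(items, query_id, weights, n=2):
    """Return the ids of the n items closest to query_id (smallest distance first)."""
    normalized = min_max_normalize(items)
    query = normalized[query_id]
    candidates = (
        (weighted_distance(query, vec, weights), key)
        for key, vec in normalized.items()
        if key != query_id
    )
    return [key for _, key in heapq.nsmallest(n, candidates)]


# Hypothetical items with two numeric attributes each (say, price and rating).
items = {
    "p1": [10.0, 4.0],
    "p2": [12.0, 4.5],
    "p3": [50.0, 2.0],
    "p4": [11.0, 3.0],
}
print(most_similar(items, "p1", weights=[1.0, 1.0], n=2))
```

Changing the `weights` list is the hook where a training phase would plug in; setting a weight to zero drops that attribute from the comparison.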