Re: Content based recommender using lucene/solr

Otis Gospodnetic Fri, 28 Jun 2013 11:13:44 -0700

Hi,

It doesn't have to be one or the other.  In the past I've built a news
recommender engine based on CF (Mahout) and combined it with a content
similarity-based engine (it wasn't Solr/Lucene but something custom that
worked with ngrams, though it might as well have been Lucene/Solr/ES).
It worked well.  If you haven't worked with Mahout before, I'd suggest
starting with the approach in that video and moving to Mahout only if
that approach proves limiting.
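
As a rough illustration of combining a CF engine with a content
similarity engine: the sketch below is not the system described above.
The weighted-sum blend, the function name blend_scores, the cf_weight
parameter and the min-max rescaling are all assumptions made for the
example, and item ids are assumed to be strings.

```python
def blend_scores(cf_scores, content_scores, cf_weight=0.5):
    """Blend two {item_id: score} dicts into one ranked list.

    Hypothetical helper: a plain weighted sum of rescaled scores.
    """
    def rescale(scores):
        # Min-max rescale to [0, 1] so the two engines are comparable.
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = hi - lo
        return {k: ((v - lo) / span if span else 1.0)
                for k, v in scores.items()}

    cf = rescale(cf_scores)
    content = rescale(content_scores)

    # An item missing from one engine contributes 0 for that engine.
    blended = {
        item: cf_weight * cf.get(item, 0.0)
        + (1.0 - cf_weight) * content.get(item, 0.0)
        for item in set(cf) | set(content)
    }
    # Highest blended score first; ties broken by item id.
    return sorted(blended.items(), key=lambda kv: (-kv[1], kv[0]))
```

Setting cf_weight to 0.0 falls back to the content engine alone, which
may be a reasonable starting point when no user history is tracked yet.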


See Ted's stuff on this topic, too:
http://www.slideshare.net/tdunning/search-as-recommendation +
http://berlinbuzzwords.de/sessions/multi-modal-recommendation-algorithms
(note: Mahout, Solr, Pig)

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/
Performance Monitoring -- http://sematext.com/spm



On Fri, Jun 28, 2013 at 2:07 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
> You could build a custom recommender in Mahout to accomplish this. Also, just
> out of curiosity, why the content-based approach as opposed to building a
> recommender based on co-occurrence? One other thing: what is your data size?
> Are you looking at a scale where you need something like Hadoop?
>
>> From: lcguerreroc...@gmail.com
>> Date: Fri, 28 Jun 2013 13:02:00 -0500
>> Subject: Re: Content based recommender using lucene/solr
>> To: solr-user@lucene.apache.org
>> CC: java-u...@lucene.apache.org
>>
>> Hey Saikat, thanks for your suggestion. I've looked into mahout and other
>> alternatives for computing k nearest neighbors. I would have to run a job
>> to compute the k nearest neighbors and track them in the index for
>> retrieval. I wanted to see if this was something I could do with lucene
>> using lucene's scoring function and solr's morelikethis component. The job
>> you specifically mention is for Item based recommendation which would
>> require me to track the different items users have viewed. I'm looking for
>> a content based approach where I would use a distance measure to establish
>> how near items are (how similar) and have some kind of training phase to
>> adjust weights.
>>
>>
>> On Fri, Jun 28, 2013 at 12:42 PM, Saikat Kanjilal <sxk1...@hotmail.com> wrote:
>>
>> > Why not just use Mahout to do this? There is an item similarity algorithm
>> > in Mahout that does exactly this :)
>> >
>> >
>> > https://builds.apache.org/job/Mahout-Quality/javadoc/org/apache/mahout/cf/taste/hadoop/similarity/item/ItemSimilarityJob.html
>> >
>> > You can use mahout in distributed and non-distributed mode as well.
>> >
>> > > From: lcguerreroc...@gmail.com
>> > > Date: Fri, 28 Jun 2013 12:16:57 -0500
>> > > Subject: Content based recommender using lucene/solr
>> > > To: solr-user@lucene.apache.org; java-u...@lucene.apache.org
>> > >
>> > > Hi,
>> > >
>> > > I'm using lucene and solr right now in a production environment with an
>> > > index of about a million docs. I'm working on a recommender that basically
>> > > would list the n most similar items to the user based on the current item
>> > > he is viewing.
>> > >
>> > > I've been thinking of using solr/lucene since I already have all docs
>> > > available and I want a quick version that can be deployed while we work on
>> > > a more robust recommender. How about overriding the default similarity so
>> > > that it scores documents based on the euclidean distance of normalized item
>> > > attributes and then using a morelikethis component to pass in the
>> > > attributes of the item for which I want to generate recommendations? I know
>> > > it has its issues like recomputing scores/normalization/weight application
>> > > at query time which could make this idea unfeasible/impractical. I'm at a
>> > > very preliminary stage right now with this and would love some suggestions
>> > > from experienced users.
>> > >
>> > > thank you,
>> > >
>> > > Luis Guerrero
>> >
>> >
>>
>>
>>
>> --
>> Luis Carlos Guerrero Covo
>> M.S. Computer Engineering
>> (57) 3183542047
>
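
The content-based idea from the original question in this thread
(Euclidean distance over normalized item attributes, with per-attribute
weights) can be sketched roughly as follows. This is a brute-force
illustration in plain Python, not a Lucene Similarity or a Solr
MoreLikeThis configuration. The function names, the min-max
normalization and the list-of-floats item representation are
assumptions made for the example.

```python
import heapq
import math


def min_max_normalize(vectors):
    """Scale every attribute to [0, 1] across all items."""
    dims = len(next(iter(vectors.values())))
    lows = [min(v[d] for v in vectors.values()) for d in range(dims)]
    highs = [max(v[d] for v in vectors.values()) for d in range(dims)]

    def scale(x, lo, hi):
        # A constant attribute carries no information; map it to 0.
        return (x - lo) / (hi - lo) if hi > lo else 0.0

    return {
        item: [scale(v[d], lows[d], highs[d]) for d in range(dims)]
        for item, v in vectors.items()
    }


def weighted_distance(a, b, weights):
    """Euclidean distance with one non-negative weight per attribute."""
    return math.sqrt(sum(w * (x - y) ** 2
                         for x, y, w in zip(a, b, weights)))


def most_similar(item_id, vectors, weights, n=5):
    """Return the ids of the n items nearest to item_id."""
    normalized = min_max_normalize(vectors)
    target = normalized[item_id]
    candidates = (
        (weighted_distance(target, vec, weights), other)
        for other, vec in normalized.items()
        if other != item_id
    )
    return [other for _, other in heapq.nsmallest(n, candidates)]
```

Each query scans every item, so for an index of about a million docs
the neighbors would more likely be computed in an offline job and
stored for retrieval, as discussed above. The weights are the values a
training phase would adjust.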
