You may be overthinking this. There is a "More Like This" feature that basically does 
this. Give that a try before digging deeper into the LTR methods. It may be 
good enough for rock and roll.
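If it helps to experiment, a request to Solr's MoreLikeThis handler can be sketched as below. The core name, base URL, and the "title" field are assumptions for illustration, not taken from this thread, and depending on the Solr version the stream.body parameter may need to be enabled in the request dispatcher config:

```python
from urllib.parse import urlencode

def build_mlt_url(base_url, title, rows=3):
    """Build a MoreLikeThis request URL for an arbitrary piece of text.

    Uses the /mlt handler with stream.body so the query title does not
    need to be indexed first. Handler and field names are assumptions.
    """
    params = {
        "stream.body": title,   # free text to find similar documents for
        "mlt.fl": "title",      # field(s) used for similarity
        "mlt.mintf": 1,         # count terms occurring at least once...
        "mlt.mindf": 1,         # ...in at least one document
        "rows": rows,           # number of similar documents to return
        "wt": "json",
    }
    return f"{base_url}/mlt?{urlencode(params)}"

# Example (assumes a local Solr core named "products"):
url = build_mlt_url("http://localhost:8983/solr/products",
                    "Apple iPhone 6 16 Gb")
```

The returned documents' identifier field can then be majority-voted exactly as in the procedure below.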

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 28, 2018, 12:25 PM -0400, Xavier Schepler 
<xavier.schep...@recommerce.com>, wrote:
> Hello,
>
> I'm considering using Solr with learning to rank to build a product matcher.
> For example, it should match the titles:
> - Apple iPhone 6 16 Gb,
> - iPhone 6 16 Gb,
> - Smartphone IPhone 6 16 Gb,
> - iPhone 6 black 16 Gb,
> to the same internal reference, a unique identifier.
>
> With Solr, each document would then have a field for the product title and
> one for its class, which is the unique identifier of the product.
> Solr would then be used to perform matching as follows.
>
> 1. A search is performed with a given product title.
> 2. The first three results are considered (this requires an initial
> product title database).
> 3. The most frequent identifier is returned.
>
> This method corresponds roughly to a k-Nearest Neighbor approach with the
> cosine metric, k = 3, and a TF-IDF model.
>
> I've done some preliminary tests with scikit-learn and the results are
> good, but not as good as those of more sophisticated learning algorithms.
>
> Then, I noticed that Solr supports learning to rank.
>
> First, do you think that such a use of Solr makes sense?
> Second, is there a relatively simple way to build a learning model using a
> sparse representation of the query TF-IDF vector?
>
> Kind regards,
>
> Xavier Schepler
