The other option is to use various MLT params to return fewer similar
documents to begin with.
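
As a rough illustration of what I mean, here is a minimal sketch that builds a MoreLikeThisHandler request with tighter term-selection parameters. The parameter names (mlt.fl, mlt.mintf, mlt.mindf, mlt.minwl, mlt.maxqt) are standard MLT parameters. The handler URL, the field names, and the specific values are assumptions for illustration only and would need tuning against your own index. Whether this shrinks the result set enough depends on the data.

```python
from urllib.parse import urlencode


def build_mlt_url(base_url, doc_id, mlt_fields, facet_field, rows=25):
    """Build a MoreLikeThisHandler URL with stricter term selection.

    All values below are illustrative guesses, not recommendations.
    """
    params = {
        "q": f"id:{doc_id}",             # the single reference document
        "mlt.fl": ",".join(mlt_fields),  # fields used to pick "interesting" terms
        "mlt.mintf": 2,                  # ignore terms rare within the source doc
        "mlt.mindf": 5,                  # ignore terms found in very few docs
        "mlt.minwl": 4,                  # ignore very short words
        "mlt.maxqt": 15,                 # fewer query terms in the generated query
        "rows": rows,
        "facet": "true",
        "facet.field": facet_field,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical handler path and field names.
url = build_mlt_url(
    "http://localhost:8983/solr/mlt",
    "some_doc_id",
    ["title", "body"],
    "category",
)
print(url)
```

Lowering mlt.maxqt is usually the first thing to try, since the generated query then contains fewer terms and tends to match fewer documents.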

Otis
Solr & ElasticSearch Support
http://sematext.com/
On May 21, 2013 7:00 PM, "Jack Krupansky" <j...@basetechnology.com> wrote:

> I think I follow. AFAIK, Solr does not have a provision for limiting
> faceting to the "top n" documents, but that does seem like a reasonable
> feature request. At the Lucene level I presume it would simply be a matter of
> having a hit collector that only accepts the top n documents. But, I'm not
> familiar enough with the internal details of the Solr faceting code.
>
> -- Jack Krupansky
>
> -----Original Message----- From: Achim Domma
> Sent: Tuesday, May 21, 2013 6:39 PM
> To: solr-user@lucene.apache.org
> Subject: Re: MoreLikeThisHandler + Facets
>
> Our current index contains nearly 400k documents and will grow to a few
> millions. Our "more like this"-search is always based on a single document,
> so my query is "id:some_doc_id". For such a query I usually get at least
> 150k "similar" documents. This definition of "similar" is way too relaxed.
> Usually only a few hundred or thousand documents near the reference
> document are really of any interest to our users.
>
> Now assume that I get some facet values, which appear very often in the
> similar documents starting at position 50k, but usually not near the
> reference document. This facet will currently show up in my facet
> results. If I use this facet value for filtering, I restrict the result to
> documents which are not of any interest to the user.
>
> We want to provide facets that allow the user to explore and drill down
> into the documents in the near neighborhood of our reference document.
>
> If I'm on the completely wrong track, please let me know. I'm open to any
> suggestions. Is it possible that our definition of "similar" just does not
> match Solr's model? I would also be willing to dig into the code and to
> implement a custom similarity. But currently it feels like I don't get the
> base concepts right!? Any hint and guidance would be very welcome.
>
> kind regards,
> Achim
>
>
> On 21.05.2013 at 15:27, Jack Krupansky wrote:
>
>> Any particular reason you would want to limit the documents for facet
>> calculation? I mean, the whole point of the facet numbers is to let users
>> know what's out there. You must have some other rationale in mind - what is
>> it?
>>
>> -- Jack Krupansky
>>
>> -----Original Message----- From: Achim Domma
>> Sent: Tuesday, May 21, 2013 5:47 AM
>> To: solr-user@lucene.apache.org
>> Subject: MoreLikeThisHandler + Facets
>>
>> I'm calling the MoreLikeThisHandler with a query like "id:some_doc_id", so
>> I would like to get documents which are similar to one specific document. I
>> restrict the result to 25 rows and I calculate facets for some fields.
>>
>> On what data are those facets calculated? According to the documentation,
>> they are computed over the similar documents, which is the main difference
>> from the default search handler. But over how many of them? Is it possible
>> to restrict the documents somehow? I would like my facets to be calculated
>> based only on the top 1000 most similar documents.
>>
>> kind regards,
>> Achim
>>
>
>
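
Since the thread above notes that Solr has no built-in way to restrict faceting to the "top n" documents, one client-side workaround is to request the top n similar documents (for example rows=1000 with fl limited to the facet fields) and count the facet values yourself. The following is a minimal sketch of that idea, not a Solr feature. It assumes the documents arrive already sorted by similarity as a list of dicts, and the field name "category" is hypothetical.

```python
from collections import Counter


def facet_counts_top_n(docs, field, n=1000):
    """Count values of `field` over the first n docs only.

    `docs` is assumed to be ordered from most to least similar,
    e.g. the "docs" list of a parsed Solr JSON response.
    """
    counts = Counter()
    for doc in docs[:n]:
        value = doc.get(field)
        if value is None:
            continue  # document has no value for this field
        # Multi-valued fields come back as lists, single-valued as scalars.
        values = value if isinstance(value, list) else [value]
        counts.update(values)
    return counts


# Tiny hand-made example instead of a real response.
sample_docs = [
    {"id": "1", "category": "a"},
    {"id": "2", "category": ["a", "b"]},
    {"id": "3", "category": "c"},
]
top2 = facet_counts_top_n(sample_docs, "category", n=2)
print(top2.most_common())
```

The trade-off is that up to n documents have to be transferred per request, so keeping fl down to the facet fields matters.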
