John Davis wrote:
> 100M unique values might be across all docs, and unless the faceting
> implementation is really naive I cannot see how that can come into play
> when the query matches a fraction of those.
Solr's simple string faceting uses an int array to hold counts for the different
terms in the field.
On Tue, Oct 24, 2017 at 8:37 AM, Erick Erickson wrote:
> bq: It is a bit surprising why facet computation
> is so slow even when the query matches hundreds of docs.
>
> The number of terms in the field over all docs also comes into play.
> Say you're faceting over a field that has 100,000,000 unique values
> across all docs, that's a lot of bookkeeping.
bq: It is a bit surprising why facet computation
is so slow even when the query matches hundreds of docs.
The number of terms in the field over all docs also comes into play.
Say you're faceting over a field that has 100,000,000 unique values
across all docs, that's a lot of bookkeeping.
Best,
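To illustrate Erick's point with a rough sketch (Python, not Solr's actual code): a naive field-cache style faceting approach allocates one counter per unique term in the whole field, so memory and reset cost scale with field cardinality even when only a handful of documents match.

```python
# Sketch (illustration only, not Solr's implementation): why facet cost
# can scale with field cardinality rather than with the number of hits.
def facet_counts(matching_doc_ids, doc_to_ordinal, num_unique_terms):
    # One int counter per unique term in the whole field: for
    # 100,000,000 unique values this array alone is ~400 MB.
    counts = [0] * num_unique_terms
    for doc_id in matching_doc_ids:
        counts[doc_to_ordinal[doc_id]] += 1
    return counts

# Even if only 3 docs match, the counter array is sized by the field.
counts = facet_counts([0, 2, 3], {0: 1, 1: 0, 2: 1, 3: 2}, 5)
# → [0, 2, 1, 0, 0]
```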
Hi John,
Did you mean "docValues don't work for analysed fields"? DocValues do work for
multivalued string (and other supported types) fields. What you need to do is
convert your analysed field to a multivalued string field - that requires changes
in your indexing flow.
HTH,
Emir
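For illustration, a schema change along the lines Emir describes might look like this (the field name `tags` is hypothetical):

```xml
<!-- Hypothetical schema.xml fragment: a multivalued string field with
     docValues enabled, replacing a previously analysed text field. -->
<field name="tags" type="string" indexed="true" stored="true"
       multiValued="true" docValues="true"/>
```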
Docvalues don't work for multivalued fields. I just started a separate
thread with more debug info. It is a bit surprising why facet computation
is so slow even when the query matches hundreds of docs.
On Mon, Oct 23, 2017 at 6:53 AM, alessandro.benedetti wrote:
> Hi John,
> first of all, I may state the obvious, but have you tried docValues?
Hi John,
first of all, I may state the obvious, but have you tried docValues?
Apart from that, a friend of mine (Diego Ceccarelli) was discussing a
probabilistic implementation similar to HyperLogLog [1] to approximate
facet counting.
I didn't have time to look at the details / implement it.
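For the curious, a toy HyperLogLog-style cardinality estimator can be sketched in a few lines (illustration only — this is not the implementation Diego discussed, and real facet counting needs per-term counts, not just a distinct-value total):

```python
import hashlib
import math

def hll_cardinality(items, b=10):
    """Toy HyperLogLog sketch: estimate the number of distinct values
    using m = 2**b registers. Illustration only, not Solr code."""
    m = 1 << b
    registers = [0] * m
    for item in items:
        h = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = h & (m - 1)                      # low b bits pick a register
        w = h >> b                             # remaining 64 - b bits
        rank = (64 - b) - w.bit_length() + 1   # leading zeros + 1
        registers[idx] = max(registers[idx], rank)
    alpha = 0.7213 / (1 + 1.079 / m)
    estimate = alpha * m * m / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if estimate <= 2.5 * m and zeros:          # small-range correction
        estimate = m * math.log(m / zeros)
    return estimate

# 10,000 distinct values; with m=1024 registers the typical error is ~3%.
est = hll_cardinality(f"term-{i}" for i in range(10000))
```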
Hi Yonik,
Any update on sampling-based facets? The current faceting is really slow
for fields with high cardinality, even with method=uif. Or are there
alternative workarounds to only look at N docs when computing facets?
On Fri, Nov 4, 2016 at 4:43 PM, Yonik Seeley wrote:
> Sampling has been on my TODO list for the JSON Facet API.
Hello, John!
You can try to do that manually by applying a filter on a random field.
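For example, a request along these lines (the field name `random_i` is hypothetical — it assumes each document is indexed with a uniform random integer in [0, 999]):

```
q=your_query
fq=random_i:[0 TO 99]        # keeps roughly 10% of matching docs
facet=true&facet.field=category
```

The facet counts from the ~10% sample can then be scaled up by ~10x to approximate full-result counts, at the cost of noise on rare values.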
On Fri, Nov 4, 2016 at 10:02 PM, John Davis wrote:
> Hi,
> I am trying to improve the performance of queries with facets. I understand
> that for queries with high facet cardinality and a large number of results the
> current facet computation algorithms can be slow as they are trying to loop
> across all docs and facet values.
John Davis wrote:
> Does there exist an option to compute facets by just looking at the top-n
> results instead of all of them or a sample of results based on some query
> parameters?
Doing it for the top-n results does not play well with the current query flow
in Solr (I might be wrong here).
Sampling has been on my TODO list for the JSON Facet API.
How much it would help depends on where the bottlenecks are, but that
in conjunction with a hashing approach to collection (assuming field
cardinality is high) should definitely help.
-Yonik
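The "hashing approach to collection" Yonik mentions can be sketched roughly like this (Python, illustration only — not Solr's implementation): instead of one counter per unique term in the whole field, keep a hash map keyed only by terms that actually occur in the matching documents.

```python
from collections import Counter

# Sketch (not Solr's implementation): hash-based collection keeps memory
# proportional to the number of distinct terms actually seen in the
# matching docs, instead of one counter per unique term in the field.
def facet_counts_hashed(matching_docs, doc_to_terms):
    counts = Counter()
    for doc_id in matching_docs:
        counts.update(doc_to_terms[doc_id])
    return counts

counts = facet_counts_hashed([0, 2], {0: ["a"], 1: ["b"], 2: ["a", "c"]})
# → Counter({'a': 2, 'c': 1})
```

With high field cardinality and few hits, the hash map stays tiny, which is why this pairs well with sampling.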
On Fri, Nov 4, 2016 at 3:02 PM, John Davis wrote:
https://issues.apache.org/jira/browse/SOLR-5894 had some pretty interesting
looking work on heuristic counts for facets, among other things.
Unfortunately, it didn’t get picked up, but if you don’t mind using Solr 4.10,
there’s a jar.
On 11/4/16, 12:02 PM, "John Davis" wrote:
Hi,
I am trying to improve the performance of queries with facets.
I believe that's what's JSON facet API does by default. Have you tried that?
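For reference, a minimal JSON Facet API request looks roughly like this (the field name `category` is hypothetical):

```json
{
  "query": "your_query",
  "facet": {
    "categories": {
      "type": "terms",
      "field": "category",
      "limit": 10
    }
  }
}
```

This body is POSTed to the collection's /select handler.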
Regards,
Alex.
Solr Example reading group is starting November 2016, join us at
http://j.mp/SolrERG
Newsletter and resources for Solr beginners and intermediates:
http://www.solr-start.com/
Hi,
I am trying to improve the performance of queries with facets. I understand
that for queries with high facet cardinality and a large number of results the
current facet computation algorithms can be slow as they are trying to loop
across all docs and facet values.
Does there exist an option to compute facets by just looking at the top-n
results instead of all of them, or a sample of results based on some query
parameters?