Re: Optimizing fq query performance

2019-04-18 Thread John Davis
FYI https://issues.apache.org/jira/browse/SOLR-11437 https://issues.apache.org/jira/browse/SOLR-12488 On Thu, Apr 18, 2019 at 7:24 AM Shawn Heisey wrote: > On 4/17/2019 11:49 PM, John Davis wrote: > > I did a few tests with our instance solr-7.4.0 and field:* vs field:[* TO > > *] doesn't seem m

Re: Optimizing fq query performance

2019-04-18 Thread Shawn Heisey
On 4/17/2019 11:49 PM, John Davis wrote: I did a few tests with our instance solr-7.4.0 and field:* vs field:[* TO *] doesn't seem materially different compared to has_field:1. If no one knows why Lucene optimizes one but not another, it's not clear whether it even optimizes one to be sure. Que

Re: Optimizing fq query performance

2019-04-17 Thread John Davis
I did a few tests with our instance solr-7.4.0 and field:* vs field:[* TO *] doesn't seem materially different compared to has_field:1. If no one knows why Lucene optimizes one but not another, it's not clear whether it even optimizes one to be sure. On Wed, Apr 17, 2019 at 4:27 PM Shawn Heisey w

Re: Optimizing fq query performance

2019-04-17 Thread Shawn Heisey
On 4/17/2019 1:21 PM, John Davis wrote: If what you describe is the case for range query [* TO *], why would lucene not optimize field:* similar way? I don't know. Low level lucene operation is a mystery to me. I have seen first-hand that the range query is MUCH faster than the wildcard quer

Re: Optimizing fq query performance

2019-04-17 Thread John Davis
If what you describe is the case for range query [* TO *], why would lucene not optimize field:* similar way? On Wed, Apr 17, 2019 at 10:36 AM Shawn Heisey wrote: > On 4/17/2019 10:51 AM, John Davis wrote: > > Can you clarify why field:[* TO *] is lot more efficient than field:* > > It's a range

Re: Optimizing fq query performance

2019-04-17 Thread Shawn Heisey
On 4/17/2019 10:51 AM, John Davis wrote: Can you clarify why field:[* TO *] is lot more efficient than field:* It's a range query. For every document, Lucene just has to answer two questions -- is the value more than any possible value and is the value less than any possible value. The answ
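A minimal sketch of the two "field has a value" filter forms compared in this message. The helper name `exists_filter` and its `use_range` flag are made up for illustration; only the two query strings come from the thread. Whether the range form is actually faster depends on the Solr version and field type, so measure on your own index.

```python
def exists_filter(field, use_range=True):
    """Build a Solr filter clause that matches documents having any value in `field`.

    use_range=True  -> field:[* TO *]  (the open-ended range form)
    use_range=False -> field:*         (the wildcard form)
    """
    if use_range:
        return f"{field}:[* TO *]"
    return f"{field}:*"


# Hypothetical field name, used only to show the two forms side by side.
range_form = exists_filter("field1")
wildcard_form = exists_filter("field1", use_range=False)
```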

Re: Optimizing fq query performance

2019-04-17 Thread John Davis
Can you clarify why field:[* TO *] is lot more efficient than field:* On Sun, Apr 14, 2019 at 12:14 PM Shawn Heisey wrote: > On 4/13/2019 12:58 PM, John Davis wrote: > > We noticed a sizable performance degradation when we add certain fq > filters > > to the query even though the result set does

Re: Optimizing fq query performance

2019-04-14 Thread Shawn Heisey
On 4/13/2019 12:58 PM, John Davis wrote: We noticed a sizable performance degradation when we add certain fq filters to the query even though the result set does not change between the two queries. I would've expected solr to optimize internally by picking the most constrained fq filter first, bu

Re: Optimizing fq query performance

2019-04-14 Thread Erick Erickson
Patches welcome, but how would that be done? There’s no fixed schema at the Lucene level. It’s even possible that no two documents in the index have any fields in common. Given the structure of an inverted index, answering the question “for document X does it have any value?” is rather “interes

Re: Optimizing fq query performance

2019-04-13 Thread John Davis
> field1:* is slow in general for indexed fields because all terms for the > field need to be iterated (e.g. does term1 match doc1, does term2 match > doc1, etc) This feels like something that could be optimized internally by tracking the existence of the field in a doc instead of making users index yet an
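The user-side workaround mentioned in this thread (filtering on a flag such as has_field:1) can be sketched as a step run before documents are sent to Solr. This is an illustration under assumptions: the function name `add_presence_flag`, the `has_<field>` naming scheme, and the sample documents are invented here, and the flag field would still need to be defined in your schema.

```python
def add_presence_flag(doc, field, flag_name=None):
    """Return a copy of `doc` with a flag set to 1 when `field` has a value.

    The flag name defaults to has_<field>; documents where the field is
    missing or empty are returned without the flag.
    """
    flag = flag_name or f"has_{field}"
    out = dict(doc)  # copy so the caller's document is not mutated
    value = doc.get(field)
    if value not in (None, "", []):
        out[flag] = 1
    return out


# Hypothetical documents, for illustration only.
docs = [{"id": "1", "field1": "abc"}, {"id": "2"}]
flagged = [add_presence_flag(d, "field1") for d in docs]
```

With documents indexed this way, the existence check becomes a single-term filter such as fq=has_field1:1, at the cost of maintaining one extra field per checked field.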

Re: Optimizing fq query performance

2019-04-13 Thread Erick Erickson
Also note that field1:* does not necessarily match all documents. A document without that field will not match. So it really can’t be optimized the way you might expect since, as Yonik says, all the terms have to be enumerated…. Best, Erick > On Apr 13, 2019, at 12:30 PM, Yonik Seeley wrote:

Re: Optimizing fq query performance

2019-04-13 Thread Yonik Seeley
More constrained but matching the same set of documents just guarantees that there is more information to evaluate per document matched. For your specific case, you can optimize fq = 'field1:* AND field2:value' to &fq=field1:*&fq=field2:value This will at least cause field1:* to be cached and reuse
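A small sketch of the rewrite suggested in this message: send each clause as its own fq parameter rather than one combined fq, so that Solr can cache each filter independently. The helper `build_params` is hypothetical and the main query `*:*` is an assumption; only the filter clauses come from the message.

```python
from urllib.parse import urlencode, parse_qs


def build_params(main_query, filters):
    """Return a URL query string with one fq parameter per filter clause."""
    pairs = [("q", main_query)]
    pairs.extend(("fq", clause) for clause in filters)
    return urlencode(pairs)


# One combined filter: a single entry for the whole boolean expression.
combined = build_params("*:*", ["field1:* AND field2:value"])

# Split filters: each clause is sent (and can be cached) separately.
split = build_params("*:*", ["field1:*", "field2:value"])
```

The resulting string can be appended to a select URL for your own collection; the split form only pays off when the individual clauses are reused across queries.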

Optimizing fq query performance

2019-04-13 Thread John Davis
Hi there, We noticed a sizable performance degradation when we add certain fq filters to the query even though the result set does not change between the two queries. I would've expected solr to optimize internally by picking the most constrained fq filter first, but maybe my understanding is wron