Re: Facet Performance

Erick Erickson Fri, 12 Jun 2020 10:59:08 -0700

I question whether filterCache has anything to do with it. I suspect what’s 
really happening is that the first time through you’re reading the relevant 
bits from disk into memory. And to double check, you should have docValues 
enabled for all these fields. The “uninverting” process can be very expensive, 
and docValues bypasses that.

As of Solr 7.6, you can add “uninvertible=false” to your field(Type) to “fail 
fast” if Solr needs to uninvert the field.
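
For illustration, here is a minimal schema sketch of a facet field with 
docValues enabled and uninverting disabled. The field name "category" is a 
placeholder I made up, not something from your schema, and changing docValues 
on an existing field requires a full reindex.

```xml
<!-- Hypothetical facet field; "category" is a placeholder name. -->
<!-- docValues="true" lets faceting read the column-oriented data from disk. -->
<!-- uninvertible="false" (Solr 7.6+) makes Solr throw an error instead of -->
<!-- silently building an uninverted structure on the heap. -->
<field name="category" type="string" indexed="true" stored="false"
       docValues="true" uninvertible="false"/>
```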

But that’s an aside. In either case, my claim is that first-time execution does 
“something”: it either reads the serialized docValues from disk or uninverts 
the field on Solr’s heap.

You can have this autowarmed by any combination of
1> specifying an autowarm count on your queryResultCache. That’s hit or miss, 
as it replays the most recent N queries which may or may not contain the sorts. 
That said, specifying 10-20 for autowarm count is usually a good idea, assuming 
you’re not committing more often than, say, every 30 seconds. I’d add the same 
to filterCache too.

2> specifying a newSearcher or firstSearcher query in solrconfig.xml. The 
difference is that newSearcher is fired every time a commit opens a new 
searcher, while firstSearcher is only fired when Solr starts, the theory being 
that there’s no cache autowarming available when Solr first powers up. Usually, 
people don’t bother with firstSearcher or just make it the same as newSearcher. 
Note that a query doesn’t have to be “real” at all. You can just add all the 
facet fields to a *:* query in a single go.
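
Both options can be sketched as one solrconfig.xml fragment, something like 
the following. Treat it as a starting point under assumptions: the cache class 
names and sizes depend on your Solr version and are only examples, and 
"category" and "brand" are placeholder facet fields, not names from your 
schema.

```xml
<query>
  <!-- Option 1: replay the most recent entries when a new searcher opens. -->
  <!-- Class names and sizes are examples; newer releases use solr.CaffeineCache. -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512"
               autowarmCount="16"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512"
                    autowarmCount="16"/>

  <!-- Option 2: an explicit warming query, fired after every commit that -->
  <!-- opens a new searcher. It does not need to be a "real" query. -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="facet">true</str>
        <!-- Placeholder field names; list every field you facet on. -->
        <str name="facet.field">category</str>
        <str name="facet.field">brand</str>
      </lst>
    </arr>
  </listener>

  <!-- Same query at startup, when there are no caches to autowarm from. -->
  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="rows">0</str>
        <str name="facet">true</str>
        <str name="facet.field">category</str>
        <str name="facet.field">brand</str>
      </lst>
    </arr>
  </listener>
</query>
```

Setting rows=0 keeps the warming query cheap, since the point is only to touch 
the facet fields, not to fetch documents.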

BTW, Trie fields will stay around for a long time even though deprecated, or at 
least until we find something to replace them with that doesn’t have this 
penalty. So I’d feel pretty safe using those, and they’ll be more efficient 
than strings.

Best,
Erick

> On Jun 12, 2020, at 12:39 PM, James Bodkin <james.bod...@loveholidays.com> 
> wrote:
> 
> We've run the performance test after changing the fields to be of the type 
> string. We're seeing improved performance, especially after the first time 
> the query has run. The first run is taking around 1-2 seconds rather than 6-8 
> seconds and when the filter cache is present, the response time is around 
> 400ms.
> Do you have any more suggestions that we could try in order to optimise the 
> performance?
> 
> On 11/06/2020, 14:49, "Erick Erickson" <erickerick...@gmail.com> wrote:
> 
>    There’s a lot of confusion about using points-based fields for faceting, 
> see: https://issues.apache.org/jira/browse/SOLR-13227 for instance.
> 
>    Two options you might try:
>    1> copyField to a string field and facet on that (won’t work, of course, 
> for any kind of interval/range facet)
>    2> use the deprecated Trie field instead. You could use the copyField to a 
> Trie field for this too.
> 
>    Best,
>    Erick
>
