Re: facets & docValues

Revas Tue, 05 May 2020 14:55:44 -0700

Hi Joel, no, we have not. We have a softCommit requirement of 2 seconds.
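
For reference, the timing constraint Joel describes below (warming has to finish before the next searcher starts warming) can be sketched roughly as follows. This is only an illustration: the function name `warming_fits` and the timings are assumptions, not measurements from this thread or anything Solr provides.

```python
def warming_fits(warm_seconds: float, commit_interval_seconds: float) -> bool:
    """Return True if warming finishes before the next commit opens a new searcher.

    Both arguments are hypothetical inputs you would measure yourself,
    e.g. from your own logs; nothing here talks to Solr.
    """
    if warm_seconds < 0 or commit_interval_seconds <= 0:
        raise ValueError("warm time must be >= 0 and commit interval must be > 0")
    # Warming must complete strictly before the next searcher starts warming.
    return warm_seconds < commit_interval_seconds


# Illustrative numbers only: a 2 second soft commit interval.
print(warming_fits(0.5, 2.0))  # warming is short enough
print(warming_fits(3.0, 2.0))  # warming would overlap the next commit
```

With a 2 second soft commit, any warming query set that takes longer than about 2 seconds would overlap the next searcher, which is why static warming is hard to use in this setup.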

On Tue, May 5, 2020 at 3:31 PM Joel Bernstein <joels...@gmail.com> wrote:


> Have you configured static warming queries for the facets? This will warm
> the cache structures for the facet fields. You just want to make sure your
> commits are spaced far enough apart that the warming completes before a new
> searcher starts warming.
>
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Mon, May 4, 2020 at 10:27 AM Revas <revas2...@gmail.com> wrote:
>
> > Hi Erick, Thanks for the explanation and advice. With facet queries, do
> > docValues help at all?
> >
> > 1) indexed=true, docValues=true => all facets
> >
> > 2)
> >    - indexed=true, docValues=true  => only for subfacets
> >    - indexed=true, docValues=false => facet query
> >    - docValues=true, indexed=false => term facets
> >
> >
> >
> > In case 1, indexing slowed considerably, but overall facet performance
> > improved many fold.
> > In case 2, overall performance showed only a slight improvement.
> >
> > Does that mean turning on docValues even for facet queries helps improve
> > performance? Is fetching from docValues for a facet query faster than
> > fetching from stored fields?
> >
> > Thanks
> >
> >
> > On Thu, Apr 16, 2020 at 1:50 PM Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> > > DocValues should help when faceting over fields, i.e. facet.field=blah.
> > >
> > > I would expect docValues to help with sub facets, but I don’t know
> > > the code well enough to say definitely one way or the other.
> > >
> > > The empirical approach would be to set “uninvertible=false” (Solr 7.6+)
> > > and turn docValues off. What that means is that if any operation tries
> > > to uninvert the index on the Java heap, you’ll get an exception like:
> > > “can not sort on a field w/o docValues unless it is indexed=true
> > > uninvertible=true and the type supports Uninversion:”
> > >
> > > See SOLR-12962
> > >
> > > Speed is only one issue. The entire point of docValues is to not
> > > “uninvert” the field on the heap. This used to lead to very significant
> > > memory pressure. So when turning docValues off, you run the risk of
> > > reverting back to the old behavior and having unexpected memory
> > > consumption, not to mention slowdowns when the uninversion
> > > takes place.
> > >
> > > Also, unless your documents are very large, this is a tiny corpus. It can
> > > be quite hard to get realistic numbers; the signal gets lost in the noise.
> > >
> > > You should only shard when your individual query times exceed your
> > > requirement. Say you have a 95th-percentile requirement of 1 second
> > > response time.
> > >
> > > Let’s further say that you can meet that requirement with 50
> > > queries/second, but when you get to 75 queries/second your response time
> > > exceeds your requirement. Do NOT shard at this point. Add another replica
> > > instead. As a general rule, sharding adds inevitable overhead and should
> > > only be considered when you can’t get adequate response time even under
> > > fairly light query loads.
> > >
> > > Best,
> > > Erick
> > >
> > > > On Apr 16, 2020, at 12:08 PM, Revas <revas2...@gmail.com> wrote:
> > > >
> > > > > Hi Erick, You are correct, we have only about 1.8M documents so far,
> > > > > and turning on indexing for the facet fields improved the timings of
> > > > > the facet query (which has sub facets and facet queries) a lot. So do
> > > > > docValues help at all for sub facets and facet queries? Our tests
> > > > > revealed further query time improvement when we turned off
> > > > > docValues. Is that the right approach?
> > > >
> > > > > Currently we have only 1 shard, and we are thinking of scaling by
> > > > > increasing the number of shards when we see a deterioration in query
> > > > > time. Any suggestions?
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > > On Wed, Apr 15, 2020 at 8:21 AM Erick Erickson
> > > > > <erickerick...@gmail.com> wrote:
> > > >
> > > >> In a word, “yes”. I also suspect your corpus isn’t very big.
> > > >>
> > > >> I think the key is the facet queries. Now, I’m talking from
> > > >> theory rather than diving into the code, but querying on
> > > >> a docValues=true, indexed=false field is really doing a
> > > >> search. And searching on a field like that is effectively
> > > >> analogous to a table scan. Even if somehow an internal
> > > >> structure would be constructed to deal with it, it would
> > > >> probably be on the heap, where you don’t want it.
> > > >>
> > > >> So the test would be to take the queries out and measure
> > > >> performance, but I think that’s the root issue here.
> > > >>
> > > >> Best,
> > > >> Erick
> > > >>
> > > >>> On Apr 14, 2020, at 11:51 PM, Revas <revas2...@gmail.com> wrote:
> > > >>>
> > > > >>> We have faceting fields that have been defined as indexed=false,
> > > > >>> stored=false and docValues=true.
> > > > >>>
> > > > >>> However, we use a lot of subfacets using JSON facets and facet
> > > > >>> ranges using facet.query. We see that after every soft commit our
> > > > >>> performance worsens, and it is ideal between commits.
> > > > >>>
> > > > >>> How are docValues fields affected by a soft commit, and do we need
> > > > >>> to enable indexing if we use subfacets and facet queries to improve
> > > > >>> performance?
> > > >>>
> > > >>> Tha
> > > >>
> > > >>
> > >
> > >
> >
>
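
To make the terms used in the thread concrete, here is a rough sketch of what a JSON Facet API request body with a terms facet, a nested sub facet and a query facet might look like. The field names (`category`, `brand`, `price`) and the facet labels are hypothetical and do not come from the poster's schema. The code only builds and prints the request body; it does not contact a Solr server.

```python
import json


def build_facet_request(query: str = "*:*") -> dict:
    """Build a JSON request body with a terms facet, a sub facet and a query facet.

    All field names and facet labels are placeholders for illustration.
    """
    return {
        "query": query,
        "limit": 0,  # return only facet counts, no documents
        "facet": {
            # Terms facet: buckets per distinct value of the field.
            "by_category": {
                "type": "terms",
                "field": "category",
                "limit": 10,
                # Sub facet: computed inside each parent bucket.
                "facet": {
                    "by_brand": {"type": "terms", "field": "brand", "limit": 5},
                },
            },
            # Query facet: runs a search, so it relies on the field being searchable.
            "cheap_items": {"type": "query", "q": "price:[0 TO 100]"},
        },
    }


print(json.dumps(build_facet_request(), indent=2))
```

The terms facets are the part that counts values per field, while the query facet is an ordinary search restricted to the result set, which matches Erick's point that a facet query on a docValues=true, indexed=false field behaves like a scan.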

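Erick's replica-versus-shard rule of thumb from the thread can be written down as a small decision helper. This is a sketch under assumptions: the function name `scaling_advice`, its parameters and the sample latencies are made up for illustration, and real capacity planning needs measurements on your own hardware.

```python
def scaling_advice(light_load_p95: float, loaded_p95: float, requirement: float) -> str:
    """Suggest a scaling step from 95th-percentile latencies (all in seconds).

    light_load_p95: latency under fairly light query load (hypothetical input)
    loaded_p95:     latency at the target query rate (hypothetical input)
    requirement:    the response-time requirement, e.g. 1.0 second
    """
    if light_load_p95 > requirement:
        # Individual queries are too slow even without load: consider sharding.
        return "shard"
    if loaded_p95 > requirement:
        # Queries are fast enough alone but degrade with throughput: add a replica.
        return "add replica"
    return "no change"


# Illustrative numbers only, following the 1 second example in the thread.
print(scaling_advice(light_load_p95=0.4, loaded_p95=1.3, requirement=1.0))
print(scaling_advice(light_load_p95=1.5, loaded_p95=2.0, requirement=1.0))
```

The first call corresponds to the 50 versus 75 queries/second case in the thread, where the advice is to add a replica rather than shard.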