Do not, repeat NOT, expungeDeletes after each deleteByQuery; it is
a very expensive operation. Perhaps once after the nightly batch, but
I doubt that’ll help much anyway.
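
If you do want to try it once after the nightly batch, here is a minimal
sketch of building that single request. It assumes the standard /update
handler with its commit and expungeDeletes parameters; the base URL, the
collection name and the helper name are placeholders made up for illustration.

```python
from urllib.parse import urlencode


def expunge_deletes_url(base_url, collection):
    """Build the URL for ONE commit with expungeDeletes=true.

    base_url and collection are placeholders, e.g.
    "http://localhost:8983/solr" and "mycollection".
    """
    params = urlencode({"commit": "true", "expungeDeletes": "true"})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


if __name__ == "__main__":
    # Issue this once, after the whole batch has finished,
    # never inside the deleteByQuery loop.
    print(expunge_deletes_url("http://localhost:8983/solr", "mycollection"))
```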

30% deleted docs is quite normal, and should definitely not 
change the response time by a factor of 100! So there’s
some other issue in your environment.

So the things I’d check:
1> Make sure the schema is exactly the same in both environments.
     It’s vaguely possible that it is just a tiny bit different. If that’s
     the case, you need to delete your entire collection’s data and
     re-index from scratch. You can index to a new collection and use
     collection aliasing to do this seamlessly.

2> Make sure your solrconfig is exactly the same, especially the
     filterCache settings. I call out filterCache because you specifically
     mention filter queries, but check your other caches too.

3> Check your filterCache usage statistics. If you see drastically
     different hit ratios in the two environments, you need to pursue that.

4> As always, check your GC performance on the two
     environments. It’s a low-probability item, but prod may be
     just different enough that GC is an issue.

5> Take a look at the QTimes recorded in your Solr logs to ensure
     that the difference isn’t coming from outside of Solr.
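
For checks 3 and 5, a rough Python sketch along these lines may help you
compare the two environments. It assumes your request log lines carry the
usual QTime=<ms> field, and that you copy the filterCache hits and lookups
counts by hand from the admin UI's Plugins/Stats page. The function names
and the sample log lines are illustrative, not taken from your system.

```python
import re
from statistics import median

# Solr request log lines normally contain "... status=0 QTime=<milliseconds>".
QTIME_RE = re.compile(r"\bQTime=(\d+)")


def qtimes(log_lines):
    """Return the QTime values (in ms) found in the given log lines."""
    values = []
    for line in log_lines:
        match = QTIME_RE.search(line)
        if match:
            values.append(int(match.group(1)))
    return values


def summarize(values):
    """Count, median and max of a list of QTimes (None when the list is empty)."""
    if not values:
        return {"count": 0, "median": None, "max": None}
    return {"count": len(values), "median": median(values), "max": max(values)}


def hit_ratio(hits, lookups):
    """filterCache hit ratio from the raw counters; 0.0 if there were no lookups."""
    return hits / lookups if lookups else 0.0


if __name__ == "__main__":
    sample = [  # made-up lines in the usual request-log shape
        "INFO o.a.s.c.S.Request [coll_shard1_replica_n1] webapp=/solr "
        "path=/select params={q=*:*&fq=type:a} hits=42 status=0 QTime=12",
        "INFO o.a.s.c.S.Request [coll_shard1_replica_n1] webapp=/solr "
        "path=/select params={q=*:*&fq=type:b} hits=7 status=0 QTime=10450",
    ]
    print(summarize(qtimes(sample)))
    print(hit_ratio(hits=90, lookups=100))
```

If the QTimes in the prod logs are small while clients still wait 10 seconds,
the time is being spent outside of Solr. If the hit ratios differ a lot
between the two environments, look at the cache settings and query patterns.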

While I can’t say what the exact problem is, I’m 99% sure that the number
of deleted docs isn’t the culprit.

Best,
Erick

> On May 9, 2020, at 6:22 PM, Ganesh Sethuraman <ganeshmail...@gmail.com> wrote:
> 
> Hi Solr Users,
> 
> We use SolrCloud 7.2.1 with 2 Solr nodes in AWS. The shard size for these
> collections does not exceed 5G. They have approximately 16 shards
> with 2 replicas.  We do deletes (ByQuery) as well as large updates in some of
> these Solr collections. We are seeing slower filter queries (95% > 10secs)
> on these collections in production. With the same collections and the same
> queries in our lower environment, which has a similar setup and
> configuration, we are seeing much better performance (<100ms).  These are
> NRT indexes, with daily batch updates only.
> 
> We see a difference, however, in the lower environment: there we don't see
> updates or deletes, and in Segment Info for each of the Solr cores there
> are ZERO delete percentages.  Could this be the reason for the faster query
> response time in our lower environment? In our production environment, we
> are seeing about 30-32% of deletes in each core shard/replica pair.
> 
> Does this segment delete % have any correlation with query response time? We
> do delete by Query in a loop. Also updates.
> If it is so, do you suggest we try Optimize or expungeDelete at the
> end of every day?
> Do we need to expunge deletes after each delete ByQuery or do it once at the
> end?
> 
> Regards,
> Ganesh
