Re: Solr Deletes

2020-05-29 Thread Cassandra Targett
posts and testing (https://lucidworks.com/post/really-batch-updates-solr-2/). Again thanks to the community and users for everyone’s contribution on the issue it is very much appreciated. Successful Solr-ing to all, Dwane

Re: Solr Deletes

2020-05-26 Thread Dwane Hall
s.com/post/really-batch-updates-solr-2/). Again thanks to the community and users for everyone’s contribution on the issue it is very much appreciated. Successful Solr-ing to all, Dwane From: Bram Van Dam Sent: Wednesday, 27 May 2020 5:34 AM To: solr-user@lu

Re: Solr Deletes

2020-05-26 Thread Bram Van Dam
On 26/05/2020 14:07, Erick Erickson wrote:
> So best practice is to go ahead and use delete-by-id.

I've noticed that this can cause issues when using implicit routing, at least on 7.x. Though I can't quite remember whether the issue was a performance issue, or whether documents would sometimes n

Re: Solr Deletes

2020-05-26 Thread Erick Erickson
Dwane: DBQ for very large deletes is “iffy”. The problem is this: Solr must lock out _all_ indexing for _all_ replicas while the DBQ runs and this can just take a long time. This is just a consequence of distributed computing. Imagine a scenario where one of the documents affected by the DBQ is

Re: Solr Deletes

2020-05-26 Thread Emir Arnautović
Hi Dwane, DBQ does not play well with concurrent updates - it’ll block updates on replicas causing replicas to fall behind, trigger full replication and potentially OOM. My advice is to go with cursors (or even better use some DB as source of IDs) and DBID with some batching. You’ll need some te
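
A minimal sketch of the cursor-plus-batched-delete-by-id pattern suggested above, in Python. It is illustrative only and not taken from the thread: `fetch_page` is a hypothetical callable that you would back with a real Solr query using `cursorMark` (sorted on the unique key) and that returns the page's IDs plus the `nextCursorMark` value. The sketch assumes the usual cursor convention that paging is finished when the returned cursor equals the one that was sent, and it builds the delete body in Solr's JSON update form `{"delete": [...]}`. The batch size is an arbitrary placeholder that would need testing.

```python
import json
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical page fetcher: takes a cursorMark, returns (ids, nextCursorMark).
FetchPage = Callable[[str], Tuple[List[str], str]]


def iter_ids_by_cursor(fetch_page: FetchPage) -> Iterator[str]:
    """Yield document IDs page by page until the cursor stops advancing."""
    cursor = "*"  # conventional starting cursorMark
    while True:
        ids, next_cursor = fetch_page(cursor)
        yield from ids
        if next_cursor == cursor:  # no further pages
            return
        cursor = next_cursor


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Group IDs into lists of at most `size` items."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # trailing partial batch
        yield batch


def delete_payload(batch: List[str]) -> str:
    """Build a JSON delete-by-id body for one batch of IDs."""
    return json.dumps({"delete": batch})
```

Each payload would then be POSTed to the collection's update handler, one batch at a time. If the IDs come from an external database instead, as suggested above, only `batched` and `delete_payload` are needed.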

Solr Deletes

2020-05-25 Thread Dwane Hall
Hey Solr users, I'd really appreciate some community advice if somebody can spare some time to assist me. My question relates to initially deleting a large amount of unwanted data from a Solr Cloud collection, and then advice on best patterns for managing delete operations on a regular basis