> On May 14, 2019, at 7:46 PM, Adam Walz <a...@adamwalz.net> wrote:
>
> but do use an external map reduce process to reindex
Here’s where I’d look then. Not knowing any details of your process, this may be totally wrong of course…

If there’s any step that performs a MERGEINDEXES operation, _and_ somehow the same <uniqueKey> got indexed to the sub-indexes that are being merged, then there’s no deduplication going on and you will have multiple docs with the same <uniqueKey>. I strongly suspect that that, or something similar, is happening. That’s how MapReduceIndexerTool operated: N sub-indexes were produced totally independently, and then a MERGEINDEXES operation happened on a per-shard basis.

Or something unexpected, like there being no <uniqueKey> defined in the schema somehow.

I have never heard of Solr failing to remove old documents when a new one with the same ID is indexed without something like the above being the problem.

One bit of background: Lucene has no notion of <uniqueKey>; that is totally a Solr construct and it is up to Solr to enforce it. So anything that bypasses Solr could produce this…

FWIW,
Erick
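
P.S. In case it helps to see the failure mode, here is a toy Python sketch. It is not Solr or Lucene code: the function names (solr_style_add, merge_indexes, duplicate_keys) and the sample documents are made up for illustration. It assumes an add through Solr behaves like "delete any older doc with the same key, then add", and that a raw index merge simply puts the documents from each sub-index side by side without looking at the key.

```python
from collections import Counter


def solr_style_add(index, doc, unique_key="id"):
    """Toy model of an add through Solr: drop any older doc with the same key, then add."""
    index[:] = [d for d in index if d[unique_key] != doc[unique_key]]
    index.append(doc)


def merge_indexes(*sub_indexes):
    """Toy model of a raw index merge: documents are combined with no key check."""
    merged = []
    for sub in sub_indexes:
        merged.extend(sub)
    return merged


def duplicate_keys(index, unique_key="id"):
    """Return the sorted list of key values that occur more than once."""
    counts = Counter(d[unique_key] for d in index)
    return sorted(key for key, n in counts.items() if n > 1)


# Two sub-indexes built totally independently (hypothetical data).
sub_a, sub_b = [], []
solr_style_add(sub_a, {"id": "doc1", "version": 1})
solr_style_add(sub_a, {"id": "doc1", "version": 2})  # replaces the older doc1 within sub_a
solr_style_add(sub_a, {"id": "doc2", "version": 1})
solr_style_add(sub_b, {"id": "doc1", "version": 3})  # sub_b knows nothing about sub_a
solr_style_add(sub_b, {"id": "doc3", "version": 1})

merged = merge_indexes(sub_a, sub_b)
print(duplicate_keys(merged))
```

Each sub-index is clean on its own, but the merged result carries doc1 twice because nothing on the merge path ever compared keys.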