I have a Near-Realtime Search implementation on Solr Cloud 7.5 and I'm having an issue with deleting documents from a sharded collection.
I'm currently deleting documents with a query on the document ID. Everything seems to work properly, except that the deletes are *really slow*. Delete requests come in via a message queue, and overnight, when most of our document removals happen, the queue backs up, sometimes taking several hours to clear a few thousand documents.

    { "delete": { "query": "+id:12345678" } }

We've seen suggestions that deleting documents by ID is preferred for performance, so I built an implementation that works correctly with a single unsharded core on a standalone Solr server. But when I try deletes by ID on our sharded development SolrCloud cluster, the deletes are unreliable. It's not clear to me how they are applied, if at all. I've had some luck sending the delete directly to the node where the document lives, but even that doesn't seem to work consistently. The cluster is sharded using the compositeId router on a second field common to all of our records, if that matters.

    { "delete": { "id": "12345678" } }

Does anyone have any advice about what *should* work? Is there some contributing factor I'm missing, like how we're sharding?

We're trying to move away from full DataImportHandler reindexes as a means of removing deleted documents, but before that can become a reality we need to be able to delete specific documents directly and efficiently.
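For reference, here is a minimal Python sketch of how the two request bodies are built. The helper names are made up for illustration. The optional `_route_` key is something I have not verified on our cluster: my understanding is that when a collection routes on a field other than the ID, a delete by ID may need the routing value passed along so it reaches the right shard. Treat that part, and the example route value, as assumptions rather than confirmed behavior.

```python
import json


def delete_by_query_body(doc_id):
    """Build the delete-by-query body currently in use (works, but slow)."""
    return {"delete": {"query": f"+id:{doc_id}"}}


def delete_by_id_body(doc_id, route=None):
    """Build a delete-by-ID body.

    `route` is hypothetical here: it would be the value of the field the
    collection is sharded on for this document. Assumption: Solr accepts it
    under the `_route_` key of the delete command.
    """
    command = {"id": str(doc_id)}
    if route is not None:
        command["_route_"] = route
    return {"delete": command}


if __name__ == "__main__":
    print(json.dumps(delete_by_query_body("12345678")))
    print(json.dumps(delete_by_id_body("12345678")))
    # "customer42" is a placeholder routing value, not real data.
    print(json.dumps(delete_by_id_body("12345678", route="customer42")))
```

If the routing value really is required, that would explain why the plain ID-only body behaves inconsistently on the sharded cluster while working fine on a single core.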