My index has several shards that are nearly static, each with about 7 million documents. By nearly static, I mean that the only changes that normally happen to them are document deletions, done through the XML update handler. The process that does these deletions runs once every two minutes and deletes by query on a field other than the uniqueKey field. Once a day, I will be adding data to these indexes with the DIH delta-import. One of my shards gets all new data once every two minutes, but it is less than 5% of the size of the others.
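For illustration, a delete-by-query pass against the XML update handler could look roughly like the sketch below. This is a minimal sketch, not the actual deletion process: the update URL, the core name, and the field name `batch_id` are hypothetical placeholders, and error handling is left out.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_by_query(query: str) -> bytes:
    """Build the XML body for a Solr delete-by-query message."""
    # escape() protects <, > and & so the query cannot break the XML.
    return f"<delete><query>{escape(query)}</query></delete>".encode("utf-8")


def post_update(update_url: str, body: bytes, timeout: float = 30.0) -> int:
    """POST an XML message to the update handler and return the HTTP status."""
    request = urllib.request.Request(
        update_url,
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def delete_and_commit(update_url: str, query: str) -> None:
    """Send the delete, then commit so the deletion becomes visible."""
    post_update(update_url, build_delete_by_query(query))
    # The commit opens a new searcher, which is the point where the
    # existing caches are discarded.
    post_update(update_url, b"<commit/>")
```

A call such as `delete_and_commit("http://localhost:8983/solr/shard1/update", "batch_id:12345")` would run one pass; both the URL and the query are assumptions made for the example.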

The problem I'm running into is that every time a delete is committed, my caches are invalidated, and I seem to have two options: spend a lot of time and I/O rewarming them, or suffer slow (3 seconds or longer) search times. Is there any way to have the index keep its caches when the only changes are deletions, then invalidate them when it's time to actually add data? It would have to be something I can change dynamically when switching between deletions and the daily import.

Thanks,
Shawn
