My index has a number of shards that are nearly static, each with about
7 million documents. By nearly static, I mean that the only changes
that normally happen to them are document deletions, done with the XML
update handler. The process that does these deletions runs once every
two minutes, and does them with a query on a field other than the one
that's used for uniqueKey. Once a day, I will be adding data to these
indexes with the DIH delta-import. One of my shards gets all new data
once every two minutes, but it is less than 5% of the size of the others.
The problem that I'm running into is that every time a delete is
committed, my caches are suddenly invalid and I seem to have two
options: spend a lot of time and I/O rewarming them, or suffer with slow
(3 seconds or longer) search times. Is there any way to have the index
keep its caches when the only thing that happens is deletions, then
invalidate them when it's time to actually add data? It would have to
be something I can dynamically change when switching between deletions
and the daily import.
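
To make the deletion step concrete, here is a minimal sketch of the kind of
delete-by-query request described above, assuming Solr's XML update message
format (<delete><query>...</query></delete> followed by <commit/>). The field
name, value, and URL are hypothetical placeholders, not the real schema or
host, and post_update is only an illustration of how the message could be
sent.

```python
import urllib.request
from xml.sax.saxutils import escape


def build_delete_by_query(field, value):
    """Build a Solr XML delete-by-query message for field:"value".

    The field is assumed to be something other than the uniqueKey;
    both arguments are hypothetical examples.
    """
    # Escape backslashes and double quotes so the value stays one quoted phrase.
    quoted = value.replace("\\", "\\\\").replace('"', '\\"')
    query = '{}:"{}"'.format(field, quoted)
    # Escape &, < and > so the query is safe inside the XML element.
    return "<delete><query>{}</query></delete>".format(escape(query))


COMMIT_MESSAGE = "<commit/>"


def post_update(update_url, xml_body):
    """POST one XML message to the update handler and return the HTTP status.

    update_url is an assumption, for example http://localhost:8983/solr/update.
    """
    request = urllib.request.Request(
        update_url,
        data=xml_body.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Sending the delete message and then COMMIT_MESSAGE through post_update is the
step that makes the deletions visible, and it is that commit that leaves the
caches invalid.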
Thanks,
Shawn
- Solr caches and nearly static indexes Shawn Heisey