On 12/3/2014 4:35 AM, Alexey Kozhemiakin wrote:
> We have a high percentage of deleted docs which do not go away because there
> are several huge ancient segments that do not merge with anything else
> naturally. Our use case in constant reindexing of same data - ~100 gb, 12 000
> 000 real records, 20 000 000 total records in index, which is ~80% deletes.

The "normal" way to deal with this is to simply optimize the index, which
you can do with the click of a button in the admin UI on 4.x.

It is likely to take an hour or so with 100GB of data unless your disk
subsystem is *extremely* fast, but I believe with version 4.x you can even
continue to update the index while it's optimizing. It will also cause a
lot of I/O, which might hurt performance, so you'd want to do it during a
non-peak time.

The list archives include a lot of talk about optimizes being unnecessary
in newer versions ... but wiping out deleted documents is still a major
use case for the feature.

Thanks,
Shawn
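
For anyone who would rather script the optimize (for example from cron during a non-peak window) than click the admin UI button, here is a minimal sketch in Python. It relies on the update handler accepting `optimize=true` as a request parameter. The host, port, and core name (`localhost:8983`, `collection1`) are assumptions for illustration, as are the helper names `optimize_url` and `run_optimize`; adjust them for your install.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def optimize_url(base_url, core, max_segments=1, wait_searcher=True):
    """Build the update-handler URL that asks Solr to optimize a core.

    base_url and core are placeholders (assumptions), e.g.
    "http://localhost:8983/solr" and "collection1".
    """
    params = {
        "optimize": "true",
        # Merging down to one segment rewrites the whole index, which is
        # what drops the deleted documents from the old segments.
        "maxSegments": str(max_segments),
        "waitSearcher": "true" if wait_searcher else "false",
    }
    return f"{base_url.rstrip('/')}/{core}/update?{urlencode(params)}"


def run_optimize(url, timeout_seconds=4 * 60 * 60):
    """Send the request and return the HTTP status code.

    The generous timeout is an assumption: an optimize of a large index
    can run for an hour or more, and the call blocks until it finishes.
    """
    with urlopen(url, timeout=timeout_seconds) as response:
        return response.status


# Example (not executed here, since it needs a running Solr instance):
#   run_optimize(optimize_url("http://localhost:8983/solr", "collection1"))
```

Because the request blocks until the merge is done, it is best run from a job that tolerates a long-running call, and scheduled away from peak query traffic because of the I/O it generates.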