On 1/8/2018 10:17 PM, kshitij tyagi wrote:
> 1. Does an in-place update open a new searcher by itself or not?
> 2. As the entire segment is rewritten, does that mean frequent in-place updates are expensive, since each in-place update will rewrite the entire segment again? Correct me if my understanding is wrong.
Opening a new searcher is not related to the update. It's something that happens at commit time, if the commit has openSearcher=true (which is the default setting).
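As a rough illustration of the commit behaviour described above, here is a minimal sketch that builds the URL for an explicit commit with openSearcher spelled out. The base URL and the collection name "mycollection" are placeholders, and the helper name commit_url is made up for this example; only the commit and openSearcher request parameters come from Solr itself.

```python
from urllib.parse import urlencode


def commit_url(base_url, collection, open_searcher=True):
    """Build an explicit-commit URL for Solr's /update handler.

    base_url and collection are placeholders for your own installation.
    """
    params = {
        "commit": "true",
        # openSearcher defaults to true on a hard commit; passing false
        # flushes changes to disk without making them visible to searches.
        "openSearcher": "true" if open_searcher else "false",
    }
    return f"{base_url}/{collection}/update?{urlencode(params)}"


# Example: a hard commit that does not open a new searcher.
print(commit_url("http://localhost:8983/solr", "mycollection", open_searcher=False))
```

With openSearcher=false the updates are durable but stay invisible to queries until some later commit opens a new searcher.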
In-place updates don't rewrite the entire segment. They only rewrite part of the docValues information for the segment: the portion for the fields that got updated. The information is written into a new file, and the original file is untouched.
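For reference, an in-place update is sent with the same atomic update syntax as any other partial update, using the "set" or "inc" operation. The sketch below only builds the JSON body. It assumes the uniqueKey field is "id" and uses a hypothetical numeric field "popularity" that would have to be single-valued, docValues-only (indexed=false, stored=false) for Solr to apply the update in place. The helper name is made up for this example.

```python
import json


def in_place_update_body(doc_id, field, value, op="set"):
    """Return a JSON body for an atomic update on a single field.

    Solr can only apply it in place when the field is single-valued,
    non-indexed, non-stored and has docValues; otherwise it falls back
    to a regular atomic update that reindexes the whole document.
    """
    if op not in ("set", "inc"):
        raise ValueError("in-place updates support only 'set' and 'inc'")
    # "id" is assumed to be the uniqueKey field of the schema.
    return json.dumps([{"id": doc_id, field: {op: value}}])


# Example: increment the hypothetical "popularity" field of document "doc1".
print(in_place_update_body("doc1", "popularity", 5, op="inc"))
```

The body would be POSTed to the collection's /update handler with Content-Type application/json. Whether the change becomes visible still depends on the commit, as explained above.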
If there are multiple fields with docValues and not all of them are updated, then it would not be possible to delete the old file until the segment gets merged. I am not sure about what happens if *every* field with docValues is eligible for in-place updates and all of them get updated. If that were the case, then it would be possible to have an optimization that removes the old docValues file, but I have no idea whether Lucene actually has that as an optimization. I would not expect most indexes to be eligible for the optimization even if Lucene can do it.
Yes, frequent in-place updates can be expensive, and can make the index larger, because the values in the updated field for every document in the segment will be written to a new file. If you never optimize the index and mostly update recently added documents, then the segments involved will probably be small, and performance should be pretty good.
Thanks,
Shawn