On 7/12/2016 9:45 AM, Jason wrote:
> I'm using optimize because it's a option for fast search. Our index
> updates one or more weekly. If I don't use optimize, many index files
> should be kept. Any performance issues in that case? And I'm wondering
> relation between index file size and heap size. In case of running as
> master server that only update index, is there any guide for heap size
> include Xmx, NewSize, MaxNewSize, etc.?

In older (2.x and 3.x) versions of Lucene, optimizing an index would
make a huge difference in performance. In modern versions, the
performance increase from an optimize is much less dramatic. Lucene (and
by extension, Solr) has gotten very good at dealing with an index
comprised of many segments.

The recommendation for the last few years has been to AVOID doing an
optimize unless it can be done during times of very low query traffic,
when the I/O load will not cause issues.

About the only good reason left for frequent optimizes is when the index
has many updates to existing documents, resulting in a very large
percentage of deleted documents in the index. In that case, the optimize
will shrink the overall index size, which will make it faster and make
relevancy more accurate.

There is no general information available for setting the heap size.
There is also no general information available on "acceptable" index
size. The following wiki page touches a little bit on the heap size
topic:

https://wiki.apache.org/solr/SolrPerformanceProblems

The reason that there is no generic information available is covered
here:

https://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Thanks,
Shawn
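
P.S. As a rough illustration of the deleted-documents point above: the
share of deleted documents can be estimated from the numDocs and maxDoc
counts that Lucene tracks for an index (Solr shows them on the core
overview page of the admin UI). The difference maxDoc - numDocs is the
number of deleted documents still taking up space. The sketch below is
only that, a sketch: the function names and the 30 percent threshold are
made up for illustration and are not a Solr recommendation.

```python
def deleted_doc_percentage(num_docs: int, max_doc: int) -> float:
    """Return the percentage of documents in the index that are deleted.

    num_docs: live (searchable) documents, Lucene's numDocs.
    max_doc:  live plus deleted documents, Lucene's maxDoc.
    """
    if max_doc == 0:
        # Empty index: nothing is deleted.
        return 0.0
    if not 0 <= num_docs <= max_doc:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    return 100.0 * (max_doc - num_docs) / max_doc


def optimize_worth_considering(num_docs: int, max_doc: int,
                               threshold_pct: float = 30.0) -> bool:
    """Hypothetical rule of thumb: flag the index once deleted documents
    reach threshold_pct of maxDoc. The default threshold is an assumption
    chosen for this example, not an official guideline."""
    return deleted_doc_percentage(num_docs, max_doc) >= threshold_pct


if __name__ == "__main__":
    # Example counts, invented for illustration.
    print(deleted_doc_percentage(750, 1000))
    print(optimize_worth_considering(600, 1000))
```

Even when a check like this comes back true, the advice above still
applies: run the optimize only when query traffic is very low.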