Hendrik: There's one problem with IndexUpgraderTool. As Shawn points out, it does a forceMerge, which by default creates one large segment. This has some implications in terms of the number of deleted documents if the index has updates afterwards, see:
https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/
and the associated JIRA:
https://issues.apache.org/jira/browse/LUCENE-7976

My recommendation would be to _not_ run the IndexUpgraderTool and let
background merging do what's necessary over time. Or, as Shawn says,
re-index from scratch.

Exceptions:
1> your index is less than 5g. Since that's the default max segment
size (see the article), it won't matter.
2> you optimize frequently anyway
3> you _might_ get away with a forceMerge where you specify the number
of segments to create as (index_size_in_gigabytes / 5). But frankly I
don't know enough about the algorithm for how segments are chosen in
that case to know whether that'd do exactly what you want.

Best,
Erick

On Wed, Mar 14, 2018 at 10:08 AM, Hendrik Haddorp
<hendrik.hadd...@gmx.net> wrote:
> Thanks for the detailed description!
>
> On 14.03.2018 16:11, Shawn Heisey wrote:
>> On 3/14/2018 5:56 AM, Hendrik Haddorp wrote:
>>> So you are saying that we do not need to run the IndexUpgrader
>>> tool if we move from 6 to 7. Will the index be then updated
>>> automatically or will we get a problem once we move to 8?
>>
>> If you don't run IndexUpgrader, and the index version is one that
>> the new Solr can read, then existing index segments will remain in
>> the format they are. New segments will be written in the new
>> format. If any of the existing segments are merged, then the new
>> larger segment will be in the new format.
>>
>> Summary: If an index starts out as 6.x, then is run for a while in
>> 7.x, but there are still 6.x segments left, then that index will
>> not work in 8.0.
>>
>> IndexUpgrader is a Lucene tool. This tool just runs a forceMerge
>> process on the index, which will merge all of the existing segments
>> into a single segment. It's EXACTLY the same operation that Solr
>> calls "optimize". (Lucene used to call it optimize too. Then they
>> renamed it.)
>>
>>> How would one use the IndexUpgrader at all with Solr? Would one
>>> need to run it against the index of every core?
>>
>> The Solr server must be shut down during the IndexUpgrader run.
>> IndexUpgrader is a completely separate tool, part of Lucene. It has
>> zero knowledge of anything that you have configured in Solr, so you
>> must locate the index directory of any core you want to upgrade and
>> run the tool on that index directory.
>>
>> Thanks,
>> Shawn
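
For reference, the arithmetic behind exception 3 above (forceMerge down to roughly index_size / 5g segments) can be sketched as below. This is only an illustration: the function name and the choice to round up are assumptions, not something stated in the thread, and the 5 GB figure is the default max segment size mentioned above.

```python
import math

# Default max merged segment size cited in the thread (5 GB).
DEFAULT_MAX_SEGMENT_GB = 5.0


def estimate_max_segments(index_size_gb, max_segment_gb=DEFAULT_MAX_SEGMENT_GB):
    """Hypothetical helper: segment count to request from a forceMerge.

    Rounds up (an assumption) so that no segment has to exceed
    max_segment_gb, and never returns fewer than one segment.
    """
    if index_size_gb < 0 or max_segment_gb <= 0:
        raise ValueError("sizes must be non-negative, with a positive segment size")
    return max(1, math.ceil(index_size_gb / max_segment_gb))


if __name__ == "__main__":
    for size_gb in (3, 23, 50):
        print(size_gb, "GB ->", estimate_max_segments(size_gb), "segment(s)")
```

The result would be the value handed to a forceMerge as its maximum segment count (for example the maxSegments parameter of a Solr optimize request). As noted above, it is unclear how segments are chosen in that case, so the merged segments will not necessarily come out evenly sized.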