On 7/30/2014 10:00 AM, Shawn Heisey wrote:
> It may turn out that this is actually a bug in merging, where old
> segments are not getting deleted. I noticed in the optimized index that
> there is a single large segment of about 20GB and a bunch of other
> segments that are all older than the single large segment. I'm manually
> optimizing that index again to see what happens. I'll probably need to do
> the rebuild again with infoStream enabled.

The second optimize did not delete those old segments. I also did an
optimize on another shard, and saw the same problem there.

A full rebuild will take close to twelve hours, or possibly longer once
I enable infoStream. I will open an issue, and if the problem persists,
attach the infoStream output.

This problem likely does not affect the actual size of the index loaded
into Lucene, just the amount of disk space taken, though I cannot
confirm that statement.

Has anyone else noticed an increase in the size of on-disk indexes after
an upgrade to 4.9, or the presence of older segments after an optimize
(forceMerge)? My rebuilds use DIH, in case that matters.

One possible trigger might be that I am indexing into an index directory
originally built by a previous Solr version. Before I start the
dataimport, I do delete all docs and issue a commit, but I am not
deleting the entire index directory.

Thanks,
Shawn
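
For anyone who wants to check their own index directory for the same symptom, here is a minimal Python sketch. It is not part of Solr or Lucene. It assumes the usual Lucene convention that per-segment files are named `_<name>.<ext>` or `_<name>_<suffix>.<ext>`, and it uses file modification times as a rough stand-in for segment age. The function names are made up for illustration. It does not read `segments_N`, so it cannot tell whether an old segment is still referenced by the current commit.

```python
import os
import re
from collections import defaultdict

# Assumed naming convention: "_<name>.<ext>" or "_<name>_<suffix>.<ext>",
# where <name> is a lowercase alphanumeric counter (e.g. _a.fdt).
SEGMENT_FILE = re.compile(r"^_([0-9a-z]+)[._]")


def summarize_segments(index_dir):
    """Return {segment_name: (total_bytes, newest_mtime)} for index_dir."""
    totals = defaultdict(lambda: [0, 0.0])
    for entry in os.scandir(index_dir):
        match = SEGMENT_FILE.match(entry.name)
        if not (match and entry.is_file()):
            continue  # skips segments_N, write.lock, and subdirectories
        info = entry.stat()
        seg = totals[match.group(1)]
        seg[0] += info.st_size
        seg[1] = max(seg[1], info.st_mtime)
    return {name: (size, mtime) for name, (size, mtime) in totals.items()}


def older_than_largest(summary):
    """Names of segments whose newest file predates the largest segment's."""
    if not summary:
        return []
    largest = max(summary, key=lambda name: summary[name][0])
    cutoff = summary[largest][1]
    return sorted(
        name
        for name, (_, mtime) in summary.items()
        if name != largest and mtime < cutoff
    )
```

Calling `older_than_largest(summarize_segments(path))` on a shard's index directory after an optimize lists the segments that look like leftovers by this heuristic. An empty list is what one would expect if the forceMerge cleaned up after itself.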